[Corpora-List] Croatian National Corpus reached 100 million tokens

From: Marko Tadic (mtadic@ffzg.hr)
Date: Sat Feb 04 2006 - 08:50:05 MET


    Dear colleagues,
    it is our privilege to inform you that the Croatian National Corpus (HNK v
    2.0) reached 100 million tokens (101.3 million, in fact) at the end of
    December 2005.
    The corpus is freely available for public access through the free Bonito
    client program (http://www.textforge.cz/download).
    All necessary details are available at the HNK web site:
    http://www.hnk.ffzg.hr.
    The new version, HNK v 2.5 (scheduled for spring 2006), will also
    support search by lemma and MSD (morphosyntactic description). Until
    then, only a small test subcorpus (cw2000) offers this kind of search.
    Anyway, I hope that you will find HNK useful as a source of primary
    linguistic data for Croatian.
    Any comment, suggestion, criticism etc. is more than welcome.
    All the best
    Marko Tadic
    -----------------------------------------------------------------------
    Marko Tadic, Associate Professor
    Head of the Department of Linguistics
    Faculty of Philosophy, University of Zagreb
    Ivana Lucica 3, HR-10000 Zagreb, Croatia
    tel. +385 1 6120-142, 6120-045
    fax. +385 1 6156-879
    personal homepage: www.hnk.ffzg.hr/mt/

    *** Visit the pages of Croatian national corpus: www.hnk.ffzg.hr ***
    *** Visit the Croatian Morphological Lexicon: hml.ffzg.hr ***
    *** Visit the Croatian Language Technologies portal: www.hnk.ffzg.hr/jthj/ ***
    *** Visit the pages of Croatian Language Technologies Society: www.hdjt.hr ***


