RE: Corpora: Russian POS tagger

From: Alexander Gelbukh (gelbukh@cic.ipn.mx)
Date: Wed Jul 18 2001 - 02:08:06 MET DST

    Hi,

    I have developed a Russian morphological analyzer / synthesizer (not a
    tagger!), which I can send you for research purposes, without the right to
    transfer it to others. I would appreciate it if you cited, in the resulting
    work, my papers that describe the analyzer (I will provide you with the
    references and full texts; see also my publications from 1989 to 1993 on my
    webpage, listed below).

    Unfortunately, because of copyright issues I can only send you a rather old
    version, which is a bit incomplete (perfection does not exist :-) and whose
    dictionary is limited to some 90,000 lexemes.

    As far as I recall, that version does not handle a few frequent words like
    "to be" (because of their complicated morphological patterns), though it
    does handle more regular words.

    Adding a new word to the dictionary is possible but a bit complicated (I
    would need to write English documentation for you on how to do this). For
    now, it just says "unknown word" for words that are not in the
    dictionary.

    Thank you.
    Alexander

    =====================================
    Prof. Dr. Alexander Gelbukh (Alexandre Guelboukh Kahn),
    Professor and researcher, head of NLP Lab,
    Centro de Investigacion en Computacion (CIC),
    Instituto Politecnico Nacional (IPN).
    Address: CIC, IPN, entrada por calle Venus (cerca de Metro Poli),
             Col. Zacatenco, CP 07738, Mexico DF., Mexico
    Office: (+52) 5729-6000 ext. 56544, 56518, 56602, home 5597-0709
    Fax: +1 (520) 441-1817 (personal), (+52) 5586-2936 (shared)
    gelbukh@earthling.net, gelbukh@cic.ipn.mx, www.cic.ipn.mx/~gelbukh
    =====================================

    > -----Original Message-----
    > From: owner-corpora@lists.uib.no [mailto:owner-corpora@lists.uib.no]On
    > Behalf Of Hailing Jiang
    > Sent: Tuesday, July 17, 2001 4:24 PM
    > To: corpora@hd.uib.no
    > Subject: Corpora: Russian POS tagger
    >
    >
    > Dear list members,
    >
    > Does anyone know a Russian morphological analyzer or
    > POS tagger that is publicly available or free for
    > research purpose? I searched the net and only found
    > this online demo of Brill's tagger trained for Russian:
    > (http://www.ling.gu.se/~lager/Home/brilltagger_ui.html)
    > There is no information on how to use it.
    >
    > any related information is appreciated.
    >
    > Thanks in advance,
    > Hailing Jiang
    >
    >
    >


