[Corpora-List] Cyrillic tokenizer and sentence splitter

From: George Mitrevski (mitrege@auburn.edu)
Date: Thu May 12 2005 - 23:17:38 MET DST

    Hi folks.

    Can anyone recommend a good Perl sentence splitter and tokenizer that
    works well with Cyrillic characters/texts (Russian, Bulgarian, etc.)?
    I've tried some for English, German and other languages, but they don't
    do well with Cyrillic.

    thanks,

    George.

    Foreign Languages tel. 334-844-6376
    6030 Haley Center fax. 334-844-6378
    Auburn University
    Auburn, AL 36849
    home: www.auburn.edu/~mitrege


