Corpora: Tokenizer for French/English

Noemi Preissner (noemi@CoLi.Uni-SB.DE)
Thu, 16 Jul 1998 20:03:45 +0200 (MET DST)

Hi,

Can anybody point me to tokenizers for French and/or English text?
Even fairly simple scripts (e.g. in Perl) would be helpful.
(Please don't recommend scripts that split on whitespace only, though.)

Thanks in advance,

Noemi

(noemi@coli.uni-sb.de)