Re: [Corpora-List] Natural Language Toolkit: NLTK-Lite version 0.6.5 released

From: Hamish Cunningham (hamish@dcs.shef.ac.uk)
Date: Tue Jul 11 2006 - 12:41:26 MET DST

Next message: Diana Maynard: "Re: [Corpora-List] Natural Language Toolkit: NLTK-Lite version 0.6.5 released"

Previous message: Markus Heller: "Re: [Corpora-List] Natural Language Toolkit: NLTK-Lite version 0.6.5 released"
In reply to: Markus Heller: "Re: [Corpora-List] Natural Language Toolkit: NLTK-Lite version 0.6.5 released"
Next in thread: Diana Maynard: "Re: [Corpora-List] Natural Language Toolkit: NLTK-Lite version 0.6.5 released"
Reply: Diana Maynard: "Re: [Corpora-List] Natural Language Toolkit: NLTK-Lite version 0.6.5 released"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Markus,

You might try the Unicode-based tokeniser included with GATE
(http://gate.ac.uk), or ask on the GATE user list whether a German
specialisation of it is available.

Best

-- 
Hamish
http://www.dcs.shef.ac.uk/~hamish/

Markus Heller wrote:
> Dear Corpora Community,
> 
> I recently saw that the tokenizer from the NLTK package requires a good regex. 
> Does anybody have a reasonable regex for this package which can produce 
> decent tokens from modern texts, preferably German texts? I have tried out 
> the ones on the tutorial pages, but it seems that an ordinary user of the 
> package is expected to develop their own regex for tokenizing. Are there 
> good (free) tokenizer regexes around for this package? 
> 
> Thanks in advance,
> Markus
> 
> 
>
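
As a starting point for the question quoted above, here is a minimal sketch of a regex tokenizer that uses only Python's standard `re` module. The pattern and the names `TOKEN_RE` and `tokenize` are illustrative assumptions; they are not taken from NLTK-Lite, its tutorials, or GATE. The sketch keeps German-style numbers and hyphenated compounds together and splits off punctuation, but it does not handle abbreviations such as "z.B.", which would need an extra alternative or a lookup list.

```python
import re

# Hypothetical pattern for illustration only. The order of the alternatives
# matters: numbers are tried before general words.
TOKEN_RE = re.compile(
    r"""
    \d+(?:[.,]\d+)*        # numbers, including forms like 1.000,50
  | \w+(?:[-'’]\w+)*       # words, including umlauts and hyphenated compounds
  | [^\w\s]                # any other single non-space character (punctuation)
    """,
    re.VERBOSE | re.UNICODE,
)


def tokenize(text):
    """Return the list of tokens found in text, in order of appearance."""
    return TOKEN_RE.findall(text)


if __name__ == "__main__":
    print(tokenize("Die Straße kostet 1.000,50 Euro, sagt Müller-Lüdenscheidt."))
```

Because `\w` is Unicode-aware in Python 3, characters such as "ß" and "ü" are treated as word characters without any special handling. Whether a pattern like this can be passed unchanged to the NLTK tokenizer depends on the NLTK version in use.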


This archive was generated by hypermail 2b29 : Tue Jul 11 2006 - 12:42:51 MET DST