Dear all,
I am a PhD student in the School of Information at the University of Texas
at Austin. For my dissertation, I will text mine a large set of
corporate electronic records in Spanish. For this, I need an
open-source Spanish tokenizer, preferably in C++, although other
languages would be fine as well. I am already familiar with the Lucene
tool set, so if you know of another source where I can find such a
tool, I would appreciate your help.
Thanks in advance,
Maria Esteva