The FreeLing suite includes an open-source Spanish tokenizer implemented in
C++:
http://garraf.epsevg.upc.es/freeling/index.php
Regards,
Marco
Maria Esteva wrote:
> Dear all,
>
> I am a PhD student in the School of Information, University of Texas at
> Austin. For my dissertation, I will text mine a large set of corporate
> electronic records in Spanish. For this, I need to find an open-source
> Spanish tokenizer, if possible in C++, although other languages would be
> fine as well. I am familiar with the Lucene tool set, so if you know
> of another source where I can find such a tool, I would appreciate your
> help.
>
> Thanks in advance,
>
> Maria Esteva
>