[Corpora-List] spanish tokenizer

From: Maria Esteva (mesteva@mail.utexas.edu)
Date: Mon Oct 16 2006 - 15:31:10 MET DST

    Dear all,

    I am a PhD student in the School of Information, University of Texas
    at Austin. For my dissertation, I will text-mine a large set of
    corporate electronic records in Spanish. For this, I need to find an
    open-source Spanish tokenizer, if possible in C++, although other
    languages would be fine as well. I am familiar with the Lucene tool
    set, so if you know of another source where I can find such a tool, I
    would appreciate your help.

    Thanks in advance,

    Maria Esteva


