Re: [Corpora-List] spanish tokenizer

From: Jorge Civera Saiz (jorcisai@iti.upv.es)
Date: Mon Oct 16 2006 - 16:07:35 MET DST

    Hi Maria,

    Take a look at FreeLing:

    FreeLing 1.5: An Open Source Suite of Language Analyzers

    Here you can find information about FreeLing, an open source language analysis
    tool suite, released under the GNU Lesser General Public License (LGPL) of the
    Free Software Foundation.

    These tools have been developed at the TALP Research Center of the
    Universitat Politècnica de Catalunya. The Spanish and Catalan morphological
    dictionaries and grammars were initially developed by the Centre de
    Llenguatge i Computació at the Universitat de Barcelona, and have since
    been improved and extended to other languages thanks to many contributions.

    www: http://garraf.epsevg.upc.es/freeling/

    Best regards,

    Jorge

    Quoting Maria Esteva <mesteva@mail.utexas.edu>:

    > Dear all,
    >
    > I am a PhD student in the School of Information at the University of
    > Texas at Austin. For my dissertation, I will text-mine a large set of
    > corporate electronic records in Spanish. For this, I need to find an
    > open source Spanish tokenizer, if possible in C++, although other
    > languages would be fine as well. I am familiar with the Lucene tool
    > set, so if you know of another source where I can find such a tool, I
    > would appreciate your help.
    >
    > Thanks in advance,
    >
    > Maria Esteva
    >
    >
