Re: [Corpora-List] Dictionaries/Lexical Databases

From: maxwell@ldc.upenn.edu
Date: Mon Nov 27 2006 - 21:36:50 MET

    Quoting Shane Axtell <shane.axtell@gmail.com>:
    > I'm looking for lexical databases (a.k.a. dictionaries) that are freely
    > available and contain at least the part of speech information for each
    > entry.

    I presume you're speaking of languages other than (or in addition to)
    English. You might have a look at the links at
    http://www.netvouz.com/mcswell/folder/7773878411777326817/Dictionaries
    These are mostly links to _on-line_ dictionaries, which are not
    necessarily downloadable. If you automatically submit a large number of
    queries to them, some of them might overload or cut you off, for all I
    know.
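
    If you do end up querying an on-line dictionary automatically, pacing
    the requests is the polite thing to do. Here is a minimal sketch; the
    names lookup_all, fetch and delay_seconds are my own inventions, and
    fetch stands for whatever function you write to query one particular
    site (none of the linked sites is assumed to offer an API). It would
    also be worth checking each site's terms of use first.

```python
import time


def lookup_all(words, fetch, delay_seconds=1.0, sleep=time.sleep):
    """Look up each word with a pause between requests.

    `fetch` is a caller-supplied (hypothetical) function that queries one
    dictionary site for one word. The loop stops at the first failure
    rather than retrying against a site that may already be overloaded
    or may have cut the client off.
    """
    results = {}
    for i, word in enumerate(words):
        if i > 0:
            sleep(delay_seconds)  # wait before every request after the first
        try:
            results[word] = fetch(word)
        except Exception:
            break  # give up and return what was collected so far
    return results
```

    The sleep parameter is only there so the pause can be swapped out when
    trying the function without waiting.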

    > This database will be connected to an NLP system that will take in
    > unstructured corpora as input and output the data in a structured manner. Any
    > leads along these lines would be greatly appreciated.

    I'm not quite sure what you're trying to do--add POS tags to text? Of
    course that will be ambiguous in many languages (like English), if you
    just take the POS from a dictionary. And it won't work at all if the
    language has much in the way of inflectional morphology (without a
    morphological parser or stemmer, or at least some heuristics). Not to
    mention named entities.
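
    To make those three problems concrete, here is a toy sketch of
    dictionary-only tagging. Everything in it is made up for illustration:
    the four-entry LEXICON, the English-only SUFFIXES list, and the function
    name candidate_tags. The suffix stripping is a crude stand-in for a real
    morphological parser or stemmer, not a substitute for one.

```python
# Toy lexicon: word form -> set of possible POS tags (hypothetical entries).
LEXICON = {
    "walk": {"NOUN", "VERB"},
    "dog": {"NOUN", "VERB"},
    "quick": {"ADJ"},
    "the": {"DET"},
}

# Crude English-only suffix heuristics, tried in this order.
SUFFIXES = ["ing", "ed", "es", "s"]


def candidate_tags(token):
    """Return every POS tag the lexicon allows for a token.

    Falls back to stripping one suffix when the form itself is not listed.
    An empty set means the lexicon has nothing to say about the token.
    """
    word = token.lower()
    if word in LEXICON:
        return set(LEXICON[word])
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix):
            stem = word[: -len(suffix)]
            if stem in LEXICON:
                # Imprecise: the inflected form inherits all tags of its stem.
                return set(LEXICON[stem])
    return set()


for token in ["The", "dogs", "walked", "Maxwell"]:
    print(token, sorted(candidate_tags(token)))
```

    "dogs" and "walked" both come back as NOUN or VERB, so the ambiguity
    is still there (and "walked" can't really be a noun), while "Maxwell"
    gets no tag at all. Something beyond the dictionary has to pick one
    tag in context and handle the names.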

    But you probably have a plan for dealing with these sorts of problems.

       Mike Maxwell
       CASL/ U MD
