Re: [Corpora-List] Downloadable English-language resources

From: Martin Reynaert (Reynaert@uvt.nl)
Date: Mon Jan 29 2007 - 10:06:33 MET

    Hi,

    I am sure your search will be aided by using the right terminology.

    You are looking for:

    1/ a POS-tagger (POS = Part Of Speech). POS-taggers come with different
    tag-sets, offering varying levels of detail.

    2/ a lemmatizer, which, given an inflected or derived word form, returns
    its lemma (e.g., "children" gives "child").

    These two programmes often form a pair.
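
    As a rough illustration of how the two fit together for an offline
    lookup, here is a minimal sketch in Python. The lexicon, the tag names
    (NOUN, VERB) and the function names are all made up for the example; a
    real tagger or lemmatizer would load its own data and use its own
    tag-set. The point is only that one word form can map to several
    (word class, lemma) pairs.

```python
from collections import defaultdict

# Hypothetical toy lexicon: (word form, word class, lemma).
# A real system would load such entries from a local resource file
# shipped with the tagger/lemmatizer; these rows are illustrative only.
TOY_ENTRIES = [
    ("children", "NOUN", "child"),
    ("child", "NOUN", "child"),
    ("ran", "VERB", "run"),
    ("run", "VERB", "run"),
    ("run", "NOUN", "run"),
    ("saw", "VERB", "see"),   # past tense of "see"
    ("saw", "NOUN", "saw"),   # the tool
    ("saw", "VERB", "saw"),   # to cut with a saw
]


def build_index(entries):
    """Group all (word class, lemma) analyses under the lowercased form."""
    index = defaultdict(list)
    for form, pos, lemma in entries:
        index[form.lower()].append((pos, lemma))
    return dict(index)


def look_up(index, word):
    """Return every known (word class, lemma) pair, or [] if unknown."""
    return index.get(word.lower(), [])


INDEX = build_index(TOY_ENTRIES)

if __name__ == "__main__":
    for word in ("children", "ran", "saw", "xyzzy"):
        print(word, look_up(INDEX, word))
```

    Because the index is an in-memory dictionary, looking up a large number
    of words stays local and cheap. Note that a plain lookup like this
    cannot choose between the readings of "saw"; picking the right one in
    context is the job of the POS-tagger.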

    Greetings,

    Martin Reynaert
    Postdoc
    ILK
    Tilburg University
    The Netherlands

    Gordana Ilic Holen wrote:
    > Dear list members,
    >
    > We are looking for software/data that help in performing the following
    > task programmatically, i.e., we want to use the described capability
    > from a piece of software we are writing.
    >
    > The task is to look up an English word in order to determine its
    > class.
    >
    > We would also like to be informed if the word is a derived form of
    > another "main entry" or form. In the latter case we would like to be
    > told what the main form is: e.g., "children" has main form "child",
    > "ran" has main form "run". (Of course, these main forms need not be
    > unique, so the lookup might result in several main forms.)
    >
    > Note: it is essential that lookup can be performed locally (offline).
    > The reason is that we want to look up a lot of words. (The
    > software/data does not need to be free, but we would prefer it to be.)
    >
    > Thanks in advance for any pointers.
    >
    >
    > Gordana Ilic Holen and Bjarte M. Østvold
    >
    >


