Re: [Corpora-List] Q: morpheme lexicons on WWW?

From: Mike Maxwell (maxwell@ldc.upenn.edu)
Date: Wed Oct 05 2005 - 02:22:16 MET DST

    Eric Atwell wrote:
    > Can anyone recommend a source of morpheme lexicons/dictionaries findable
    > on WWW, covering a wide range of languages? Basically a list of all
    > morphemes for each language,

    Isn't what you really want a list of allomorphs--or more precisely,
    allographs? E.g. for English, not just the suffix -s, but its allograph
    -es; and not just the root 'try', but also its allograph 'tri' or 'trie'
    (as in 'he tries too hard'; note that the morpheme boundary here is
    unclear, although from their example "invited" -> invit-ed rather than
    *invite-d, I would assume the 'tri-ed' segmentation is the one they're
    looking for). Likewise for Turkish, since vowel harmony is represented in
    the orthography (similarly for Finnish, I _think_).
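
To make the allograph point concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the toy lexicon, the names ROOT_ALLOGRAPHS and SUFFIX_ALLOGRAPHS, and the segment() helper are hypothetical and come from no existing resource. It only shows why a bare list of morphemes cannot split 'tries', while a list of written variants can.

```python
# Hypothetical toy lexicon: each morpheme maps to its written variants (allographs).
ROOT_ALLOGRAPHS = {
    "try": ["try", "tri"],
    "invite": ["invite", "invit"],
    "box": ["box"],
}
SUFFIX_ALLOGRAPHS = {
    "-s": ["s", "es"],
    "-ed": ["ed"],
}


def segment(word, roots=ROOT_ALLOGRAPHS, suffixes=SUFFIX_ALLOGRAPHS):
    """Return every root+suffix split of `word` that the lexicon licenses.

    Each result is (root allograph, suffix allograph, root morpheme, suffix morpheme).
    """
    results = []
    for root, root_forms in roots.items():
        for root_form in root_forms:
            if not word.startswith(root_form):
                continue
            rest = word[len(root_form):]
            for suffix, suffix_forms in suffixes.items():
                if rest in suffix_forms:
                    results.append((root_form, rest, root, suffix))
    return results


# A morpheme-only lexicon lists just one spelling per morpheme.
MORPHEMES_ONLY_ROOTS = {"try": ["try"], "invite": ["invite"], "box": ["box"]}
MORPHEMES_ONLY_SUFFIXES = {"-s": ["s"], "-ed": ["ed"]}

print(segment("tries"))    # split found via the allographs 'tri' + 'es'
print(segment("invited"))  # split found via 'invit' + 'ed'
print(segment("tries", MORPHEMES_ONLY_ROOTS, MORPHEMES_ONLY_SUFFIXES))  # no split
```

Note that this naive lookup overgenerates (it would also accept 'trys'), so a real resource would need to say which allograph occurs in which context.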

    For Turkish, years ago Jorge Hankamer wrote a morphological parser in C
    which had a large list of roots and affixes. I have no idea whether he
    ever put that in the public domain.

    BTW, I thought the point of the MorphoChallenge was not just to infer the
    morpheme boundaries in words, but to infer the lists of morphemes
    themselves. But the rules don't explicitly say that...

    -- 
    	Mike Maxwell
    	maxwell@ldc.upenn.edu
    


