Re: Corpora: morphems

From: Markus Schulze (max@linguistik.uni-erlangen.de)
Date: Tue Feb 22 2000 - 14:22:34 MET


    Dear Mrs. Mühlmeyer,

    at the URL http://www.linguistik.uni-erlangen.de/LAPTDA/laptda.html,
    you will find various lists of allomorphs, morphemes and word forms
    extracted from eight corpora, each one million types in size.
    Seven of them are domain-specific corpora (computer science,
    geography, law, medicine, sports, linguistics and economy); the
    eighth is a representative reference corpus.

    The morphemes were extracted with the morphological analyser DMM (see:
    http://www.linguistik.uni-erlangen.de/~orlorenz/DMM/DMM.en.html)
    which was developed with MALAGA
    (see: http://www.linguistik.uni-erlangen.de/Malaga.en.html).

    MALAGA is freely available for non-commercial use, and DMM soon
    will be as well, so you will then be able to extract your own
    morpheme lists from any corpus.

    Hope that helps
    Markus Schulze

    ----------------------------------------------------------------------
                Department for Computational Linguistics
                Markus Schulze
                Bismarckstr. 6 phone: +49-9131-85-29252
                91054 Erlangen fax: +49-9131-85-29251
                http://www.linguistik.uni-erlangen.de/~max/
    ----------------------------------------------------------------------

    AM> Dear Colleagues,
    AM> I'm looking for a list of morphemes of the German language, as complete
    AM> as possible, with the morphemes as short as possible. For example:
    AM> /zer/riss/en
    AM> /mög/lich/keit/en
    AM> /grübel/n
    AM> /grübl/er/isch
    AM> best regards
    AM> Agnes Mühlmeyer-Mentzel



    This archive was generated by hypermail 2b29 : Tue Feb 22 2000 - 14:22:06 MET