[Corpora-List] Summary of responses: German lemma list

From: Niels Ott (niels@drni.de)
Date: Sat Mar 10 2007 - 17:57:10 MET



    Dear all,

    Over a week ago I asked for a German lemma list and received a number
    of replies. Of all the suggestions, extracting a lemma list from the
    ispell word list won the race, because it was the easiest thing to do
    in the limited time we had.
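
    As a rough illustration of the ispell route, here is a minimal Python
    sketch. It assumes the word list has one entry per line in the common
    "word/FLAGS" form, where everything after the slash is an affix flag
    string. The function name and the sample flags are made up for this
    example, and the base forms in a spell checker dictionary are only
    lemma candidates, not a verified lemma list.

```python
from typing import Iterable, List


def extract_lemma_candidates(lines: Iterable[str]) -> List[str]:
    """Strip affix flags from 'word/FLAGS' entries and return unique base forms.

    Assumed input format (not taken from any particular dictionary release):
    one entry per line, optionally followed by '/' and affix flags.
    """
    seen = set()
    result = []
    for raw in lines:
        entry = raw.strip()
        # Skip blank lines, '#' comment lines, and a purely numeric line
        # such as the word count some dictionary files start with.
        if not entry or entry.startswith("#") or entry.isdigit():
            continue
        # Treat everything after the first '/' as affix flags and drop it.
        word = entry.split("/", 1)[0].strip()
        if word and word not in seen:
            seen.add(word)
            result.append(word)
    return result


if __name__ == "__main__":
    # Hypothetical sample entries; the flag letters are placeholders.
    sample = ["4", "Haus/EPST", "Katze/N", "gehen/IX", "sitzen", "Haus/EPST"]
    for lemma in extract_lemma_candidates(sample):
        print(lemma)
```

    The sketch keeps the original order and drops duplicates. It does not
    expand affix rules or assign POS, so the result would still need manual
    checking before being used as a lemma list.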

    Let me briefly summarize the suggestions I received both on the list and
    in private (in no particular order):

    Annette Klosa offered a contract for academic use of the word list from
    the elexiko project, which is based on frequency data from the German
    IDS corpora. http://www.elexiko.de/

    Lars Aronson suggested using German spell checker dictionaries, namely
    those of ispell/aspell/myspell/hunspell.*

    René Witte suggested having a look at the Durm Lemmatizer, which
    apparently comes with a lexicon.*
    http://www.ipd.uni-karlsruhe.de/~durm/tm/lemma/

    Yannick Versley suggested using the lexicon of the CDG parser.*
    http://nats-www.informatik.uni-hamburg.de/view/CDG/DownloadPage

    Peter Adolphs suggested having a look at Morphy by Wolfgang Lezius,
    which can export the lexical data it uses. http://www.wolfganglezius.de/

    [*]: These are (or are part of) open source projects.

    Thank you very much for your assistance!

    Regards,

       Niels Ott

    Niels Ott wrote:
    > Dear all,
    >
    > About a month ago there was a little discussion going on here about
    > English lemma lists.
    >
    > We should have a lemma list for German. There is no special requirement
    > other than that it contain lemmata, e.g.
    >
    > Haus
    > Katze
    > gehen
    > sitzen
    >
    > Furthermore, it would be nice if the list came with POS tags. But
    > that's not a strict requirement.
    >
    > It would be ideal if this list were free in the sense of free
    > speech/open source, or at least available for non-commercial
    > use. (This is for a student project at university.)
    >
    > Thank you very much in advance for your assistance.
    >
    > Regards,
    >
    > Niels Ott
    >
    >

    --
    Niels Ott - Computational Linguist (B.A.) - http://www.drni.de/niels/
    Tangente: Veralgter Wasservogel


