Corpora: Frequency Information and Lexical Verb Subcategorisation for German

From: Sabine Schulte im Walde (schulte@IMS.Uni-Stuttgart.DE)
Date: Fri Dec 07 2001 - 10:52:02 MET

    Dear list members,

    We have created frequency lists of word forms, word-tag pairs,
    lemma-tag pairs, etc. for German. The lists are similar in content
    and style to those compiled by Adam Kilgarriff for the BNC. In
    addition, we provide verb subcategorisation information for German,
    such as frequency and probability distributions over frame types.
    All data was obtained from a lexicalised statistical grammar model
    trained on 35 million words of German newspaper data.
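
    To make the notion of a probability distribution over frame types
    concrete, here is a minimal Python sketch that turns frame counts
    for a single verb into relative frequencies. It is only an
    illustration: the frame labels and counts are invented, the function
    name is hypothetical, and plain relative-frequency estimation is an
    assumption that stands in for whatever the grammar model actually
    computes.

```python
from collections import Counter


def frame_distribution(frame_counts):
    """Convert raw frame counts for one verb into relative frequencies.

    Assumption: simple relative-frequency estimation, used here only
    for illustration.
    """
    total = sum(frame_counts.values())
    if total == 0:
        # No observations for this verb: return an empty distribution.
        return {}
    return {frame: count / total for frame, count in frame_counts.items()}


# Hypothetical counts for one verb; labels and numbers are invented.
counts = Counter({"na": 60, "n": 30, "nd": 10})
dist = frame_distribution(counts)

for frame, prob in sorted(dist.items(), key=lambda item: -item[1]):
    print(f"{frame}\t{counts[frame]}\t{prob:.2f}")
```

    Each output line lists a frame label, its count and its estimated
    probability; the probabilities for one verb sum to 1.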

    Examples of the lexical information are given at
     http://www.ims.uni-stuttgart.de/tcl/RESOURCES/German-Lexicon-en.html
    The full data is freely available on request for non-commercial
    purposes.

    Regards,
    Sabine Schulte im Walde.


