Re: [Corpora-List] Hierarchically classified corpora?

From: Tony Abou-Assaleh (taa@acm.org)
Date: Tue Jan 16 2007 - 17:14:59 MET


    Hi Daniel,

    Some datasets that come to mind are the ACM Digital Library for CS-related
    publications (though you need to be careful about licensing issues) and
    dmoz.org for Web pages. The Open Directory (dmoz.org) is available in
    several languages.

    Cheers,

    TAA

    -----------------------------------------------------
    Tony Abou-Assaleh
    Email: taa@acm.org
    Web site: http://tony.abou-assaleh.net
    ----------------------[THE END]----------------------

    On Tue, 16 Jan 2007, Daniel Beck wrote:

    > Hello corpora mailing list,
    >
    > I'm working on my master's thesis, "Accurate Hierarchical Classification
    > using NLP Techniques". I hope to improve the accuracy of hierarchical
    > classification on English and German corpora by using additional
    > information extracted with the aid of linguistic tools.
    >
    > I would like to ask where I can obtain corpora that are already
    > classified in a hierarchy. I need several English and German corpora. I
    > would prefer corpora whose topics are linguistics or computer
    > science.
    >
    > Regards & Thanks,
    >
    > Daniel


