[Corpora-List] Wacky! Working Papers on the Web as Corpus

From: Marco Baroni (baroni@sslmit.unibo.it)
Date: Tue Sep 26 2006 - 14:42:45 MET DST

    Dear All,

    We are glad to announce that the book:

    Wacky! Working Papers on the Web as Corpus

    is freely available online at:

    http://wackybook.sslmit.unibo.it/

    Alternatively, a hard copy of the book can be purchased from the publisher
    (http://www.gedit.it/).

    Details follow.

    Best regards,

    Marco Baroni and Silvia Bernardini

    ****************************

    Baroni, Marco and Bernardini, Silvia (eds). Wacky! Working Papers
    on the Web as Corpus. Bologna: GEDIT. 2006. [ISBN: 88-6027-004-9]

    The book collects articles deriving from presentations at two Web as Corpus
    workshops (held in Forlì and Birmingham in 2005) and articles that were
    born out of discussions and collaborative experimentation among the WaCky
    community members. WaCky (for "Web as Corpus kool ynitiative") brings
    together linguists who think the World Wide Web is a great resource for
    their research, and that it would be even greater if it could be annotated
    and interrogated in a more linguist-friendly way.

    Topics covered in the book include practical experiences with the
    construction and evaluation of Web corpora, methods to classify and
    represent Web corpora, and applications to terminology. The introduction
    provides an accessible account of the various steps and issues involved in
    building very large Web corpora and making them available to the linguistic
    community. The languages studied include English, Chinese and Japanese.

    Web corpora are undoubtedly a timely and important topic for the
    corpus/computational linguistics community. This book is unique in that it
    provides detailed technical discussion of the issues related to
    constructing Web corpora, as well as examples of concrete applications to
    terminology practice and teaching. As such, it should be of interest to a
    wide audience of linguists, language technologists, language/translation
    teachers and language professionals.

    Table of Contents:

    A WaCky Introduction
    Silvia Bernardini, Marco Baroni and Stefan Evert

    Experience Building a Large Corpus for Chinese Lexicon Construction
    Thomas Emerson and John O'Neil

    Creating General-Purpose Corpora Using Automated Search Engine Queries
    Serge Sharoff

    Evaluation of Japanese Web-Based Reference Corpora: Effects of Seed
    Selection and Time Interval
    Motoko Ueyama

    Measuring Web Corpus Randomness: A Progress Report
    Massimiliano Ciaramita and Marco Baroni

    Using the Web as a Source of LSP Corpora in the Terminology Classroom
    Sara Castagnoli

    Specialized Corpora from the Web and Term Extraction for Simultaneous
    Interpreters
    Claudio Fantinuoli

    The Net for the Graphs: Towards Webgenre Representation for Corpus
    Linguistic Studies
    Alexander Mehler and Rüdiger Gleim

    ****************************


