Re: [Corpora-List] error tagging

From: isahara@crl.go.jp
Date: Fri Sep 26 2003 - 20:22:42 MET DST

    Dear Tim,
    How are you?
    Thank you very much for mentioning our SST corpus.

    Dear Belen,
    If you are interested in the SST corpus, please
    contact Ms. Emi Izumi (emi@crl.go.jp) and me.

    Best,

    Hitoshi Isahara (isahara@crl.go.jp)
    Leader of the Computational Linguistics Group
    Communications Research Laboratory, Japan

    At Fri, 26 Sep 2003 10:49:31 -0700 (PDT),
    Timothy Baldwin <tbaldwin@csli.stanford.edu> wrote:
    >
    > > I am interested in error tagging and I am looking for corpora which are (or are being) error tagged. Do you know of any? And do you know of any available error tagset?
    >
    > One more recent effort I know of is the SST Corpus, which is a 1m word corpus
    > of transcribed English speech by Japanese learners of English. Various errors
    > are tagged, although I can't find any online account of the full tagset. There
    > are a couple of papers in English on the corpus, notably:
    >
    > Tono, Y., Kaneko, T., Isahara, H., Saiga, T. and Izumi, E. The Standard
    > Speaking Test (SST) Corpus: A 1 million-word spoken corpus of Japanese
    > learners of English and its implications for L2 lexicography. Lee, S. (ed.)
    > ASIALEX 2001 Proceedings: Asian Bilingualism and the Dictionary. The Second
    > Asialex International Congress, August 8-10, 2001, Yonsei University, Korea,
    > pp. 257-262
    >
    > There is a web page with some documentation and a copy of this paper at:
    >
    > http://leo.meikai.ac.jp/~tono/sst/
    >
    > There was also a paper at this year's ACL:
    >
    > Emi Izumi, Kiyotaka Uchimoto, Toyomi Saiga, Thepchai Supnithi and Hitoshi
    > Isahara (2003) Automatic error detection in the Japanese learners' English
    > spoken data. In Companion Volume to the Proceedings of the 41st Annual Meeting
    > of the Association for Computational Linguistics (ACL '03), pp. 145-8.
    >
    > which is also available online at:
    >
    > http://acl.ldc.upenn.edu/acl2003/posterdemo/pdf/Izumi.pdf
    >
    > Tim
    >
    > *-----------------------------------*
    >
    > Timothy Baldwin
    > Senior research engineer
    > Multiword Expression project
    > CSLI LinGO Lab
    >
    >
    > Contact details:
    >
    > Email: tbaldwin@csli.stanford.edu
    > Tel: (+1)-650-723-0515
    > Fax: (+1)-650-723-2166
    >
    > *-----------------------------------*
    >


