[Corpora-List] "errors and the art of correcting"

From: Kristina Hmeljak (kristina.hmeljak@guest.arnes.si)
Date: Mon Nov 13 2006 - 16:38:35 MET

    Another corpus of learner English with native-speaker corrections
    is being developed by Kevin Mark at Meiji University. It is made up of
    sentences produced by Japanese university students and their
    reformulations, written by Mark himself.

    A paper on this subject, presented at the 2003 Corpus Linguistics
    conference in Lancaster, is cached at
    http://scholar.google.com/scholar?hl=en&lr=&q=cache:8XJdEcxkfxUJ:korpus.dsl.dk/cl2003/cdrom/papers/mark.pdf+Kevin+Mark,+learner+corpus

    Kristina Hmeljak
    Dept. of Asian and African Studies, Faculty of Arts,
    University of Ljubljana

    On 11 Nov 2006, at 9:50 PM, TadPiotr wrote:

    > A collection of corpora along those lines -- native vs non-native
    > English -- has been compiled by Sylviane Granger. At least the Polish
    > sub-corpus contained texts corrected later by native speakers. The
    > analysis of the errors was done by Przemek Kaszubski in his PhD. Here
    > are some quotations and links:
    >
    > "One of the major international collections built on strict sampling
    > principles is the International Corpus of Learner English (ICLE), which
    > contains argumentative essays acquired from learners in more than a
    > dozen different EFL countries in Europe and beyond. Although the ICLE
    > corpus is not yet available to the public, research on it has been
    > carried out for years."
    > Przemek Kaszubski http://www.hltmag.co.uk/dec99/idea.htm
    >
    > The Louvain Centre for English Corpus Linguistics has played a
    > pioneering role in promoting computer learner corpora (CLC) and was
    > among the first, if not the first, to begin compiling such a corpus.
    > The Centre's computerised databank is known as the International
    > Corpus of Learner English (ICLE) and is the result of over ten years of
    > collaborative activity between a number of universities
    > internationally and currently contains over 2 million words of writing
    > by learners of English from 19 different mother tongue backgrounds.
    > The writing in the corpus has been contributed by advanced learners of
    > English as a foreign language rather than as a second language and is
    > made up of 19 distinct sub-corpora, each containing one language
    > variety (E2French, E2German, E2Swedish etc). The type of writing being
    > collected is essay writing (see below for fuller details). Advanced
    > students can, for the purpose of the project, be broadly defined as
    > university students of English in their 3rd or 4th year of study. In
    > cases where the comparability of the level is in doubt, sample pieces
    > of writing should be submitted beforehand.
    > http://cecl.fltr.ucl.ac.be/Cecl-Projects/Icle/icle.htm#heading1
    >
    > Best
    > Tadeusz Piotrowski


