Re: Corpora: error tagging of learners' English

From: John Milton (lcjohn@ust.hk)
Date: Fri Nov 17 2000 - 13:01:13 MET


    Care to share how the coding was done, and an example of tagged text? I
    annotated about 100,000 words of written HK IL by first POS-tagging it
    (with CLAWS), then mapping a set of error tags, which describe mostly
    morpheme-level errors, to keyboard keys, and finally (guided by the CLAWS
    tags) going through the text and inserting the error tags manually. Then
    I concordanced on the error tags to identify errors at higher constituent
    levels. Of course, this is fraught
    with subjectivity since you have to project what the L2 writer would have
    written had s/he used an acceptable structure, which is often quite
    different from what an NS might have written, and there are usually
    multiple possibilities... tricky stuff this...
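
    To make the concordancing step concrete, here is a rough sketch in
    Python. The inline format is only an assumption for illustration:
    CLAWS-style word_TAG tokens, with each error tag written as a separate
    <ERR:code> token right after the word it marks. The codes AGR and NUM
    and the function name are hypothetical, not the actual tag set.

```python
import re

# Hypothetical error-tag format: <ERR:CODE> as a standalone token.
ERROR_TAG = re.compile(r"<ERR:([A-Z0-9]+)>")


def concordance(text, code, width=3):
    """Return one keyword-in-context line per occurrence of an error code.

    `width` is the number of tokens of context kept on each side.
    """
    tokens = text.split()
    lines = []
    for i, tok in enumerate(tokens):
        match = ERROR_TAG.fullmatch(tok)
        if match and match.group(1) == code:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}")
    return lines


# Invented sample sentence with two manually inserted error tags.
sample = (
    "He_PPHS1 go_VV0 <ERR:AGR> to_II school_NN1 "
    "every_AT1 days_NN2 <ERR:NUM> ._."
)

for line in concordance(sample, "AGR", width=2):
    print(line)
```

    Sorting or scanning the resulting lines by left context is one way to
    spot recurring patterns around a given morpheme-level tag.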

    Your example reminds me of the instructor who objected to his students
    having access to a concordancer because he found, among thousands of
    concordanced examples of English texts, a single line in French that read something
    like "Réservations et Informations Pour le Passager", and was convinced
    that his students would use this to challenge the countability rule...

    John


