Re: [Corpora-List] Constitution

From: Detmar Meurers (dm@ling.ohio-state.edu)
Date: Sun May 15 2005 - 19:41:30 MET DST

    Hi Jean,

        Anyway, it occurred to me that now that an aligned version exists
        (it was announced on this list the other day:
        http://logos.uio.no/opus), an interesting application would be to
        develop programs for the (semi?) automatic verification of
        translations! Has anybody done this before?

    One can see this as an instance of the task of detecting variation
    in corpus annotation. The variation n-gram approach for detecting
    inconsistencies and errors in corpus annotation that Markus Dickinson
    and I have worked on (cf. references below) should carry over to
    aligned parallel corpora (we included this in a recent project
    proposal). It will be interesting to see which equivalence classes
    of nuclei and contexts work best for this task.
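
    To make the idea concrete, here is a toy sketch in Python. It is
    not the implementation from the papers below. It assumes that each
    sentence is given as a list of (source word, aligned target word)
    pairs, that the source word is the nucleus, and that the context
    is one source word on either side. The function name, the data
    format and the example alignments are all invented for
    illustration.

```python
from collections import defaultdict


def variation_ngrams(sentences, context=1):
    """Return source n-grams whose middle word (the nucleus) is aligned
    to more than one distinct target word in the same context.

    `sentences` is a list of sentences, each a list of
    (source_word, target_word) pairs. This format is an assumption
    made for the sketch, not a standard alignment format.
    """
    seen = defaultdict(set)
    for sent in sentences:
        # Pad the source side so that every nucleus has a full context.
        src = ["<s>"] * context + [s for s, _ in sent] + ["</s>"] * context
        for i, (_, tgt) in enumerate(sent):
            j = i + context  # position of the nucleus in the padded list
            ngram = tuple(src[j - context : j + context + 1])
            seen[ngram].add(tgt)
    # Keep only the n-grams in which the nucleus varies.
    return {ng: labels for ng, labels in seen.items() if len(labels) > 1}


# Invented English-French word alignments, for illustration only.
sentences = [
    [("the", "la"), ("court", "Cour"), ("shall", "doit")],
    [("the", "la"), ("court", "tribunal"), ("shall", "doit")],
    [("a", "un"), ("court", "tribunal"), ("may", "peut")],
]

flagged = variation_ngrams(sentences)
for ngram, labels in flagged.items():
    print(" ".join(ngram), "->", sorted(labels))
```

    Here "court" is rendered in two ways within the identical context
    "the court shall", so that trigram is flagged for inspection.
    Whether it is a translation error or legitimate variation still
    has to be decided by a person. Widening the context or grouping
    words into coarser classes changes which cases get flagged.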

    Best,
    Detmar

    Markus Dickinson & Detmar Meurers (2005): `Detecting Errors in
      Discontinuous Structural Annotation'. Proceedings of the 43rd
      Annual Meeting of the Association for Computational Linguistics
      (ACL-05). Ann Arbor, Michigan.

    Markus Dickinson & Detmar Meurers (2005): `Detecting Annotation
      Errors in Spoken Language Corpora'. Proceedings of the Special
      session on treebanks for spoken language and discourse at the 15th
      Nordic Conference of Computational Linguistics (NODALIDA-05).
      Joensuu, Finland.

    Markus Dickinson & Detmar Meurers (2003): `Detecting Inconsistencies
      in Treebanks'. Proceedings of the Second Workshop on Treebanks and
      Linguistic Theories (TLT 2003). Växjö, Sweden.

    Markus Dickinson & Detmar Meurers (2003): `Detecting Errors in
      Part-of-Speech Annotation'. Proceedings of the 10th Conference of
      the European Chapter of the Association for Computational
      Linguistics (EACL-03). Budapest, Hungary.

    Available from http://ling.osu.edu/~dm/papers.html

    --
    Detmar Meurers, Assistant Professor, Dept. of Linguistics, OSU
    201a Oxley Hall, 1712 Neil Avenue, Columbus OH 43210-1298, USA
    http://ling.osu.edu/~dm/                 GnuPG key on web page
    


