[Corpora-List] ACL 2007: Second Workshop on Statistical Machine Translation

From: Philipp Koehn (pkoehn@inf.ed.ac.uk)
Date: Wed Feb 14 2007 - 15:19:52 MET

    ACL 2007: SECOND WORKSHOP ON
    STATISTICAL MACHINE TRANSLATION

    Saturday, June 23, 2007
    http://www.statmt.org/wmt07/

    Translating documents between two different languages by computer
    has been one of the oldest goals in computational linguistics. Now,
    armed with vast amounts of translated text and powerful computers,
    we are witnessing significant progress toward achieving that goal.

    Statistical methods allow the analysis of parallel corpora and the
    automatic construction of machine translation systems. For some
    language pairs such as Chinese-English or Arabic-English,
    statistical machine translation (SMT) systems built at research labs
    currently outperform commercial systems.

    This workshop focuses on statistical and hybrid methods for
    machine translation and features a shared translation task. The
    evaluation of machine translation systems is a growing field and
    this workshop will also focus on determining the best methodology
    for evaluating translation quality both with automatic metrics and
    through subjective human evaluation.

    This workshop builds on the success of the 2005 ACL Workshop
    on Parallel Text and the 2006 NAACL Workshop on Statistical
    Machine Translation.

    Topics of interest include, but are not limited to:
       * word-based, phrase-based, syntax-based SMT
       * using comparable corpora for SMT
       * using morphological and POS information for SMT
       * integration of rule-based MT and statistical MT
       * decoding
       * error analysis
       * evaluation techniques for MT

    SHARED TASK

    In addition to soliciting research papers on the topics listed above,
    the workshop will also feature a shared translation task. The
    workshop organizers will provide common test sets for translation
    between four language pairs in both directions:

       * English-German and German-English
       * English-French and French-English
       * English-Spanish and Spanish-English
       * English-Czech and Czech-English

    Participants may submit translations for any or all of the language
    directions. In addition to the common test sets, the workshop
    organizers will provide optional training resources, including a
    newly expanded release of the Europarl corpora and additional
    out-of-domain corpora.

    All participants who submit entries will have their translations
    evaluated. In addition to automatic scoring, we will also evaluate
    translation performance by human judgment. To facilitate the
    human evaluation we will require participants in the shared task
    to manually judge some of the submitted translations.

    A more detailed description of the shared task (including
    information about the test and training corpora, a freely
    available MT system, and a number of other resources) is
    available from http://www.statmt.org/wmt07/shared-task.html .
    We also provide a baseline machine translation system, whose
    performance matches the best systems from last year's shared
    task.

    SUBMISSION INFORMATION

    Submissions will consist of regular full papers of max. 8 pages,
    formatted following the ACL 2007 guidelines. Authors of regular
    full papers will be required to indicate a track for their submission.
    In addition, teams participating in the shared task will be invited
    to submit short papers (max. 4 pages) describing their systems.
    Both submission and review processes will be handled electronically.

    We encourage individuals who are submitting research papers to
    evaluate their approaches using the training resources provided by
    this workshop, so that their experiments can be repeated by others
    using these publicly available corpora.

    Given the overlap of the paper submission time frame with that of
    EMNLP 2007, we accept papers that are also submitted to the
    EMNLP conference, but would like to know as soon as possible
    after the notification if an accepted paper will be withdrawn.

    IMPORTANT DATES

    Regular paper submissions                 April 2
    (shared task) Results submissions         March 30
    (shared task) Short paper submissions     April 6
    Notification                              April 23
    Camera-ready papers                       May 9

    ORGANIZERS

    Philipp Koehn (University of Edinburgh)
    Christof Monz (University of London)
    Cameron Shaw Fordyce (Center for the Evaluation of Language and
    Communication Technologies)
    Chris Callison-Burch (University of Edinburgh)

    PROGRAM COMMITTEE

    Lars Ahrenberg (Linköping University)
    Francisco Casacuberta (University of Valencia)
    Colin Cherry (University of Alberta)
    Stephen Clark (Oxford University)
    Brooke Cowan (Massachusetts Institute of Technology)
    Mona Diab (Columbia University)
    Chris Dyer (University of Maryland)
    Andreas Eisele (University Saarbrücken)
    Marcello Federico (ITC-IRST)
    George Foster (Canada National Research Council)
    Alex Fraser (ISI/University of Southern California)
    Ulrich Germann (University of Toronto)
    Rebecca Hwa (University of Pittsburgh)
    Kevin Knight (ISI/University of Southern California)
    Philippe Langlais (University of Montreal)
    Alon Lavie (Carnegie Mellon University)
    Lori Levin (Carnegie Mellon University)
    Daniel Marcu (ISI/University of Southern California)
    Bob Moore (Microsoft Research)
    Miles Osborne (University of Edinburgh)
    Michel Simard (Canada National Research Council)
    Eiichiro Sumita (ATR Spoken Language Translation Research Laboratories)
    Jörg Tiedemann (University of Groningen)
    Christoph Tillmann (IBM Research)
    Dan Tufiş (Romanian Academy)
    Taro Watanabe (NTT)
    Dekai Wu (HKUST)
    Richard Zens (RWTH Aachen)

    CONTACT
    For questions, comments, etc. please send email to pkoehn@inf.ed.ac.uk


