[Corpora-List] JRC Workshop on Exploiting parallel corpora in up to 20 languages

From: Ralf Steinberger (ralf.steinberger@jrc.it)
Date: Wed Jun 15 2005 - 10:03:53 MET DST


    Call for contributions / Call for participation

     

    Please distribute widely

     

    - JRC Ispra (Northern Italy), 26-27 September 2005 (Monday and
    Tuesday)

    - Travel expenses and daily allowance will be reimbursed

    - Focus on the participation from the new and future EU Member
    States and their languages

    - Parallel corpus in up to twenty languages of the 'Acquis
    Communautaire' (AC)

    - Workshop web page:
    http://www.jrc.it/langtech/0509_EU-Enlargement-Workshop.html

     

     

    The European Commission's Joint Research Centre (JRC) will hold a workshop
    on the exploitation of parallel corpora available for the twenty official
    EU languages and is seeking scientists who can actively contribute by
    presenting tools and ideas. We also welcome people who would like to
    participate in the workshop without giving a presentation.
    We are particularly interested in dealing with the new EU languages (Czech,
    Estonian, Hungarian, Latvian, Lithuanian, Maltese, Polish, Slovene, Slovak)
    and in creating resources for these languages, as well as in
    language-independent or knowledge-poor methods.

     

    Applications and resources of interest include, but are not restricted to:

     

    - sentence, phrase and word alignment

    - term and collocation extraction

    - generation of bilingual or multilingual dictionaries

    - automatic thesaurus construction and word clustering

    - information extraction

    - training and tuning of Machine Learning systems for statistical
    MT and more

    - automatic classification methods using the Eurovoc thesaurus

    - usage of the generated resources for real-life applications

     

    This workshop is part of the European Commission's effort to integrate
    scientists from the new and future EU Member States
    (http://www.jrc.cec.eu.int/enlargement/action2005/index.htm) into the
    so-called European Research Area
    (http://www.jrc.cec.eu.int/default.asp@sidsz=what_we_do&sidstsz=european_research_space.htm).
    For this reason, we
    are particularly looking for participants from the new EU Member States,
    from Candidate and Acceding Countries and from Potential Candidate Countries
    (Western Balkans).

     

    THE CORPUS

     

    When ten new countries joined the European Union in 2004, they had to
    translate and ratify an existing collection of about ten thousand EU legal
    documents covering a large variety of subject areas. This document
    collection is referred to as the 'Acquis Communautaire' (AC). The JRC has
    collected large parts of this collection and intends to exploit it
    to build multilingual term dictionaries. Because the AC (like most other
    EC documents) has been classified according to the multilingual subject
    domain classification system Eurovoc
    (http://europa.eu.int/celex/eurovoc/), it should be possible to
    automatically generate subject-specific term dictionaries. For some
    information about the AC corpus, see:

     

         Tomaz Erjavec, Camelia Ignat, Bruno Pouliquen & Ralf Steinberger (2005).
         Massive multilingual corpus compilation; Acquis Communautaire and totale.
         In: 2nd Language & Technology Conference: Human Language Technologies
         as a Challenge for Computer Science and Linguistics (L&T'05). Poznań,
         Poland, 21-23 April 2005. Available at
         http://www.jrc.cec.eu.int/langtech/

     

    PLACE AND DATE

     

    We plan to hold the workshop on Monday and Tuesday, 26-27 September 2005, at
    the Joint Research Centre in Ispra. Ispra is located on Lago Maggiore, about
    60 km west of Milan. The nearest airport is Milano Malpensa. For
    more details, see
    http://www.jrc.cec.eu.int/langtech/WorkatJRC.html#JRC-Ispra.

     

    EXPRESSIONS OF INTEREST

     

    If you are interested in participating in this workshop, please send a
    message to Ralf.Steinberger@jrc.it before 27 June. If you can give a
    presentation, please attach an abstract of what you propose to present. If
    you prefer to simply attend the workshop, please explain in a few lines why
    you are interested in this workshop. We plan to let you know about your
    acceptance by mid-July.

     

    CONDITIONS OF REIMBURSEMENT

     

    Participants giving a presentation and participants from the new EU Member
    States (with or without presentation) will be reimbursed for their travel
    costs and will receive a daily allowance of 149 Euro for each of
    the two working days. Participants will have to pay for the hotel and for
    all other expenses out of this daily allowance. About 30 participants will
    be reimbursed, with a maximum of two persons per EU Member State. Additional
    persons can participate at their own cost. Please note that
    reimbursement usually takes several months, but the JRC can pre-pay the
    travel tickets.

     

    CONTACT

    For scientific-technical issues and requests for participation, please
    contact Mr. Ralf Steinberger (Ralf.Steinberger@jrc.it).

    For organisational issues, travel, accommodation, reimbursement, etc.,
    another contact person will soon be announced on the workshop web page.

    Ralf Steinberger (Ralf.Steinberger@jrc.it)

    IPSC - SeS - Language Technology (http://www.jrc.it/langtech)

    T.P. 267, Via Fermi 1

    21020 Ispra (VA), Italy

    Tel: +39 0332 78 6271

    Fax: +39 0332 78 5154

    Secretary D. Negri: +39 0332 78 5648

     


