Re: [Corpora-List] starting a machine translation project

From: Felipe Sánchez Martínez (fsanchez@dlsi.ua.es)
Date: Wed Sep 13 2006 - 12:01:08 MET DST


    Hi,

    The Transducens Group at the University of Alicante, under the
    supervision of Mikel L. Forcada, has been working on an open-source MT
    engine for related languages (such as Portuguese and Spanish) called
    Apertium (http://apertium.sourceforge.net). Right now we are working to
    enhance the MT architecture so that it can handle less closely related
    language pairs such as Catalan<->English or Spanish<->English.

    Apertium is a rule-based MT system, so it does not need parallel
    corpora; instead, you provide monolingual and bilingual
    dictionaries and structural transfer rules.
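
To give a rough idea of how these three kinds of data work together, here is a toy sketch in Python. It is only an illustration: it is not Apertium's data format or API, and the tiny Spanish-to-English lexicon, the function names and the single reordering rule are all made up for the example. For brevity the bilingual dictionary maps straight to target surface forms, so no separate generation step is shown.

```python
# Toy rule-based MT pipeline: analysis -> lexical transfer -> structural transfer.
# All data and names below are hypothetical and exist only for this example.

# Monolingual dictionary: source surface form -> (lemma, part of speech)
MONODIX = {
    "el": ("el", "det"),
    "gato": ("gato", "n"),
    "negro": ("negro", "adj"),
    "duerme": ("dormir", "v"),
}

# Bilingual dictionary: source lemma -> target word (already inflected here)
BIDIX = {"el": "the", "gato": "cat", "negro": "black", "dormir": "sleeps"}


def analyse(sentence):
    """Look up each source word in the monolingual dictionary."""
    return [MONODIX[word] for word in sentence.lower().split()]


def lexical_transfer(analysed):
    """Replace each source lemma with its target word, keeping the tag."""
    return [(BIDIX[lemma], pos) for lemma, pos in analysed]


def structural_transfer(units):
    """Apply one structural rule: noun + adjective -> adjective + noun."""
    out = list(units)
    i = 0
    while i < len(out) - 1:
        if out[i][1] == "n" and out[i + 1][1] == "adj":
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2  # skip past the pair that was just reordered
        else:
            i += 1
    return out


def translate(sentence):
    """Run the three stages and join the target words."""
    units = structural_transfer(lexical_transfer(analyse(sentence)))
    return " ".join(word for word, _ in units)


print(translate("el gato negro duerme"))
```

Adding a new language pair in this style means writing more dictionary entries and more rules rather than collecting aligned text, which is the practical difference from the statistical approach.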

    Perhaps you could start with Apertium as it is: by the end of this
    year a more powerful engine will be available, and it will be
    compatible with the data you develop for the current version of
    Apertium.

    Please feel free to contact my thesis advisor (Mikel L. Forcada,
    mlf@dlsi.ua.es); we are interested in Indonesian-to-English and
    Indonesian-to-Malay.

    Regards,

    -- 
    Felipe Sánchez Martínez
    -------------------------------------------------------------------
    Departamento de Lenguajes       E-mail: fsanchez@dlsi.ua.es
    y Sistemas Informáticos       Homepage: www.dlsi.ua.es/~fsanchez
    Universidad de Alicante            Fax: +34 965 90 93 26
    E-03071 Alicante (Spain)         Phone: +34 965 90 34 00, ext: 2038
    

    On Wed, 13-09-2006 at 15:26 +0700, Nano Surbakti wrote:
    > Hi,
    >
    > We want to start an English-Indonesian MT project. We found that
    > there is an opensource MT toolkit, "Moses", in http://www.statmt.org
    >
    > I don't know much about machine translation. From some articles I've
    > been reading, it looks like Statistical translation method is a rather
    > easy but yet produce a reasonable result.
    >
    > I got some newbie-like questions:
    > - Our main purpose is to make an opensource English-to-Indonesian MT,
    > can we use Moses for this purpose, or perhaps Moses is specific for
    > Foreign-to-English translation only?
    > - AFAIK, we have to provide bilingual corpus to do the statistical
    > training. Some articles mentioned about "phrase translation". Do we
    > need to provide some kind of phrase table, or perhaps it is generated
    > automatically by a special program?
    > - If we can't use Moses, do you have some guidance for us, perhaps
    > like some pointers to opensource toolkit?
    > - As a rough prediction, how many months is it going take to develop
    > an "early-version" of English-to-ForeignLanguage MT ?
    >
    >
    > Regards,
    >
    > --
    > Nano Surbakti
    > (sorry if you got double posting)


