Re: [Corpora-List] starting a machine translation project

From: Francis Bond (fcbond@gmail.com)
Date: Thu Sep 14 2006 - 02:58:54 MET DST


    G'day,

    There was some work on Indonesian MT in the CICC project
    <http://www.cicc.or.jp/english/kyoudou/mt.html>, which ended up with a
    fairly useful Indonesian-English lexicon on CD-ROM. If you email CICC
    then you should be able to get a copy. There are two relevant
    CDs, the Indonesian one, and the terminological one, which includes a
    technical lexicon for English, Malay, Thai, Indonesian, Chinese and
    Japanese.

    A lexicon is essential for rule-based MT and also a useful tool for
    aligning and backing-off in statistical MT.

    -- 
    Francis Bond  <www.kecl.ntt.co.jp/icl/mtg/members/bond/>
    NTT Communication Science Laboratories | Natural Language Research Group
    

    P.S. Here is a sample entry:

    @2110IVMT &2111mengabadikan &2112135 &2113630 #2100to preserve; to keep alive; to immortalize #2101IVABS #2110to perpetuate #2111IVABS #2120to memorialize #2121IVABS #2130to capture (on canvas, etc.) #2131IVABS #2140to take a picture of; to photograph #2141IVABS
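
    A minimal sketch of how an entry like this might be read, in Python.
    The field layout is only a guess from this single sample, not from any
    CICC documentation: it assumes each field starts with a marker
    (@, & or #) followed by a four-digit code and then a value, that #
    codes ending in 0 hold English glosses separated by semicolons, and
    that the following # code ending in 1 holds a tag for that sense. The
    names parse_fields and group_senses are made up for illustration.

```python
import re

# Assumed field shape: marker (@, & or #), four-digit code, then the value,
# running up to the next marker+code or the end of the line.
FIELD = re.compile(r"([@&#])(\d{4})(.*?)(?=\s+[@&#]\d{4}|\s*$)")


def parse_fields(entry):
    """Split one entry line into (marker, code, value) tuples."""
    return [(m.group(1), m.group(2), m.group(3).strip())
            for m in FIELD.finditer(entry)]


def group_senses(fields):
    """Pair each gloss field (#...0) with its tag field (#...1).

    The pairing rule is an assumption based on the sample entry only.
    """
    grouped = {}
    for marker, code, value in fields:
        if marker != "#":
            continue
        slot = grouped.setdefault(code[:3], {"glosses": [], "tag": None})
        if code.endswith("0"):
            slot["glosses"] = [g.strip() for g in value.split(";")]
        else:
            slot["tag"] = value
    return [(s["glosses"], s["tag"]) for s in grouped.values()]


sample = (
    "@2110IVMT &2111mengabadikan &2112135 &2113630 "
    "#2100to preserve; to keep alive; to immortalize #2101IVABS "
    "#2110to perpetuate #2111IVABS #2120to memorialize #2121IVABS "
    "#2130to capture (on canvas, etc.) #2131IVABS "
    "#2140to take a picture of; to photograph #2141IVABS"
)

fields = parse_fields(sample)
senses = group_senses(fields)

for glosses, tag in senses:
    print(tag, glosses)
```

    If the real format differs (for instance, if codes are not always four
    digits), the regular expression would need adjusting before running
    this over the whole CD.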


