[Corpora-List] Collocations from parallel corpora

From: Philippa Maurer-Stroh (aon.912130291@aon.at)
Date: Wed Jul 16 2003 - 12:13:58 MET DST

    Dear all,

    I've recently come across Frank Smadja et al.'s Xtract and Champollion and I wonder if the two programs are available for research purposes.

    I'm now doing test runs with Mike Barlow's ParaConc and Collocate as well as Bill Fletcher's kfngrams on an admittedly rather small sentence-aligned German-English parallel corpus (about 10,000 words per language). For my Ph.D., however, I am planning to work with Philipp Koehn's EU proceedings, with about 11 million words per language (does anyone know if it is also available already tagged?).

    Furthermore, following the discussion on legal aspects of corpus compilation & exploitation on this list, I'd like to know whether there are any legal problems concerning the use of the EU texts for (Ph.D.) research work.

    Thanks

    Philippa


