Dear Noemie,
You may be interested in the JRC-Acquis multilingual parallel corpus, which
provides alignments (sentence or paragraph) in all language combinations for
Czech, Danish, German, Greek, English, Spanish, Estonian, Finnish, French,
Hungarian, Italian, Lithuanian, Latvian, Maltese, Dutch, Polish, Portuguese,
Romanian, Slovak, Slovene and Swedish (190 language pair combinations); see
http://langtech.jrc.it/JRC-Acquis.html. It is freely available for research
purposes. Unfortunately, we do not know the source language of the
translations, but we are told that most of the time it is English or French.
I hope this is useful for your work.
With kind regards,
Ralf
Ralf Steinberger (Ralf.Steinberger@jrc.it)
European Commission - Joint Research Centre (JRC)
IPSC - SeS - Language Technology (http://langtech.jrc.it,
http://press.jrc.it/NewsExplorer)
-----Original Message-----
From: owner-corpora@lists.uib.no [mailto:owner-corpora@lists.uib.no] On
Behalf Of Nomi Guthmann
Sent: 01 March 2007 17:29
To: CORPORA@UIB.NO
Subject: [Corpora-List] Corpus of translated material
Dear corpora list members,
We are doing a project concerned with corpus-based translation studies.
For this purpose, we are trying to collect a corpus of translated
material in the target language. The main requirement is to know
exactly what the source language was. Otherwise, we are happy with
data in any language and of any domain. For example, parallel corpora
(not necessarily aligned) would be an excellent resource, provided
that we know what the source language is.
We would highly appreciate any suggestions and references you may
have. I will post a summary of the replies.
Thanks,
Noemie Guthmann
Translation and Interpreting Studies Department
Bar Ilan University