Hello.
We are currently working on sentence alignment in multilingual corpora. We use
the TEI guidelines to encode the corpora and to keep track of the alignments.
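As an illustration only, here is a minimal Python sketch of how sentence
alignments might be recorded with the TEI linking elements (linkGrp and link).
It is not our actual encoding: the sentence identifiers (en.s1, fr.s1, ...) and
the function name build_alignment are hypothetical, and the attribute name
"target" follows the more recent TEI P5 convention (earlier versions of the
guidelines used "targets").

```python
import xml.etree.ElementTree as ET


def build_alignment(pairs, link_type="alignment"):
    """Return a <linkGrp> element holding one <link> per aligned pair.

    `pairs` is a list of (source_id, target_id) tuples. The ids are
    hypothetical identifiers of sentence elements in two monolingual texts.
    """
    group = ET.Element("linkGrp", {"type": link_type})
    for src_id, tgt_id in pairs:
        # Each <link> points at the two aligned sentences by their ids.
        ET.SubElement(group, "link", {"target": f"#{src_id} #{tgt_id}"})
    return group


# Invented example: two English sentences aligned with two French ones.
pairs = [("en.s1", "fr.s1"), ("en.s2", "fr.s2")]
group = build_alignment(pairs)
xml_text = ET.tostring(group, encoding="unicode")
print(xml_text)
```

Keeping the links in a separate group, rather than inside the texts, leaves
each monolingual text untouched when an alignment is added or revised.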
The languages dealt with include some less widely spoken community languages,
out of personal choice. For the time being, they are English, French, German,
Italian, Greek (!) and Danish. We are considering adding Spanish and
Portuguese.
A demonstration is available on the Internet at the following URL:
http://www.loria.fr/~bonhomme/lingua
As future work, we are developing an interface (running first on Unix/X Window,
and later on Windows and Apple) for manipulating multilingual corpora, built on
the principle of a toolbox giving access to a corpus.
This interface will incorporate:
- the TEI guidelines,
- tools for terminology and lexicology,
- the multilingual aspects of the texts,
- text import/export (txt <-> TEI <-> html <-> ...),
- the multilingual alignments,
- ...
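To make the import/export item concrete, here is a rough Python sketch of a
txt <-> TEI <-> html conversion chain. It is only an assumption about how such
a chain could look, not a description of the interface itself: the function
names are hypothetical, and the TEI produced is reduced to a bare
text/body/p skeleton without a header.

```python
import xml.etree.ElementTree as ET


def txt_to_tei(text):
    """Wrap each non-empty line of plain text in a <p> inside <text><body>."""
    root = ET.Element("text")
    body = ET.SubElement(root, "body")
    for line in text.splitlines():
        if line.strip():
            ET.SubElement(body, "p").text = line.strip()
    return root


def tei_to_txt(tei_root):
    """Export the paragraphs back to plain text, one paragraph per line."""
    return "\n".join(p.text or "" for p in tei_root.iter("p"))


def tei_to_html(tei_root):
    """Export the paragraphs as a minimal HTML document."""
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    for p in tei_root.iter("p"):
        ET.SubElement(body, "p").text = p.text
    return html


tei = txt_to_tei("Hello.\n\nBonjour.\n")
plain = tei_to_txt(tei)
html_text = ET.tostring(tei_to_html(tei), encoding="unicode")
print(plain)
print(html_text)
```

Using the TEI form as the pivot means each additional format needs only one
importer and one exporter.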
This software (name: Xcorpus) will be subject to the terms of the CNRS license
agreement.
If you need more information about our work, please e-mail me.
********************************************************
* patrice.bonhomme@loria.fr * Office : B.228 *
* http://www.loria.fr/~bonhomme * Phone : 83 59 20 37 *
********************************************************