[Corpora-List] Teaching corpora for romance languages

From: Carlos Rodriguez (crodriguezp@gmail.com)
Date: Tue Apr 26 2005 - 16:25:20 MET DST

Next message: Raza Shahid: "[Corpora-List] Arabic Phoneme Segmentation"

Previous message: Andrius Utka: "Re: [Corpora-List] European Constitution in parallel"

Hi all,
I am trying to coordinate the compilation, adaptation and licensing of
various language resources (corpora, treebanks, ontologies) for
non-commercial use in teaching computational linguistics and Natural
Language Processing programming techniques in Romance languages, using
the Natural Language ToolKit (NLTK, at http://nltk.sf.net). NLTK is a
Python-based platform that already ships, alongside its processing
modules and for didactic purposes, sample data for English from the
Brown corpus and the Penn treebank, among other sources. We will soon
have some Spanish and Catalan datasets, interfaces and tutorial
translations available, but it would be great to also have Portuguese,
French, Italian, and so on. There is a gap in these teaching resources
for languages other than English, and this initiative can help fill it.
If anyone is interested in providing and licensing corpora and other
resources (formatted according to internationally and scientifically
accepted standards), please contact me at CRodriguezP@gmail.com.

Thanks,

Carlos Rodríguez
-----------------
IIMAS-National Autonomous University (Mexico)

This archive was generated by hypermail 2b29 : Tue Apr 26 2005 - 16:37:15 MET DST