Corpora: corpus standards

Nancy M. Ide (ide@cs.vassar.edu)
Thu, 29 Apr 1999 09:45:17 -0400 (EDT)

Vladimir Rykov writes:
>
> Dear list members!
>
> I am trying to build a corpus of Belorussian newspaper
> language.
>
> Could anyone send me something on standards, samples, lecture
> notes, or any other experience that would help make it standard, compatible,
> and accessible to any corpus researcher?
>
>

Take a look at the Corpus Encoding Standard (CES) at

http://www.cs.vassar.edu/CES

This EAGLES standard provides encoding conventions and a data architecture
for corpora intended for NLP and corpus linguistics research.

-----------------------------------------------------------------------------

Nancy Ide
Professor and Chair                 Tel:    (+1 914) 437 5988
Department of Computer Science      Fax:    (+1 914) 437 7498
Vassar College                      WWW:    http://www.cs.vassar.edu/~ide
124 Raymond Avenue                  E-mail: ide@cs.vassar.edu
Poughkeepsie, New York 12604-0520 USA

-----------------------------------------------------------------------------