Corpora: Corpus of spoken Bulgarian

Kjetil Ra Hauge
Thu, 9 Oct 1997 11:30:21 +0200

A corpus of spoken Bulgarian (approximately 95 000 word tokens) is now
available at:

It consists of conversations in family contexts, recorded with a hidden
microphone (and with the subsequent permission of the persons involved). It
was recorded and transcribed by Krasimira Aleksova of Sofia University for
her dissertation _Ezikovi procesi v semejstvoto_ (1994), and has been made
available by her for free use for research purposes. So far, the corpus is
presented as a sequence of text files, but an interactive concordancer will
be made available later.

Aleksova's "avtoreferat" of her dissertation is also available from this site.

--- Kjetil Ra Hauge, U. of Oslo.
--- Tel. +47/22 85 67 10, fax +47/22 85 41 40