Corpora: New: The Oslo Corpus of Bosnian Texts
Thu, 12 Feb 1998 12:17:41 +0100 (MET)


We are pleased to announce that the first version of the Oslo Corpus of
Bosnian Texts is now available on the Internet.

The Oslo Corpus of Bosnian Texts consists of approximately 1.5 million
words from different genres: fiction, essays, children's stories, folklore,
Islamic texts, legal texts, and newspapers and journals. The texts are
by authors from Bosnia and Herzegovina and were, for the most part,
written and published in the 1990s. To some extent, the corpus can
therefore serve as a basis for research on the post-war language of
Bosnia and Herzegovina.

The corpus has been encoded with the IMS Corpus Workbench (developed at the
Institut für Maschinelle Sprachverarbeitung at the University of
Stuttgart). The Text Laboratory has provided a web interface, so the
corpus can easily be queried over the Internet.

The corpus is only available for research purposes, and anyone wanting to
use it must state this purpose explicitly by following the instructions
given on the appropriate web pages.

General information about the Oslo Corpus of Bosnian Texts (including
information about its contents and how to obtain permission) can be found
on these web pages:

In English:
In Bosnian:

Dr.philos. Janne Bondi Johannessen    Tel: +47-22 85 68 14
The Text Laboratory                   E-mail:
Department of Linguistics             Fax: +47-22 85 69 19
University of Oslo                    Internet:
1102 Blindern
N-0317 Oslo, Norway