Here are the results, along with a list of the people who were kind enough
to send me suggestions and hints in response to my question posted to the
Linguist List on Friday, 18th September.
My question was as follows:
>Dear list members,
>it's a while I'm looking for Spanish Corpora of business Spanish. Does
>anybody know if there are Spanish Newspapers on CD-ROM (eg. all the
>of one year as it is possible for the german newspaper Sueddeutsche
>Zeitung)? I tried to contact EL PAIS but never received an answer.
>Actually, I would be interested in any kind of Corpus of contemporary
>Spanish (mainly european), - to buy or not to buy - but 'economia'-
>arguments would be even greater.
>Thank you for an answer. Of course, I will post a message with the
I want to thank:
Andreas Eisele
Antoine Consigny
Valerie Mapelli
Iain Downs
Purificacion Fdez-Nistal
Susana Sotelo Docio
Eva Easton
Leonel Ruiz Miyares
José Luis Sancho
Raphael Salkie
René Schneider
The summary of the results:
Among the commercial providers of corpora there is ELRA;
they have a multilingual corpus (MLCC) consisting of 6 European financial
newspapers (Het Financieele Dagblad, Handelsblatt, Financial Times, Le
Monde, Il Sole 24 Ore, Expansion); the Spanish subcorpus (Expansion) has
about 10 million words (21.10.1991-24.10.91 and 14.5.94-27.12.94). The
entire corpus is available via ELRA at the following prices:
- For ELRA members for research use: 360 ECU
- For non members for research use: 750 ECU
Another commercial publisher of research material and provider
of newspapers on CD-ROM is NewsBank. They offer Noticias en Espanol on
monthly CD-ROMs:
Yet another commercial service is ProQuest; they seem to have El Norte and
Reforma (Mexico).
There must be a CD-ROM edition of the 1994 volume of El Mundo (on two
disks); the text is in ASCII format and classified by category (economy,
national, etc.); I'm not sure if it is still available.
There is a collection of links to Spanish online newspapers at:
There is a website with corpora FAQs by the Language Technology Group
(the tools section is the interesting one, I guess):
El Observatorio Español de Industrias de la Lengua could be interesting;
it also has some more links: (click on "recursos")
There are several corpora available at the Department of Romance Languages
of the University of Goeteborg (Banco de datos de Prensa Espanola 1977,
Banco de Datos de Once Novelas Espanolas 1951-1971, A Concordance based on
the Corpus oral de referencia del Espanol contemporaneo).
Professor Barry Ife, at the School of Humanities, King's College London,
is reported to be compiling a large corpus of modern Spanish.
There is a Spanish newspaper corpus consisting of 200 texts from
Latin American newspapers on CD-ROM (TIFF and ASCII versions). The corpus
includes 39,081 tokens and is available (for purchase) from the
Information Science Research Institute / University of Nevada at Las Vegas
4505 Maryland Parkway
Las Vegas, Nevada 89154-4201
For information, contact ISRI by
Phone: +1 702 895-3338
Fax: +1 702 895-1560
At the University of Murcia there is the CUMBRE Corpus: Contact Prof.
Aquilino Sanchez:
The CRATER corpus consists of morphosyntactically tagged communication:
Dr. Purificacion Fdez-Nistal and the Instituto de Terminologia Bilingue
y Traduccion Especializada (ITBYTE) at the Universidad de Valladolid (Spain)
are in the process of building their own corpus.
Ing. Leonel Ruiz Miyares (Director of the Applied Linguistics Centre,
Santiago de Cuba) keeps a Spanish corpus of children's vocabulary
(by the way, there is a European Spanish Corpus of child language, the
The Lingua project (an EU-funded project on multilingual concordancing)
so far has only English, French, German, Italian, Greek and Danish
texts; they are considering bringing in Spanish and Portuguese:
Thanks a lot,
Eva Remberger
--
Sprachliche Informationsverarbeitung
Eva Maria Remberger
Philosophische Fakultaet
Universitaet zu Koeln
Albertus-Magnus-Platz
D-50923 Koeln