Xabier,
The multilingual (over 20 languages), wide-coverage Eurovoc thesaurus with
its approximately 6000 classes has a subset of about 60 science-oriented
classes, plus many related terms and classes in other domains that may also
be useful (e.g. politics, law, economics, trade, finance, social questions,
education, employment, transport, environment, agriculture, energy,
geography). The science-oriented classes provide the major science domains,
but may not be detailed enough for your purposes. Please check for
yourself.
Eurovoc is browsable at http://europa.eu/eurovoc/ and is available free for
research purposes. For details on where to get Eurovoc, see
http://langtech.jrc.it/0509_EU-Enlargement-Workshop.html#HOW_TO_GET_THE_AC_CORPUS_AND_EUROVOC.
Eurovoc was developed for manual cataloguing of mainly parliamentary
documents, but collections of multi-label classified documents such as the
JRC-Acquis (http://langtech.jrc.it/JRC-Acquis.html) have been used to train
an automatic multi-label Eurovoc classification system.
I hope this helps. All the best,
Ralf
Ralf Steinberger
European Commission - Joint Research Centre (JRC)
IPSC - SeS - Language Technology (http://langtech.jrc.it,
http://press.jrc.it/NewsExplorer)
T.P. 267, Via Fermi 1
21020 Ispra (VA), Italy
-----Original Message-----
From: owner-corpora@lists.uib.no [mailto:owner-corpora@lists.uib.no] On
Behalf Of Xabier Saralegi Urizar
Sent: 02 October 2006 12:26
To: CORPORA@uib.no
Subject: [Corpora-List] Standard ontology for document classification?
Dear all,
I want to classify many scientific documents into different categories
based on their knowledge area, such as health, geography...
My question is whether there is a standard ontology for such a
classification.
Regards,
--Xabier Saralegi Urizar
Elhuyar I+G+B
Zelai Haundi kalea, 3
Osinalde industrialdea
20170 Usurbil
(+34) 943 36 30 40
xabiers@elhuyar.com / www.elhuyar.org