Hi,
I was wondering if anyone could point me to domain-specific corpora with the
following characteristics:
1.- Written texts (ASCII, XML, TXT, PDF; they need not be tagged) from
specialized or technical domains.
2.- Open source or reasonably priced, and downloadable for processing
(corpora that are only web-accessible through proprietary interfaces won't cut it).
3.- If possible, with machine-readable or electronic lexicons or
dictionaries available for the domain represented by the corpora.
I am thinking about experimenting with techniques for lexical acquisition.
Thanks and best to all,
Carlos Rodríguez