Re: [Corpora-List] Spanish reference corpus

From: Mario Crespo Miguel (mario.crespo@uca.es)
Date: Thu Feb 01 2007 - 14:17:17 MET


    Thank you very much for your help, but it would be more
    convenient for me if the word frequencies for this open-domain /
    general corpus could be obtained. Does anybody know whether
    such information is available anywhere? Best,

    Mario

    On 30 Jan 2007 at 16:10, Serge Sharoff <s.sharoff@leeds.ac.uk>
    wrote:

    > One answer is the Spanish Internet corpus, with the interface at
    > http://corpus.leeds.ac.uk/internet.html
    > and the URL list at
    > http://corpus.leeds.ac.uk/internet/final-url-es.gz
    >
    > This is a random snapshot of the Spanish Internet of about 120
    > million words; see
    > Sharoff, S. (2006) Creating general-purpose corpora using
    > automated search engine queries. In Marco Baroni and Silvia
    > Bernardini, editors, WaCky! Working papers on the Web as Corpus.
    > Gedit, Bologna.
    > http://wackybook.sslmit.unibo.it/
    >
    > S
    >
    > On Tue, 2007-01-30 at 15:54 +0100, Mario Crespo Miguel wrote:
    >> Dear everybody,
    >>
    >> Thank you again for all the help I always get from this
    >> mailing list. This time I would like to ask whether there is
    >> a reference / open-domain corpus for Spanish that is freely
    >> available for download. Thank you in advance. Best
    >> wishes,
    >>
    >> Mario Crespo Miguel


