Re: [Corpora-List] Spanish reference corpus

From: Serge Sharoff (s.sharoff@leeds.ac.uk)
Date: Fri Feb 02 2007 - 09:24:50 MET


    Yes, frequency lists are also available:
    http://corpus.leeds.ac.uk/frqc/internet-es-forms.num (for word forms)
    http://corpus.leeds.ac.uk/frqc/internet-es.num (for lemmas, though the
    results of automatic lemmatisation should be treated with caution).

    BTW, the frequencies (the second column) are in terms of ipm (instances
    per million words).
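
    To make the ipm unit concrete, here is a minimal Python sketch of
    reading such a list and converting ipm back to approximate raw
    counts. Only "frequency in the second column" comes from this
    message. The three-column layout (rank, ipm, item), the names
    parse_frequency_lines and ipm_to_count, and the sample lines are
    assumptions made for illustration. The corpus size is the rounded
    "about 120 million words" quoted further down in this thread.

```python
from typing import Iterable, Iterator, NamedTuple


class Entry(NamedTuple):
    rank: int    # assumed: first column is the rank
    ipm: float   # second column: instances per million words
    item: str    # assumed: third column is the word form or lemma


def parse_frequency_lines(lines: Iterable[str]) -> Iterator[Entry]:
    """Yield entries from whitespace-separated lines, skipping any line
    that does not fit the assumed rank / ipm / item layout."""
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue
        try:
            yield Entry(int(parts[0]), float(parts[1]), parts[2])
        except ValueError:
            continue


def ipm_to_count(ipm: float, corpus_size: int = 120_000_000) -> float:
    """Convert ipm to an approximate raw count for a corpus of the given
    size (default: the rounded 120 million words, not an exact figure)."""
    return ipm * corpus_size / 1_000_000


# Invented sample lines, not taken from the real frequency list.
sample = ["1 50000.00 de", "2 30000.50 la", "not a data line"]
entries = list(parse_frequency_lines(sample))
```

    Under these assumptions an item listed at 50 ipm corresponds to
    roughly 50 * 120 = 6000 occurrences in the corpus.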

    Serge

    On Thu, 2007-02-01 at 14:17 +0100, Mario Crespo Miguel wrote:
    > Thank you very much for your help, but it would be more
    > convenient for me if the word frequencies of this open-domain /
    > general corpus could be obtained. Does anybody know whether
    > such information is available in some way? Best,
    >
    > Mario
    >
    >
    >
    > On 30 Jan 2007 at 16:10, Serge Sharoff <s.sharoff@leeds.ac.uk>
    > wrote:
    >
    > > one answer is the Spanish Internet corpus with the interface from
    > > http://corpus.leeds.ac.uk/internet.html
    > > and the URL list
    > > http://corpus.leeds.ac.uk/internet/final-url-es.gz
    > >
    > > This is a random snapshot of the Spanish Internet of about 120
    > > million words, see
    > > Sharoff, S (2006) Creating general-purpose corpora using automated
    > > search engine queries. In Marco Baroni and Silvia Bernardini,
    > > editors, WaCky! Working papers on the Web as Corpus. Gedit, Bologna.
    > > http://wackybook.sslmit.unibo.it/
    > >
    > > S
    > >
    > > On Tue, 2007-01-30 at 15:54 +0100, Mario Crespo Miguel wrote:
    > >> Dear everybody,
    > >>
    > >> Thank you again for all the help that I always get with this
    > >> mailing list, and this time I would like to ask if there is
    > >> some reference / open-domain corpus for Spanish which is freely
    > >> available and could be downloaded. Thank you in advance. Best
    > >> wishes,
    > >>
    > >> Mario Crespo Miguel


