[Corpora-List] Texts 1900-1970

From: Chris Butler (csblists@telefonica.net)
Date: Thu Dec 15 2005 - 08:55:02 MET

    My thanks to the following people, who all provided information on the
    availability of texts: Wendy Anderson, Carmela Chateau, Constantin Orasan,
    Raf Salkie, Dirk Siepmann, Pedro Ureña, Romain Vanoudheusden. The sources
    which were suggested are as follows:

    There are old (and some recent) texts at Project Gutenberg.
    www.gutenberg.org/

    The Public Library of Science has open-access texts.
    http://www.plos.org/about/openaccess.html

    A selection of online math textbooks
    http://www.math.gatech.edu/~cain/textbooks/onlinebooks.html

    The Intratext digital library (contains many religious texts, as well as a
    lot of literature)
    http://www.intratext.com/

    The SCOTS Corpus (which is freely accessible and searchable at
    www.scottishcorpus.ac.uk) contains texts in Scottish English (as well as
    dialects of Scots), from 1940 to the present day.

    The New York Times Archive
    (http://pqasb.pqarchiver.com/nytimes/advancedsearch.html) goes back to the
    19th century.

    The collection of texts hosted by archive.org
    (http://www.archive.org/details/texts) includes texts from Project
    Gutenberg.

    The Victorian Literary Studies archive at
    http://victorian.lang.nagoya-u.ac.jp/index.html, which has a list of authors
    at http://victorian.lang.nagoya-u.ac.jp/concordance.html

    The archive at www.questia.com

    ******

    I'd also like to mention the Corpus of Late Modern English Texts compiled by
    Hendrik de Smet at the Catholic University of Leuven
    (http://perswww.kuleuven.be/~u0044428/), a principled collection of texts
    (10 million words, 1720-1920) drawn from archives such as Project Gutenberg
    and the Oxford Text Archive. A username and password must be obtained from
    Hendrik (Hendrik.desmets@arts.kuleuven.be) in order to access the corpus.

    Chris Butler
    Honorary Professor, University of Wales Swansea, UK


