[Corpora-List] New resources from the BNC

From: Ylva Berglund (ylva.berglund@computing-services.oxford.ac.uk)
Date: Tue Sep 13 2005 - 17:10:56 MET DST

    New resources from the BNC

    We are pleased to announce the release of BNC Baby v 2 – a new CD
    containing three English XML corpora (BNC Baby, BNC Sampler and Brown)
    along with the latest release of the Xaira corpus search toolkit.

    Further information about the CD and how to obtain it can be found below
    and at http://www.natcorp.ox.ac.uk/babyinfo.html

    BNC Baby is intended for use in teaching and learning about language
    from a corpus perspective. Xaira is an open source indexing program,
    developed specifically to give students the ability to experiment with
    many kinds of searching strategies on many kinds of corpora. You can use
    the software on the CD to develop your own searchable XML corpora, as
    well as to search the sample corpora supplied with it.

    The BNC-Baby disk (second edition) contains:
    • BNC-Baby
    a subset of the British National Corpus. This contains four
    one-million-word samples, representing four major text types in Modern
    English: informal conversation, academic prose, fiction, and newspaper
    text. The texts are annotated with part-of-speech information and come
    with detailed metadata. Documentation of the corpus design and contents,
    and demonstration materials for using it in English language teaching,
    are also provided.
    • The BNC Sampler
    a different subset of the British National Corpus. This contains two
    one-million-word samples, representing spoken and written texts. These
    texts were all hand-tagged and corrected during production of the BNC.
    • A Standard Corpus of Present Day Edited American English (Brown)
    the original Brown corpus, converted to XML, with POS tagging and lemmata
    • Xaira
    a new search and retrieval program for use with these, or any other,
    XML corpora. The tool lets you search the corpora using the metadata
    and tagging. For more information about Xaira see
    http://xaira.sf.net.
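
    As a rough illustration of what searching on tagging and metadata can
    mean for an XML corpus, here is a minimal Python sketch. It does not use
    Xaira and does not reproduce the markup on the CD: the element names
    (text, s, w) and the attribute names (type, pos) are assumptions made up
    for this example, and the sample sentences are invented. Only the tag
    values follow the CLAWS5 style used for BNC part-of-speech annotation.

```python
import xml.etree.ElementTree as ET

# Hypothetical corpus fragment. The element and attribute names here are
# assumptions for illustration only, not the actual markup on the CD.
SAMPLE = """
<text type="fiction">
  <s n="1"><w pos="AT0">The</w> <w pos="NN1">corpus</w> <w pos="VBZ">is</w> <w pos="AJ0">small</w></s>
  <s n="2"><w pos="NN2">Students</w> <w pos="VVB">search</w> <w pos="NN2">texts</w></s>
</text>
"""


def find_words(xml_text, pos_prefix):
    """Return (word, tag) pairs whose part-of-speech tag starts with pos_prefix."""
    root = ET.fromstring(xml_text)
    return [
        (w.text, w.get("pos"))
        for w in root.iter("w")
        if w.get("pos", "").startswith(pos_prefix)
    ]


def text_type(xml_text):
    """Return the text-type metadata stored on the root element."""
    return ET.fromstring(xml_text).get("type")


nouns = find_words(SAMPLE, "NN")
genre = text_type(SAMPLE)
print(genre, nouns)
```

    A dedicated tool such as Xaira builds an index first, so it can answer
    this kind of query over millions of words without rereading every file;
    the sketch above simply scans one small document in memory.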

    Price: €30 per CD
    Special offer: Order 10 copies or more and pay only €10 per CD!
    (prices include standard airmail delivery charges but exclude VAT)
    Order the CD at: http://www.natcorp.ox.ac.uk/orderform.html

    -------------
    British National Corpus
    http://www.natcorp.ox.ac.uk/
    natcorp@oucs.ox.ac.uk
    -------------


