[Corpora-List] American and British English

From: Nancy Ide (ide@cs.vassar.edu)
Date: Fri Nov 03 2006 - 20:14:47 MET


    On Nov 3, 2006, at 5:15 AM, Eric Atwell wrote:
    >
    > I wonder if American corpora eg ANC have evidence of British
    > spellings?
    >

    We do indeed. A quick sampling of the 22 million words in the ANC so
    far gave us about 240 instances of "colour" and 160 of "behaviour".
    Some were in quotations and several were in a blog which is
    supposedly "guaranteed" to be produced by native speakers of American
    English. A few others were in the Berlitz Travel guides written
    especially for an American audience.

    More generally, as Paul Heacock pointed out, the differences between
    British and American English are becoming increasingly obscure,
    although we see continued differences in syntactic structures,
    adverbial usage, etc., as in "She could not endure to live with him"
    vs. "She could not endure living with him", "Immediately I get home"
    vs. "As soon as I get home", and of course the famous "make a
    decision" vs. "take a decision".

    Even worse for us, a definition of American English is becoming very
    hard to provide: we could not get a firm definition of a native
    speaker of American English from either the American Dialect Society
    or the LSA.
    Furthermore, with the influx of so many non-native English speakers
    who are learning and speaking English here in the US, we see the
    emergence of a brand of English spoken primarily (only?) in the US
    that is not exactly like what we might regard as "native American"
    English. The emergence of "Chicano English" is one obvious example,
    but this is slowly broadening to other language groups.

    In the future we plan to include in the ANC data that may not have
    been produced by those we have so far considered to be native
    speakers of American English, but we *hope* to identify, where
    possible, the linguistic background of the producer.

    Nancy Ide

    =======================================================
    Nancy Ide

    Professor and Chair
    Department of Computer Science
    Vassar College
    Poughkeepsie, New York 12604-0520
    USA

    tel: (+1 845) 437 5988
    fax: (+1 845) 437 7498
    email: ide@cs.vassar.edu
    http://www.cs.vassar.edu/~ide
    =======================================================


