Re: [Corpora-List] estimates of written/spoken input

From: Paul Bennett (paul.bennett@manchester.ac.uk)
Date: Mon Nov 28 2005 - 11:15:10 MET


    Geoffrey Pullum and Barbara Scholz (in Linguistic Review 19, 2002, p. 44) cite
    evidence that by the age of three a child in a professional household might
    have heard 30 million word tokens (but far fewer for children in other social
    classes). I know this relates to children rather than adults, but presumably
    the amount of language heard does not differ much by age.

    Their source is B. Hart and T. Risley: Meaningful Differences in the Everyday
    Experiences of Young Children (Paul H Brookes, 1995). I haven't read this, but
    I guess this would be a place to look for more information.
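
As a rough back-of-envelope sketch of what the cited figure implies, the conversion below turns 30 million tokens by age three into per-year and per-day rates and compares them with a BNC-sized corpus. It assumes a constant rate over the three years, 365-day years, a BNC size of roughly 100 million words (a figure not given in this thread), and, as presumed above, that adult input is of a similar order. The function names are illustrative only.

```python
# Back-of-envelope conversion of the figure cited above.
TOKENS_BY_AGE_THREE = 30_000_000  # cited from Pullum and Scholz (2002)
YEARS = 3
DAYS_PER_YEAR = 365               # assumption: leap years ignored
BNC_TOKENS = 100_000_000          # assumption: approximate BNC size, not stated in the thread


def tokens_per_year(total_tokens, years):
    """Average yearly input, assuming a constant rate."""
    return total_tokens / years


def tokens_per_day(total_tokens, years, days_per_year=DAYS_PER_YEAR):
    """Average daily input, assuming a constant rate."""
    return total_tokens / (years * days_per_year)


def years_to_match(corpus_tokens, yearly_rate):
    """Years of input at yearly_rate needed to equal corpus_tokens."""
    return corpus_tokens / yearly_rate


per_year = tokens_per_year(TOKENS_BY_AGE_THREE, YEARS)
per_day = tokens_per_day(TOKENS_BY_AGE_THREE, YEARS)
bnc_years = years_to_match(BNC_TOKENS, per_year)

print(f"tokens per year: {per_year:,.0f}")
print(f"tokens per day:  {per_day:,.0f}")
print(f"years of input matching the corpus: {bnc_years:.1f}")
```

Under these assumptions the cited rate works out to about 10 million tokens a year, or roughly 27,000 a day, so a 100-million-word corpus would correspond to about ten years of spoken input at that rate. The result only covers heard language in a professional household and says nothing about reading.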

    Paul Bennett

    > Does anybody know of studies that present estimates of how many
    > words (or sentences, or utterances, etc.) an "average" adult human
    > being hears and/or reads during a certain time span (days, months,
    > years, etc.)? I realize that this is problematic (what is a word? who
    > counts as "average adult"? in which language? etc.), but I would be
    > happy even with very rough ballpark estimates.
    >
    > I am interested in this because I would like to know to what extent a
    > corpus the size of the BNC (or even larger) can be seen (of course,
    > again, with all sorts of methodological caveats) as a surrogate for
    > the amount of linguistic input that the average adult human receives
    > in a certain period of time...
    >
    > I am aware of an estimate that fifth-graders read about 1M words per
    > year (quoted in Anglin: Vocabulary development: A morphological
    > analysis (Monographs of the Society for Research in Child
    > Development, 1993) -- I don't have the book with me right now, so I
    > could be wrong regarding the grade and/or the number of words...),
    > but I've found nothing about adults.


