Re: Corpora: What is a corpus

From: ramesh@clg.bham.ac.uk
Date: Sat Jan 29 2000 - 00:50:29 MET

    A belated comment:
    Following on from Lou's comments, which I generally agree with,
    isn't the point that one can apply filters at the *text* level,
    i.e. one can say `this corpus contains only 19th century English
    texts' or `this corpus contains only French Symbolist poetry', and
    then one has to explain why certain texts were included and others excluded
    from the corpus (i.e. what one means by `19th century English' or
    `French Symbolist poetry'), but as long as the purpose of a corpus is
    to study the linguistic features of a collection of texts, one cannot
    apply any constraints to the linguistic features themselves. Proverbs,
    like `past tense' or `imperatives', are a) in the eye of the beholder
    (is the use of the historic present an example of `past tense' or not?)
    and b) to be studied as they occur in authentic texts. A corpus is like
    a natural landscape in which one might look at the distribution of
    dandelions. A collection of proverbs is like a collection of dandelions
    in a botanist's laboratory. So a dictionary of proverbs would be the
    ideal resource for the original emailer, if he/she is not interested in
    the contexts and texts in which they occur, the purposes for which they are
    used, who uses them and when, etc. But of course, using a dictionary means
    that one is bound by the decisions made by someone else about what a
    proverb is, and which proverbs are worth including in the dictionary, etc.
    Ramesh

    Ramesh Krishnamurthy
    Corpus Research Group
    University of Birmingham


