Re: Corpora: frequency lists for clusters & MWU

Mike Scott (Mike.Scott@liverpool.ac.uk)
Thu, 08 Oct 1998 14:58:26 +0100

In response to Ted Dunning's point:

td> But what I was trying to say had more to do with the futility of using
td> such frequency sorted lists as generalizations. The features that I
td> pointed out and that you pointed out demonstrate exactly this point.
td> Essentially all of these points of interest are due *precisely* to the
td> specific nature of the text that I analysed. The fact that the
td> particular nature of the text I used is this prominent is a strong
td> argument *against* the general utility of such frequency sorted lists
td> of collocates.

Doesn't this beautifully illustrate the important difference between doing
Corpus Linguistics in order to generalise about "the language" (or even
"language") and a text-oriented CL where the idea is to find out about a
text or set of texts? The frequency sorted cluster lists typically enable
more useful text-generalisations than language-generalisations. There is
value in the latter (e.g. proof that "White+House", "Saudi+Arabia" behave
economically as one unit) but those (like me) seeking text-generalisations
are playing a different game, I think, so the prizes are quite different.
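
For anyone who wants to try this on their own texts, here is a minimal
sketch in Python of a frequency sorted list of two-word clusters. It is
only an illustration and not how any particular tool computes clusters:
the function name cluster_list, the crude tokeniser and the sample text
are all assumptions invented for the example.

```python
import re
from collections import Counter

def cluster_list(text, n=2):
    """Return (cluster, frequency) pairs for n-word clusters, most frequent first."""
    # Crude tokeniser (an assumption): lower-case runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    # Slide a window of n words over the text and count each cluster.
    clusters = Counter(
        "+".join(words[i:i + n]) for i in range(len(words) - n + 1)
    )
    # Sort by descending frequency, then alphabetically for a stable order.
    return sorted(clusters.items(), key=lambda item: (-item[1], item[0]))

# Invented sample text, for illustration only.
sample = ("The White House said today that the White House would respond. "
          "Saudi Arabia welcomed the White House statement.")

for cluster, freq in cluster_list(sample)[:3]:
    print(cluster, freq)
```

Run on a single text, the top of such a list mostly reflects what that
text happens to be about, which is the text-generalisation side of the
distinction; comparing lists across many different texts would be needed
before saying anything about the language.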
Best wishes -- Mike
******************************************
Mike Scott
Applied English Language Studies Unit
University of Liverpool, Liverpool L69 3BX
http://www.liv.ac.uk/~ms2928/homepage.html
http://www.liv.ac.uk/~ms2928/wordsmith/index.htm