Re: Corpora: Balancing Acts

Chris Tribble (ctribble@ikp.atm.com.pl)
Thu, 09 Oct 1997 15:04:14 +0200

Jem Clear's comments are correct as far as they go - it depends,
however, (as most things do) on what you are trying to do (and I
advisedly don't say "prove"). Clearly Jean Hudson has used some bogey
words - "control the balance within a large text corpus" - "a list of
the words that occur significantly more frequently within the sample
than they do within the language as a whole" so I am not surprised that
the ton of bricks is descending headwards. What I take a bit of
exception to is Jem's assertion that "Single word frequencies are
sort-of interesting as an obvious and preliminary investigation for any
corpus."

If you are involved in language teaching rather than lexicography,
single word lists from small selective corpora can be seriously useful -
look at the arguments in Guy Aston's PALC paper - or mine for that
matter (both available in HTML at Tim Johns's home page
http://sun1.bham.ac.uk/johnstf/homepage.htm - go to the bibliography).
My growing experience is that so long as you declare what you are
comparing with what, you can say some pedagogically useful things with
this sort of data. For example, using a wordlist from the one million
word written set in "Core" BNC (i.e. the future BNC Sampler) as a
reference list + Mike Scott's Keywords program (WordSmith Tools -
http://www.liv.ac.uk/~ms2928/homepage.html), I'm finding plenty of
things to say about a research corpus (112,000 words) of a specific
genre - and the conclusions you reach from this sort of study can be
useful for teaching purposes.
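
To make the comparison concrete, here is a minimal sketch of the general
idea: score each word in a small study corpus against a reference
wordlist and keep the ones that are relatively more frequent in the study
corpus. It uses a log-likelihood score as one common choice of keyness
statistic; this is an assumption for illustration and is not a
description of how the Keywords program works internally. The function
names, the tokenising rule and the min_freq cut-off are all hypothetical.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Lowercase the text and count simple alphabetic word tokens."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def log_likelihood(a, b, total_a, total_b):
    """Log-likelihood score for a word seen `a` times in the study corpus
    (size total_a) and `b` times in the reference corpus (size total_b)."""
    expected_a = total_a * (a + b) / (total_a + total_b)
    expected_b = total_b * (a + b) / (total_a + total_b)
    score = 0.0
    if a > 0:
        score += a * math.log(a / expected_a)
    if b > 0:
        score += b * math.log(b / expected_b)
    return 2 * score


def keywords(study, reference, min_freq=2):
    """Return (word, score) pairs, highest score first, for words that are
    relatively more frequent in the study counts than in the reference."""
    total_s = sum(study.values())
    total_r = sum(reference.values())
    scored = []
    for word, a in study.items():
        if a < min_freq:  # assumed cut-off to ignore one-off words
            continue
        b = reference.get(word, 0)
        if a / total_s > b / total_r:
            scored.append((word, log_likelihood(a, b, total_s, total_r)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

In use, you would build `study` from the genre corpus and `reference`
from the reference wordlist, then read down the top of the list returned
by `keywords(study, reference)`. As above, the result only means
something if you declare what is being compared with what.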

It may only be a drop in Jem's ocean - but it can still be an
interesting drop. There's more to corpus linguistics than lexicography.

Chris Tribble

-- 
POLAND          Warszawa, ul. Zelazna 67 m.19, Poland
                TEL +48 22 6245158 | FAX: +48 22 652 1806
UK              122, Queen Alexandra Mansions, Judd Street
                London WC1H 9DQ
                TEL +44 171 833 4271
UK Mailing      c/o The British Council: Poland, FCO (WARSAW) 
                King Charles Street, London SW1A 2AH
E-mail          Christopher_Tribble@compuserve.com
                ctribble@ikp.atm.com.pl
Home Page      
http://ourworld.compuserve.com/homepages/Christopher_Tribble