Re: Corpora: Corpus Linguistics User Needs

Patrick Juola (patrick.juola@psy.ox.ac.uk)
Wed, 29 Jul 1998 11:27:33 +0100

Geoffrey Sampson <geoffs@cogs.susx.ac.uk> wrote:


> I'm afraid my response risks sounding a little arrogant, but this is a point
> that has puzzled me for years. You are quite right to say that many
> corpus linguists do not know how to write programs, and rely on software
> produced by others which may not meet their needs. It has always seemed to
> me that the answer to a corpus linguist who sees this as a problem is
> "Learn to program, then". I have never understood why it has become socially
> acceptable for even quite junior academics to say "I can't program, someone
> else will have to do this for me", while they wouldn't dream of saying
> "I don't know how big library catalogue systems work, someone else will
> have to fetch my books".
>
> (In case anyone thinks "It's all very well for him to write that way, he is
> a computer specialist", perhaps I should mention that my first degree was
> in Chinese, mainly classical Chinese language, literature, Chinese history,
> etc., plus a little general linguistics. I decided to learn about computers
> as a graduate student because it was clear that they were destined to
> become useful tools in linguistics.)
>
> I believe this situation is not just a social oddity but is having unfortunate
> consequences for progress in corpus linguistics....

Funny, I don't understand how the Bodleian library works -- and someone
else *does* have to fetch my books for me.

Seriously, though. I don't believe that one needs to be able to program
to do corpus linguistics, any more than one needs to be able to program
in order to do "artificial neural network" research.
Yes, being able to program is a convenience -- and a necessity for those
of us who are pushing back the methodological frontiers. But I think
there's an existing body of methods and techniques that really need
only canned software, and that can be usefully and fruitfully applied
to new areas, languages, and corpora.

I think that telling people "learn how to program" is counterproductive.
If what we've discovered so far -- we, of course, being a largely
English-speaking and technically focused group -- is at all useful, then
we should be making our techniques available to people who aren't
technically inclined but who happen to have forgotten more about
classical Armenian than we will ever learn.

Patrick