RE: Corpora: Corpus Linguistics User Needs

Gregory Grefenstette (Gregory.Grefenstette@xrce.xerox.com)
Tue, 4 Aug 1998 15:54:05 +0200

> > From: Dave Moffat
> > I suppose we all have to share opinions in order to shape future
> > generations; we have to decide what to teach them.
> > That is the purpose of this debate I guess, so it is important.
>
> I've found the discussion of the value of linguists being programmers
> interesting, but I've seen only one post (from Bill Teahan) that actually
> addressed Oliver Mason's original question: what sorts of tools and
> capabilities do corpus linguists generally require from their corpus
> analysis tools? True, not all needs can be anticipated, but surely we can
> define a core set of requirements?
>

Benoît Habert has just published a book, in French,
about using generic Unix tools for corpus processing.

The reference is

@Book{habert-et-al98a,
author = {Benoît Habert and Cécile Fabre and Fabrice Issac},
title = {De l'écrit au numérique~: constituer, normaliser, exploiter
les corpus électroniques},
publisher = {InterEditions/Masson},
year = 1998,
series = {Informatiques},
address = {Paris}
}

The book comes with a CD-ROM containing a French corpus and
the code for all the programs in the book.

--Gregory Grefenstette

See also his recent book:

@Book{habert-et-al97c,
author = {Benoît Habert and Adeline Nazarenko and André Salem},
title = {Les linguistiques de corpus},
publisher = {Armand Colin/Masson},
year = 1997,
series = {U Linguistique},
address = {Paris}
}

____________________________________________________________________________
Gregory Grefenstette, Multilingual Theory and Technology
Xerox Research Centre Europe
6 chemin de Maupertuis, 38240 Meylan, France
Gregory.Grefenstette@xrce.xerox.com
Phone : (33) 4 76 61 50 82 fax : (33) 4 76 61 50 99
Inside France: 04-76-61-50-82