Benoît Habert has just published a book, in French,
about using generic Unix tools for corpus processing.
The reference is:
@Book{habert-et-al98a,
author = {Benoît Habert and Cécile Fabre and Fabrice Issac},
title = {De l'écrit au numérique~: constituer, normaliser, exploiter
les corpus électroniques},
publisher = {InterEditions/Masson},
year = 1998,
series = {Informatiques},
address = {Paris}
}
The book comes with a CD-ROM containing a French corpus and
the code for all the programs in the book.
--Gregory Grefenstette
See also his recent book:
@Book{habert-et-al97c,
author = {Benoît Habert and Adeline Nazarenko and André Salem},
title = {Les linguistiques de corpus},
publisher = {Armand Colin/Masson},
year = 1997,
series = {U Linguistique},
address = {Paris}
}
____________________________________________________________________________
Gregory Grefenstette, Multilingual Theory and Technology
Xerox Research Centre Europe
6 chemin de Maupertuis, 38240 Meylan, France
Gregory.Grefenstette@xrce.xerox.com
Phone : (33) 4 76 61 50 82 fax : (33) 4 76 61 50 99
Inside France: 04-76-61-50-82