Summary: Corpora: French stop words?

Noemi Preissner (noemi@CoLi.Uni-SB.DE)
Tue, 16 Feb 1999 18:05:21 +0100 (MET)


Now that everyone has received Jean Veronis' French stop-list (which is
A LOT bigger than the others I was given!), I would like to summarize the
other reactions to my posting.

My request for a stop-list in order to create a list of words that can't
appear at the ends of sentences was probably rather misleading. My idea was
(and still is) that most (forms of) nouns, verbs, adjectives and adverbs
probably CAN appear at the ends of French sentences, and that most words that
cannot do so belong to other categories (closed class - at least in my
preferred definition of closed class!). So what I was REALLY looking for was
probably a list of closed class words (I remember mentioning it in brackets).
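
The idea above can also be approximated from a corpus. This is a minimal
sketch, not the method used for this research: it assumes the corpus is
already split into sentences and tokens, and the function name and the toy
sentences are hypothetical.

```python
def never_sentence_final(sentences):
    """Return the words that occur in the corpus but never end a sentence."""
    vocabulary = set()
    final_words = set()
    for tokens in sentences:
        # Ignore punctuation-only tokens and compare words in lowercase.
        words = [t.lower() for t in tokens if any(c.isalpha() for c in t)]
        if not words:
            continue
        vocabulary.update(words)
        final_words.add(words[-1])
    return vocabulary - final_words


# Hypothetical toy corpus: one list of tokens per sentence.
toy_corpus = [
    ["Le", "chat", "dort", "."],
    ["Je", "vois", "le", "chat", "de", "Marie", "."],
    ["Il", "dort", "et", "elle", "lit", "."],
]

print(sorted(never_sentence_final(toy_corpus)))
```

On a corpus this small the result also contains open class words such as
"chat" and "vois", simply because they happen not to end any sentence. That
is why a list of closed class words is the more reliable starting point.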

On the other hand, most closed class words should probably be contained
in a stop-list since they carry little information about content, but I
know that there are lots of counter-examples among existing stop-lists ...

In addition to his stop-list, Jean Veronis
allowed me to use the French Multext dictionary for my research, so
I can extract the categories I am interested in directly from it ...
(By the way, that's what I did to create my German and English stop-lists:
I exploited some German and English lexical databases ... I was lacking one
for French though ...)
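
Extracting closed class words from a lexicon by category could look roughly
like the sketch below. Everything about the input is an assumption: one entry
per line in the form wordform TAB lemma TAB tag, with the first character of
the tag naming the part of speech. The tags in the sample are simplified
placeholders, not actual Multext codes, and the set of open class codes must
be adapted to the real tagset.

```python
import io

# ASSUMPTION: first tag character for nouns, verbs, adjectives and adverbs.
OPEN_CLASS_CODES = {"N", "V", "A", "R"}


def closed_class_words(lines, open_codes=OPEN_CLASS_CODES, exclude_ambiguous=False):
    """Collect word forms that have at least one closed class reading."""
    open_forms = set()
    closed_forms = set()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 3 or not fields[2]:
            continue  # skip malformed or empty entries
        wordform, _lemma, tag = fields
        target = open_forms if tag[0] in open_codes else closed_forms
        target.add(wordform.lower())
    if exclude_ambiguous:
        # Drop forms that also have an open class reading.
        return closed_forms - open_forms
    return closed_forms


# Hypothetical sample lexicon with placeholder tags.
sample = io.StringIO(
    "le\tle\tD\n"
    "chat\tchat\tN\n"
    "dort\tdormir\tV\n"
    "de\tde\tS\n"
    "et\tet\tC\n"
    "la\tle\tD\n"
    "la\tla\tN\n"
)

print(sorted(closed_class_words(sample)))
```

Forms such as "la" (determiner, but also a noun) have both kinds of reading;
the exclude_ambiguous switch lets you decide whether to keep them.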

Patrice Bonhomme has a collection of stop-lists
for English, French and German available at
Apart from stop words you will find lists of word frequencies there. Those
lists have been created on the basis of the corpora at the Silfide server

Ralf Steinberger sent me another list of 68 French
stop words ...

Finally, Antoinette Renouf suggested contacting Max
Silberztein or Jean Senellart at the Laboratoire d'Automatique Documentaire
et Linguistique (LADL), University Paris VII, which I did not do, though,
since I already had enough material!

Thanks a lot to all of them!