At 10:19 11-5-2004, you wrote:
>At 09:24 11/05/2004, Murk Wuite wrote:
>>Dear all,
>>Does anyone know of a tool (or algorithm), preferably available freely
>>for research purposes, that takes as its input a corpus only and
>>produces as its output clusters of tokens that occur close to each other
>>relatively often?
>It is possible that the document clustering toolkit CLUTO fits your
>needs, perhaps with some adaptation.
WordSmith Tools (not free) has a Cluster function which takes a corpus and
outputs word clusters based on co-occurrence statistics.
Version 4, while still in beta, can be used freely for about a month.
WordSmith can also be used with annotated corpora (it can ignore or use tags).
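To give a rough idea of what such a cluster function does, here is a minimal Python sketch. It is only an assumption about the general approach (counting repeated contiguous word sequences), not WordSmith's actual algorithm; the function name word_clusters and its parameters are made up for illustration.

```python
import re
from collections import Counter


def word_clusters(text, size=3, min_freq=2):
    """Return (cluster, count) pairs for contiguous word sequences of
    length `size` that occur at least `min_freq` times in `text`.

    Hypothetical helper for illustration only; real tools apply more
    refined tokenization and statistics.
    """
    # Crude tokenization: lowercase, keep runs of word characters.
    tokens = re.findall(r"\w+", text.lower())
    # Count every contiguous window of `size` tokens.
    counts = Counter(
        tuple(tokens[i:i + size]) for i in range(len(tokens) - size + 1)
    )
    # Keep only the sequences that recur often enough, most frequent first.
    return [
        (" ".join(cluster), n)
        for cluster, n in counts.most_common()
        if n >= min_freq
    ]


if __name__ == "__main__":
    sample = "the cat sat on the mat and the cat sat on the rug"
    for cluster, n in word_clusters(sample):
        print(n, cluster)
```

Raw frequency is the simplest possible criterion; an association measure such as mutual information could be substituted if frequent function-word sequences dominate the output.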
The freeware AntConc program has a similar function for outputting word
clusters.
And here's a further list of links to some similar programs:
Hope this helps,
Maarten Jansonius
Université catholique de Louvain
Collège Erasme, C468
010 / 47.49.73