Dear Linda,
There was a thread about near-duplicate detection on the list in late
December/early January; perhaps there is also something useful for your
problem there.
In particular, Marc Kupietz made his tool for near-duplicate detection
available:
http://torvald.aksis.uib.no/corpora/2004-3/0374.html
We also have a tool that we hope to be able to make available in a week
or so (it requires MySQL, and I'm not sure it would run on any platform
but Linux...)
Best regards,
Marco