Re: [Corpora-List] token clustering tool

From: Hal Daume III (hdaume@ISI.EDU)
Date: Tue May 11 2004 - 15:58:19 MET DST

Next message: Ed Zwart: "[Corpora-List] annotated email corpora"

Previous message: Tony Berber Sardinha: "Re: [Corpora-List] token clustering tool"
In reply to: Tony Berber Sardinha: "Re: [Corpora-List] token clustering tool"
Next in thread: Normand Peladeau: "Re: [Corpora-List] token clustering tool"

Also,

http://www.isi.edu/~och/mkcls.html

works quite well.

On Tue, 11 May 2004, Tony Berber Sardinha wrote:

> Hi Murk
>
> (1) Simple chunker:
> -First, upload your corpus at http://lael.pucsp.br/corpora/enviar and obtain a
> password
> -Then go to http://lael.pucsp.br/corpora/ngrama/index.html, enter your password
> and cluster size, click on Fazer
> -See results
> (2) N-gram Statistics Package v.0.5 (by Ted Pedersen and Satanjeev Banerjee)
> -First, upload your corpus at http://lael.pucsp.br/corpora/enviar and obtain a
> password
> -Go to http://lael.pucsp.br/corpora/nsp/index.html, enter your password and
> other options, click on Fazer
> -See results
>
> If you're on Linux / Mac OSX / Unix / Cygwin I can send you a simple Unix Shell
> script for that.
>
> cheers
> tony.
> -------------------------------------
> Dr Tony Berber Sardinha
> LAEL, PUC/SP
> (Catholic University of Sao Paulo, Brazil)
> tony4@uol.com.br
> http://lael.pucsp.br/~tony
> [New website]
>
> ----- Original Message -----
> From: "Murk Wuite" <Murk@polderland.nl>
> To: <CORPORA@HD.UIB.NO>
> Sent: Tuesday, May 11, 2004 04:24
> Subject: [Corpora-List] token clustering tool
>
>
> Dear all,
>
> Does anyone know of a tool (or algorithm), preferably available freely
> for research purposes, that takes as its input a corpus only and
> produces as its output clusters of tokens that occur close to each other
> relatively often?
>
> Best wishes,
>
> Murk Wuite
> MA student at the Department of Language and Speech, Katholieke
> Universiteit Nijmegen, The Netherlands
>
>
>
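
For readers following the thread, here is a minimal sketch of one simple reading of "tokens that occur close to each other relatively often": count contiguous n-grams and keep the ones that recur. This is not the method used by mkcls, the chunker, or the N-gram Statistics Package mentioned above; the function name `frequent_ngrams` and its parameters `n` and `min_count` are hypothetical, and the toy corpus is made up for illustration.

```python
from collections import Counter


def frequent_ngrams(tokens, n=2, min_count=2):
    """Return (ngram, count) pairs for contiguous n-grams of length n
    that occur at least min_count times, most frequent first.

    Illustrative assumption: "close to each other" is taken to mean
    strictly adjacent tokens, which is only one possible definition.
    """
    # Slide a window of width n over the token list and count each window.
    counts = Counter(
        tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )
    # Keep only the n-grams that recur often enough.
    return [(gram, c) for gram, c in counts.most_common() if c >= min_count]


# Toy corpus, whitespace-tokenized (an assumption; real corpora need
# a proper tokenizer).
corpus = "the cat sat on the mat and the cat sat on the rug".split()

for gram, count in frequent_ngrams(corpus, n=2, min_count=2):
    print(" ".join(gram), count)
```

Raw frequency favours sequences made of common words, so in practice one would likely rank candidates with an association measure instead of a plain count threshold.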

-- 
 Hal Daume III                                   | hdaume@isi.edu
 "Arrest this man, he talks in maths."           | www.isi.edu/~hdaume

