[Corpora-List] Discover Word Meanings with SenseClusters!

From: tpederse@d.umn.edu
Date: Sun Jan 04 2004 - 22:25:10 MET

    We are pleased to announce the release of SenseClusters, a free software
    package that performs unsupervised discovery of word senses by clustering
    together instances of a word (or words) that are used in similar contexts
    in raw text. It supports a wide range of clustering techniques based on
    both context vectors and similarity matrices.
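
    To make the idea of context vectors and similarity matrices concrete,
    here is a minimal sketch in Python. It does not use SenseClusters or its
    interfaces; the function names, the stop-word list, the window size, and
    the example sentences are all assumptions made for illustration. Each
    instance of the target word becomes a bag of nearby words, and pairs of
    instances are compared by cosine similarity.

```python
import math
from collections import Counter

# Hypothetical stop-word list, standing in for a real feature-selection step.
STOP_WORDS = {"the", "of", "on", "at", "was", "he", "she"}


def context_vector(tokens, target, window=3):
    """Count non-stop words within `window` positions of each `target` token."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i and tokens[j] not in STOP_WORDS:
                counts[tokens[j]] += 1
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_matrix(vectors):
    """Pairwise cosine similarities between all context vectors."""
    return [[cosine(u, v) for v in vectors] for u in vectors]


# Four made-up instances of the ambiguous word "bank".
instances = [
    "he sat on the bank of the river".split(),
    "the river bank was muddy".split(),
    "she deposited money at the bank".split(),
    "the bank approved the money loan".split(),
]
vectors = [context_vector(toks, "bank") for toks in instances]
sim = similarity_matrix(vectors)
```

    In this toy data the two "river" instances share context words with each
    other but not with the two "money" instances, which is the signal a
    clustering step would then exploit.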

    SenseClusters is flexible, and can be used in any application that
    requires clustering of similar instances of text. Examples could include
    word sense discrimination, synonymy identification, text classification,
    and summarization. It can also be used to implement models such as Latent
    Semantic Analysis (LSA).

    SenseClusters takes a user through the entire process of unsupervised
    learning of word senses, including text preprocessing, feature selection,
    context vector and similarity matrix construction, dimensionality
    reduction via singular value decomposition (SVD), and clustering via both
    agglomerative and partitional algorithms.
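
    As a rough illustration of the final clustering stage only, the sketch
    below runs average-link agglomerative clustering over a small similarity
    matrix in plain Python. It is not how SenseClusters or Cluto implement
    clustering; the function name, the choice of average link, and the
    hard-coded matrix are assumptions, and the preprocessing and SVD stages
    are omitted.

```python
def agglomerative(sim, num_clusters):
    """Average-link agglomerative clustering over a similarity matrix.

    `sim` is a square list of lists; returns a list of clusters, each a
    sorted list of instance indices.
    """
    clusters = [[i] for i in range(len(sim))]

    def avg_sim(a, b):
        # Mean pairwise similarity between members of two clusters.
        return sum(sim[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > num_clusters:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                s = avg_sim(clusters[x], clusters[y])
                if best is None or s > best[0]:
                    best = (s, x, y)
        _, x, y = best
        # Merge the most similar pair and drop the absorbed cluster.
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]


# Hypothetical similarities between four instances of one word:
# instances 0 and 1 resemble each other, as do instances 2 and 3.
sim = [
    [1.0, 0.5, 0.0, 0.0],
    [0.5, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.7],
    [0.0, 0.0, 0.7, 1.0],
]
clusters = agglomerative(sim, 2)
```

    Each resulting cluster groups instances used in similar contexts, and so
    can be read as one discovered sense of the word.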

    SenseClusters provides a great deal of native functionality and also
    offers seamless interfaces to a number of powerful tools, including
    Cluto (a clustering toolkit), SVDPACKC (which carries out singular
    value decomposition), and the Ngram Statistics Package.

    For general information please visit:

    For immediate download of the first public release (0.47) please visit:

    This is an active project, and the principal designer and lead developer
    (Amruta Purandare, pura0010@d.umn.edu) and I would be delighted to hear
    any comments, requests, or even bug reports that you might have. You can
    see some of our future plans in our Todo list, which is distributed with
    the package.

    Ted and Amruta

    PS To subscribe to the SenseClusters mailing lists, visit:

    http://lists.sourceforge.net/lists/listinfo/senseclusters-users (discussion)
    http://lists.sourceforge.net/lists/listinfo/senseclusters-news (announcements)

    # Ted Pedersen                              http://www.umn.edu/~tpederse #
    # Department of Computer Science                        tpederse@umn.edu #
    # University of Minnesota, Duluth                                        #
    # Duluth, MN 55812                                        (218) 726-8770 #
