Re: [Corpora-List] Automatic categorization of words.

From: Cyrus Shaoul (cyrus.shaoul@ualberta.ca)
Date: Thu Mar 10 2005 - 04:50:09 MET

    Hi Again Listers,

    Thanks for all your ideas and suggestions. For the benefit of the list,
    here is a short summary of what I learned.

    *****
    Some people felt that WordNet might have some relevance. For example,
    a word could be treated as concrete if 'object, physical object' is in
    its hypernymy tree, provided the corpus is annotated with word sense
    disambiguation (WSD) info.
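
    The WordNet suggestion can be sketched roughly as below. This is only
    an illustration: the hypernym table and sense labels are made up for
    the example, not real WordNet data, and the helper names are
    hypothetical. With real data the same ancestor test would run over the
    hypernym links of the sense chosen by WSD.

```python
# Toy hypernym table: each sense maps to its direct hypernyms.
# These sense labels and links are illustrative, NOT real WordNet data.
TOY_HYPERNYMS = {
    "dog#1": ["animal#1"],
    "animal#1": ["organism#1"],
    "organism#1": ["physical object#1"],
    "physical object#1": ["entity#1"],
    "idea#1": ["cognition#1"],
    "cognition#1": ["abstraction#1"],
    "abstraction#1": ["entity#1"],
    "entity#1": [],
}


def hypernym_closure(sense, table):
    """Return every ancestor of `sense` reachable through hypernym links."""
    seen = set()
    stack = list(table.get(sense, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(table.get(node, []))
    return seen


def is_physical(sense, table=TOY_HYPERNYMS, marker="physical object#1"):
    """True if the marker sense is among the ancestors of `sense`."""
    return marker in hypernym_closure(sense, table)


print(is_physical("dog#1"))   # True in this toy table
print(is_physical("idea#1"))  # False in this toy table
```

    Note that the test works on senses, not word forms, which is why the
    WSD annotation matters: an ambiguous word may have one sense under
    'physical object' and another that is not.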

    *****

    Also, there was a pointer to the paper:

    D. Freitag, "Toward Unsupervised Whole-Corpus Tagging," Proceedings of
    Coling 2004.

    and http://clg.wlv.ac.uk/demos/similarity/

    and suggestions to use clustering to classify words, based on some
    examples of concrete and abstract nouns. See also a tool called "WEKA".

    *****

    There was the idea to look at the Regressive Imagery Dictionary at:

             http://www.simstat.com/WordStat/RID.htm

    And a pointer to:

    Martindale, C. (1990). The clockwork muse: The predictability of
    artistic change. New York: Basic Books.

    *****

    Finally, Dominic Widdows pointed out to the list, quite correctly, that
    very few words are purely abstract or purely concrete, and that any
    dichotomous classification is bound to have many problems.

    *****

    I should refine my question: I would like to rate words using a
    continuous measure of "concreteness" or "abstractness", not classify or
    categorize them. (I wish I had said this in my original message!!! Can't
    take back those electrons...)

    As one possibility, I will look into clustering and then measuring the
    relative proximity of each word to the clusters.
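
    A minimal sketch of that idea, under loud assumptions: the word
    vectors, the seed lists and the scoring formula below are all invented
    for illustration. Real vectors would come from corpus co-occurrence
    counts, and the "clusters" here are simply the centroids of a few
    hand-picked concrete and abstract seed words.

```python
import math

# Hypothetical 3-dimensional context vectors; real ones would be
# derived from corpus co-occurrence counts.
VECTORS = {
    "table": [0.9, 0.1, 0.0],
    "rock": [0.8, 0.2, 0.1],
    "truth": [0.1, 0.9, 0.2],
    "justice": [0.0, 0.8, 0.3],
    "music": [0.5, 0.5, 0.1],
}
CONCRETE_SEEDS = ["table", "rock"]     # assumed seed words
ABSTRACT_SEEDS = ["truth", "justice"]  # assumed seed words


def centroid(words):
    """Component-wise mean of the vectors of `words`."""
    vecs = [VECTORS[w] for w in words]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def concreteness(word):
    """Similarity to the concrete centroid minus similarity to the
    abstract centroid. Positive values lean concrete, negative values
    lean abstract, values near zero are in between."""
    v = VECTORS[word]
    return (cosine(v, centroid(CONCRETE_SEEDS))
            - cosine(v, centroid(ABSTRACT_SEEDS)))


# Rank all words from most concrete to most abstract.
for w in sorted(VECTORS, key=concreteness, reverse=True):
    print(f"{w:8s} {concreteness(w):+.3f}")
```

    The point of the difference score is that it gives a continuous
    rating rather than a class label, so a word like "music" can land
    between the two poles instead of being forced into one of them.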

    Thanks to all for your time and help. Hope I can do the same for you one
    day.

    Cyrus


