[Corpora-List] dictionary definitions to glosses

From: Mike Maxwell (maxwell@ldc.upenn.edu)
Date: Mon Dec 08 2003 - 21:16:43 MET

    Has anyone seen any work on reducing dictionary-style definitions to
    simple(r) glosses?

    For example, the definition

        act or process of shrinking, esp in wood; shrinkage.

    might reduce to 'shrinkage', and

        bother; disturbance or interruption.

    might similarly reduce to any one of the three content words. In some
    cases, more than one word might be output:

        to carry a canoe

    should probably reduce to 'carry canoe', not just 'carry' or 'canoe'.

    I can think of some heuristics, e.g. choose the least common word (in some
    sense of 'common'), but if the chosen word is the object of a verb, retain
    the verb also. (Which requires some parsing--fortunately, verbs in English
    definitions are usually preceded by the word 'to', I suspect, so
    distinguishing verbs from nouns should not be all that difficult.)
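
    A minimal sketch of that heuristic in Python, purely as an illustration.
    The frequency counts, the stopword list and the function name gloss()
    are all made up for the example; a real system would take its counts
    from a corpus, and treating the first content word after 'to' as the
    verb is only the rough approximation suggested above, not real parsing.

```python
import re

# Hypothetical corpus counts (higher = more common); invented for this example.
TOY_COUNTS = {
    "act": 900, "process": 800, "shrinking": 30, "wood": 300, "shrinkage": 10,
    "bother": 120, "disturbance": 60, "interruption": 40,
    "carry": 500, "canoe": 20,
}

# Small illustrative list of function words and abbreviations to ignore.
STOPWORDS = {"a", "an", "the", "of", "or", "and", "in", "to", "esp"}


def gloss(definition, counts):
    """Reduce a dictionary-style definition to a one- or two-word gloss."""
    tokens = re.findall(r"[a-z]+", definition.lower())
    content = [t for t in tokens if t not in STOPWORDS]
    if not content:
        return ""

    # "Least common" here means the lowest count; unseen words count as 0.
    rarest = min(content, key=lambda w: counts.get(w, 0))

    # Assumption: a definition starting with 'to' is a verb definition,
    # and its first content word is the verb.
    if tokens[0] == "to":
        verb = content[0]
        if rarest != verb:
            # The chosen word is probably the object, so keep the verb too.
            return verb + " " + rarest
    return rarest


if __name__ == "__main__":
    for d in ("act or process of shrinking, esp in wood; shrinkage.",
              "bother; disturbance or interruption.",
              "to carry a canoe"):
        print(d, "->", gloss(d, TOY_COUNTS))
```

    With these toy counts the three definitions above come out as
    'shrinkage', 'interruption' and 'carry canoe'. Which of the three
    words is picked for the second definition depends entirely on the
    counts used.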

    I suppose this may be related to text summarization work.

        Mike Maxwell
        LDC
        maxwell@ldc.upenn.edu