Re: [Corpora-List] agent and patient probabilities

From: P Resnik (psresnik@gmail.com)
Date: Wed Jan 24 2007 - 00:00:40 MET

    I worked on this topic some time back (wow, it's *quite* some time back now,
    sheesh...) developing a computational model of verb-argument selectional
    preferences and validating it using comparisons with human ratings of just
    the kind you describe:

    Philip Resnik, "Selectional Constraints: An Information-Theoretic Model and
    its Computational Realization", Cognition, 61:127-159, November 1996.
    http://scholar.google.com/scholar?q=author:%22Resnik%22%20intitle:%22Selectional%20constraints:%20an%20information-theoretic%20model%20and%20its%20computational%20realization%22
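
    As a rough illustration of the model in that paper: the selectional
    preference strength of a verb v is the relative entropy between P(c|v)
    and P(c) over argument classes c, and the selectional association of v
    with c is that class's share of the total. The sketch below is a
    simplification, not the paper's full procedure. It assumes each observed
    argument has already been mapped to exactly one class (the paper
    distributes counts over WordNet classes), and the function name and toy
    counts are hypothetical.

```python
import math
from collections import Counter, defaultdict

def selectional_association(pairs):
    """Compute preference strength S(v) and association A(v, c).

    pairs: iterable of (verb, argument_class) observations.
    Assumption: one class per observation; no smoothing is applied.
    """
    pairs = list(pairs)
    total = len(pairs)
    class_counts = Counter(c for _, c in pairs)          # for P(c)
    by_verb = defaultdict(Counter)                       # for P(c|v)
    for v, c in pairs:
        by_verb[v][c] += 1

    strength, association = {}, {}
    for v, counts in by_verb.items():
        n_v = sum(counts.values())
        # Per-class contribution: P(c|v) * log2(P(c|v) / P(c))
        contrib = {}
        for c, n in counts.items():
            p_c_given_v = n / n_v
            p_c = class_counts[c] / total
            contrib[c] = p_c_given_v * math.log2(p_c_given_v / p_c)
        s = sum(contrib.values())                        # S(v), a KL divergence
        strength[v] = s
        # A(v, c) = contribution / S(v); defined as 0 when S(v) is 0
        association[v] = {c: (x / s if s > 0 else 0.0)
                          for c, x in contrib.items()}
    return strength, association

# Hypothetical toy counts, for illustration only.
toy = ([("eat", "food")] * 9 + [("eat", "person")] * 1 +
       [("see", "food")] * 5 + [("see", "person")] * 5)
strength, association = selectional_association(toy)
```

    With these toy counts, "eat" comes out with a higher preference strength
    than "see", and its association with "food" is positive while its
    association with "person" is negative.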

    The model should be very straightforward to implement, and, in fact, I can
    probably dig up some old tgrep query expressions that will allow you to pull
    out verb-object and verb-subject triples from Penn Treebank constituency
    trees in order to estimate the model parameters.
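
    For readers without tgrep, the extraction step can be approximated in a
    few lines of Python. This is a minimal sketch under stated assumptions,
    not the tgrep queries mentioned above: it reads one Penn-style bracketed
    tree, takes the NP and VP children of an S as subject and predicate,
    takes an NP sister of a verb inside a VP as object, and uses a crude
    "last noun in the NP" head rule. It does not lemmatize, and it does not
    treat passives specially, so the subject of "was walked" is still
    reported as a subject. All function names are hypothetical.

```python
def parse_tree(text):
    """Parse one bracketed tree into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        pos += 1                                  # skip "("
        if tokens[pos] == "(":                    # unlabeled outer wrapper
            label = ""
        else:
            label = tokens[pos]
            pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())
            else:
                children.append(tokens[pos])      # a word (leaf)
                pos += 1
        pos += 1                                  # skip ")"
        return (label, children)

    return read()

def base(label):
    """Strip function tags: NP-SBJ -> NP."""
    return label.split("-")[0]

def subtrees(node):
    return [c for c in node[1] if isinstance(c, tuple)]

def word_of(node):
    """Lower-cased word under a part-of-speech node, else None."""
    kids = node[1]
    return kids[0].lower() if kids and isinstance(kids[0], str) else None

def head_noun(np):
    """Crude head rule: last NN* directly under the NP, else first inner NP."""
    head = None
    for k in subtrees(np):
        if k[0].startswith("NN") and word_of(k):
            head = word_of(k)
    if head is None:
        for k in subtrees(np):
            if base(k[0]) == "NP":
                return head_noun(k)
    return head

def main_verb(vp):
    """Descend through nested VPs (auxiliaries), then take the first VB*."""
    for k in subtrees(vp):
        if base(k[0]) == "VP":
            return main_verb(k)
    for k in subtrees(vp):
        if k[0].startswith("VB") and word_of(k):
            return word_of(k)
    return None

def extract(tree, out=None):
    """Collect (relation, verb, noun) triples from the whole tree."""
    out = [] if out is None else out
    kids = subtrees(tree)
    np = next((k for k in kids if base(k[0]) == "NP"), None)
    if base(tree[0]) == "S":
        vp = next((k for k in kids if base(k[0]) == "VP"), None)
        if np and vp:
            v, n = main_verb(vp), head_noun(np)
            if v and n:
                out.append(("subj", v, n))
    elif base(tree[0]) == "VP" and np:
        v = next((word_of(k) for k in kids if k[0].startswith("VB")), None)
        n = head_noun(np)
        if v and n:
            out.append(("obj", v, n))
    for k in kids:
        extract(k, out)
    return out

example = "( (S (NP-SBJ (DT the) (NN dog)) (VP (VBD chased) (NP (DT the) (NN cat))) (. .)) )"
triples = extract(parse_tree(example))
```

    The resulting (relation, verb, noun) triples are the raw counts from
    which the model parameters can be estimated.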

    This thread has been followed in various directions in the subsequent
    computational linguistics literature. Two of the most useful references
    (esp. for pointers to related work) might be

    Marc Light and Warren Greiff, Statistical models for the induction and use
    of selectional preferences, Cognitive Science, 2002, Vol. 26, No. 3, Pages
    269-281
    http://www.mitre.org/work/tech_papers/tech_papers_02/greiff_statistical/greiff_statistical.pdf

    and

    Brockmann, C. and Lapata, M. 2003. Evaluating and combining approaches to
    selectional preference acquisition. In *Proceedings of the Tenth Conference
    on European Chapter of the Association For Computational Linguistics -
    Volume 1* (Budapest, Hungary, April 12 - 17, 2003). European Chapter Meeting
    of the ACL. Association for Computational Linguistics, Morristown, NJ,
    27-34.
    http://portal.acm.org/citation.cfm?id=1067813&dl=GUIDE,

    Hope this is helpful,

      Philip

    On 1/22/07, Jim Magnuson <james.magnuson@uconn.edu> wrote:
    >
    > I'm a psycholinguist rather than a computational linguist, with a
    > "newbie" question.
    >
    > For some experiments, we need agent-verb-patient triples where the
    > "goodness" of the agents and patients for the verb varies in strength.
    > The typical way to develop materials for such studies is to have human
    > subjects rate how "good" various items are as agents and patients for
    > particular verbs (e.g., "how likely is a dog to walk?", "how likely
    > is a dog to be walked?"). While this works well, it's of course very
    > labor (and subject) intensive. So I'm hoping to automate this.
    >
    > I'm looking for recommendations for parsed corpora and tools to use
    > (with the goal of getting this going ASAP).
    >
    > I know about the Penn Treebank; are there better and/or less
    > expensive options for US English, or is this just the way to go?
    >
    > I'm an okay perl programmer, and computer savvy; are there tools that
    > would be helpful?
    >
    > Thanks very much,
    >
    > jim
    >
    >
    >


