[Corpora-List] The Second PASCAL RTE Challenge - Call for Participation

From: Roy Bar-Haim (barhair@cs.biu.ac.il)
Date: Fri Oct 28 2005 - 01:52:10 MET DST


    ************************************************************************
                                The Second PASCAL
                    Recognising Textual Entailment Challenge

                              Call for Participation

                  http://www.pascal-network.org/Challenges/RTE2
    ************************************************************************
    A fundamental phenomenon of natural language is the variability of
    semantic expression, where the same meaning can be expressed by
    or inferred from different texts. Many natural language processing
    applications, such as Question Answering (QA), Information Retrieval
    (IR), Information Extraction (IE), and (multi-)document summarization,
    need to model this variability in order to recognize that a particular
    target meaning can be inferred from different text variants. Even though
    many applications face similar underlying semantic problems, these
    problems are usually addressed in an application-oriented manner.

    Textual Entailment Recognition was proposed recently as a generic task
    and evaluation framework that captures major semantic inference needs
    across natural language processing applications. The current challenge
    considers an applied notion of textual entailment, defined as a
    directional relation between two text fragments, termed T – the
    entailing text, and H – the entailed text. We say that T entails H if,
    typically, a human reading T would infer that H is most likely true (see
    examples below). This operational definition is based on (and assumes)
    common human understanding of language as well as common background
    knowledge.

    The last two years have seen rapidly growing interest in textual
    entailment within the natural language processing community.
    The First PASCAL Recognising Textual Entailment (RTE) Challenge
    provided the first benchmark for evaluating entailment systems. The
    challenge drew noticeable attention in the research community,
    attracting 17 submissions from diverse groups. The relatively low
    accuracy achieved by the participating systems suggests that the
    entailment task is indeed a challenging one, with ample room for
    improvement. It was followed by an ACL 2005 Workshop on Empirical
    Modeling of Semantic Equivalence and Entailment. The challenge and its
    dataset motivated further research on empirical entailment, which
    resulted in a number of publications in recent main conferences as well
    as the inclusion of this topic in some recent calls for papers.

    By introducing a second challenge we hope to keep the momentum going,
    and to further promote the formation of a research community around the
    applied entailment task. As in the previous challenge, the main task is
    judging whether a hypothesis (H) is entailed by a text (T). One of the
    main goals for the RTE-2 dataset is to provide more "realistic"
    text-hypothesis examples, based mostly on outputs of actual systems. We
    focus on the four application settings mentioned above: QA, IR, IE and
    multi-document summarization. Each portion of the dataset includes
    typical T-H examples that correspond to success and failure cases of such
    applications. The examples represent different levels of entailment
    reasoning, such as lexical, syntactic, morphological and logical.
    The data collection procedure for each application setting can be found
    on the challenge website. The development subset, which represents the
    different types of test examples, is released first, but systems are
    likely to use external and unsupervised knowledge resources as well.
    The development set consists of 800 examples, 200 for each application
    setting. The test set will contain 1000-1200 examples. To make the
    challenge data more accessible, we also provide some pre-processing for
    the text and hypothesis, including sentence splitting and dependency
    parsing.

    RTE-2 is organized by Bar-Ilan University (Israel), CELCT (Trento,
    Italy), Microsoft Research (USA) and MITRE (USA). Data collection and
    annotation processes have been improved this year, including
    cross-annotation of the examples across the organizing sites.

    T-H Examples:
    -------------

    Text: The drugs that slow down or halt Alzheimer's disease work best
    the earlier you administer them.

    Hypothesis: Alzheimer's disease is treated using drugs.

    Entailment: Yes

    * * *

    Text: Drew Walker, NHS Tayside's public health director, said: "It is
    important to stress that this is not a confirmed case of rabies."

    Hypothesis: A case of rabies was confirmed.

    Entailment: No

    * * *

    Text: Yoko Ono unveiled a bronze statue of her late husband, John
    Lennon, to complete the official renaming of England's Liverpool
    Airport as Liverpool John Lennon Airport.

    Hypothesis: Yoko Ono is John Lennon's widow.

    Entailment: Yes

    * * *

    Text: Arabic, for example, is used densely across North Africa
    and from the Eastern Mediterranean to the Philippines, as the key
    language of the Arab world and the primary vehicle of Islam.

    Hypothesis: Arabic is the primary language of the Philippines.

    Entailment: No

    * * *
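
    As a rough illustration of the task format, the four examples above can
    be written as text-hypothesis pairs and scored by accuracy against the
    gold judgements. This is only a sketch: the names THPair, accuracy and
    always_yes are hypothetical, and the official data format and evaluation
    procedure are those described on the challenge website, not the ones
    assumed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class THPair:
    """One text-hypothesis pair with its gold entailment judgement."""
    text: str
    hypothesis: str
    entails: bool  # True for "Entailment: Yes", False for "Entailment: No"


# The four example pairs quoted in this announcement.
EXAMPLES = [
    THPair(
        text=("The drugs that slow down or halt Alzheimer's disease work "
              "best the earlier you administer them."),
        hypothesis="Alzheimer's disease is treated using drugs.",
        entails=True,
    ),
    THPair(
        text=("Drew Walker, NHS Tayside's public health director, said: "
              "\"It is important to stress that this is not a confirmed "
              "case of rabies.\""),
        hypothesis="A case of rabies was confirmed.",
        entails=False,
    ),
    THPair(
        text=("Yoko Ono unveiled a bronze statue of her late husband, John "
              "Lennon, to complete the official renaming of England's "
              "Liverpool Airport as Liverpool John Lennon Airport."),
        hypothesis="Yoko Ono is John Lennon's widow.",
        entails=True,
    ),
    THPair(
        text=("Arabic, for example, is used densely across North Africa "
              "and from the Eastern Mediterranean to the Philippines, as "
              "the key language of the Arab world and the primary vehicle "
              "of Islam."),
        hypothesis="Arabic is the primary language of the Philippines.",
        entails=False,
    ),
]


def accuracy(predictions, pairs):
    """Fraction of pairs whose predicted judgement matches the gold one."""
    if not pairs:
        raise ValueError("no pairs to score")
    if len(predictions) != len(pairs):
        raise ValueError("need exactly one prediction per pair")
    correct = sum(pred == pair.entails for pred, pair in zip(predictions, pairs))
    return correct / len(pairs)


def always_yes(pair):
    """Trivial placeholder system (an assumption): always predicts entailment."""
    return True


if __name__ == "__main__":
    predictions = [always_yes(pair) for pair in EXAMPLES]
    print(accuracy(predictions, EXAMPLES))  # 2 of the 4 gold labels are Yes -> 0.5
```

    A real system would replace always_yes with a function that inspects
    pair.text and pair.hypothesis before deciding.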

    IMPORTANT DATES
    ---------------
    Release of Development Set                   October 26, 2005
    Release of Test Set                          January 12, 2006
    Deadline for participants' submissions       February 2, 2006
    Release of individual results                February 7, 2006
    Deadline for participants' reports           February 21, 2006
    Camera-ready version of reports              March 14, 2006
    PASCAL Challenges Workshop (Venice, Italy)   April 10, 2006

    Note: the workshop is scheduled right after EACL.

    Challenge Organisers
    --------------------
    Bar-Ilan University, Israel (Coordinator):
            Ido Dagan,
            Roy Bar-Haim,
            Idan Szpektor

    CELCT, Trento, Italy:
            Bernardo Magnini,
            Danilo Giampiccolo

    Microsoft Research, USA:
            Bill Dolan

    MITRE, USA:
            Lisa Ferro

    SUPPORT
    -------
    The preparation and running of this challenge have been supported by the
    EU-funded PASCAL Network of Excellence on Pattern Analysis, Statistical
    Modelling and Computational Learning.

    For registration, further information and inquiries, please visit
    the challenge web site:
    http://www.pascal-network.org/Challenges/RTE2
    Contact: Roy Bar-Haim <barhair@cs.biu.ac.il>


