[Corpora-List] Call for Interest in Participation for SemEval Arabic Semantic Labeling

From: Mona Diab (mdiab@cs.columbia.edu)
Date: Wed Nov 29 2006 - 19:13:10 MET


    [Apologies for duplications]
    [Please distribute widely]

    If you're interested in participating in the task of Arabic Semantic
    Labeling as part of SemEval-2007, please fill in the following form before
    December 1, 2006.
     
    http://nlp.cs.swarthmore.edu/semeval/interest.shtml

    The task description can be found at

    http://nlp.cs.swarthmore.edu/semeval/tasks/task18/description.shtml

    It is reproduced below for your convenience.
    *************************************************************************

    Task #18: Arabic Semantic Labeling

    Organizers

    Mona Diab (Columbia University)
    Christiane Fellbaum (Princeton University)
    Mohamed Maamouri (LDC, University of Pennsylvania)
    Martha Palmer (University of Colorado, Boulder)

    Tasks

    We propose several tasks for Arabic Semantic Labeling. The tasks will span
    both the WSD and Semantic Role labeling processes for this evaluation. Both
    sets of tasks will be evaluated on data derived from the same data set, the
    test set.

    *Word Sense Disambiguation

    We propose 3 subtasks for WSD, all of which will have only test data for
    evaluation and trial data for formatting purposes:

    1. The first task is to discover different senses in the data for nouns and
    verbs without associating labels with those senses. Therefore it is a sense
    discrimination task.

    In this task the participants will be required to identify the
    different senses for nouns and verbs without associating labels with those
    identified senses. These senses will be derived from the Arabic WordNet.
    There will be two levels of granularity, coarse and fine grain. The coarse
    grained senses will be numbered 1, 2, 3, etc. while the fine grained senses
    will be numbered 1.1, 1.2, 1.3, 2.1, 2.2, etc. The results will be evaluated
    on both granularities.
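
    As an illustration of the two granularities, here is a minimal Python
    sketch. It assumes (the description above implies this but does not state
    it) that a fine grained sense N.M belongs to coarse grained sense N. The
    function names are hypothetical and are not part of any task scorer.

```python
def coarse_label(fine_label: str) -> str:
    """Return the coarse sense number for a fine grained label such as '1.2'.

    Assumption: the part before the first dot names the coarse sense.
    A label with no dot is treated as already coarse.
    """
    return fine_label.split(".", 1)[0]


def group_by_coarse(fine_labels):
    """Group fine grained labels under their assumed coarse sense."""
    groups = {}
    for label in fine_labels:
        groups.setdefault(coarse_label(label), []).append(label)
    return groups


fine = ["1.1", "1.2", "1.3", "2.1", "2.2"]
groups = group_by_coarse(fine)
```

    Under that assumption, a system's fine grained output could be collapsed
    to coarse labels before scoring at the coarse level.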

    2. The second task is to annotate all nouns and verbs in the data with
    Arabic WordNet senses (tentative).

    All verbs and nouns in the data will need to be annotated with their labels
    from Arabic WordNet.

    3. The third task is to annotate all nouns and verbs in the data with
    English WordNet senses.

    a. In this task, the participants will be required to link the Arabic nouns
    and verbs with their corresponding sense(s) in the English WordNet

    b. An English translation corpus will be provided along with the trial/test
    data

    c. A bilingual word list will also be provided

    *Semantic Role Labeling

    We propose 3 subtasks for Semantic Role Labeling (SRL). These subtasks will
    have trial, training and test data available:

    4. Identifying Arguments in a sentence.

    In this task, the participants are required to identify all the constituents
    in a constituency tree that should be annotated with argument roles related
    to some predetermined verbs

    5. Automatic annotations for numbered arguments.

    In this task, the participants are required to identify and label the
    constituents in a constituency tree that should be annotated with numbered
    argument roles related to some predetermined verbs

    6. Automatic annotations for all arguments.

    In this task, the participants are required to identify and label all the
    constituents in a constituency tree that should be annotated with both
    numbered argument roles and ARGM roles related to some predetermined verbs
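
    The difference between subtasks 4, 5 and 6 can be sketched as follows.
    This is only an illustration: the PropBank-style label strings (ARG0,
    ARG1, ARGM-TMP), the representation of a constituent as a (start, end)
    token span, and the function name are all assumptions, not the official
    data format.

```python
import re

# Assumed label shape for numbered arguments: ARG0, ARG1, ARG2, ...
NUMBERED = re.compile(r"^ARG\d+$")


def targets_for_subtask(arguments, subtask):
    """Derive what a system is expected to produce for one predicate.

    arguments: iterable of (span, label) pairs from a gold annotation,
               where span is a hypothetical (start, end) token range.
    subtask:   4 (identify only), 5 (numbered arguments), 6 (all arguments).
    """
    arguments = list(arguments)
    if subtask == 4:
        # Identification only: the constituents, without role labels.
        return {span for span, _ in arguments}
    if subtask == 5:
        # Identification plus labels, restricted to numbered arguments.
        return {(span, label) for span, label in arguments
                if NUMBERED.match(label)}
    if subtask == 6:
        # Identification plus labels for numbered and ARGM roles alike.
        return set(arguments)
    raise ValueError(f"unknown SRL subtask: {subtask}")


gold = [((0, 2), "ARG0"), ((3, 5), "ARG1"), ((6, 8), "ARGM-TMP")]
spans_only = targets_for_subtask(gold, 4)
numbered_only = targets_for_subtask(gold, 5)
all_roles = targets_for_subtask(gold, 6)
```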

    Combination Tasks

    7. Finally, we will propose some subtasks to combine any of the WSD tasks
    with any of the SRL tasks. (more to come on this)

    Data
    The data will be Arabic Treebank 3 v.2 data, which is newswire in Modern
    Standard Arabic. We will use only the 100 most frequent verbs in this set to
    draw training, trial (for the semantic role labeling tasks) and test data
    (for the semantic role labeling and WSD tasks). The data is manually
    annotated for syntax and morphology. The syntactic trees are constituency
    trees. A preliminary version of the Arabic WordNet will be available.
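
    Selecting the most frequent verbs can be sketched as below. This is not
    the organizers' procedure. It assumes tokens are available as
    (lemma, part-of-speech) pairs, and the "VERB" tag prefix and the toy
    transliterated lemmas are placeholders rather than the actual Arabic
    Treebank tag set.

```python
from collections import Counter


def most_frequent_verbs(tagged_tokens, n=100, verb_prefix="VERB"):
    """Return the n most frequent verb lemmas, most frequent first.

    tagged_tokens: iterable of (lemma, pos) pairs (hypothetical format).
    verb_prefix:   placeholder prefix marking verb tags.
    """
    counts = Counter(lemma for lemma, pos in tagged_tokens
                     if pos.startswith(verb_prefix))
    return [lemma for lemma, _ in counts.most_common(n)]


tokens = [("qAl", "VERB"), ("kAn", "VERB"), ("qAl", "VERB"),
          ("kitAb", "NOUN")]
top_verbs = most_frequent_verbs(tokens, n=2)
```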

    Evaluation metric

    SRL: Conlleval metrics of precision, recall and f-measure
    WSD: Scorer metrics of precision, recall and f-measure on both coarse and
    fine grained sense distinctions.
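
    For reference, precision, recall and f-measure over sets of predicted
    and gold items follow the standard definitions sketched below. This is
    not the conlleval script or the official WSD scorer; the item format
    (an identifier paired with a label) and the balanced F1 weighting are
    assumptions made for illustration.

```python
def precision_recall_f1(gold, predicted):
    """Compute precision, recall and balanced f-measure (F1).

    gold, predicted: iterables of hashable items, for example
                     (constituent_id, role) or (token_id, sense) pairs.
    """
    gold, predicted = set(gold), set(predicted)
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


gold = {("s1", "ARG0"), ("s2", "ARG1"), ("s3", "ARGM-TMP"), ("s4", "ARG2")}
pred = {("s1", "ARG0"), ("s2", "ARG1"), ("s5", "ARG1")}
p, r, f = precision_recall_f1(gold, pred)
```

    In this toy example two of the three predictions are correct and two of
    the four gold items are found, so precision is 2/3, recall is 1/2 and
    f-measure is 4/7.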

    Dates
     
    Jan 1st Delivering trial data
    March 1st Delivering the training and test data
    ****************************************************************************
    Mona T. Diab, PhD
    Computational Linguistics Group (CADIM)
    Center for Computational Learning Systems
    Columbia University

    Tel.: +1 212 870 1290
    Fax: +1 212 870 1285
    http://www.cs.columbia.edu/~mdiab

    > -----Original Message-----
    > From: owner-corpora@lists.uib.no [mailto:owner-corpora@lists.uib.no] On
    > Behalf Of Ken Litkowski
    > Sent: Wednesday, November 29, 2006 11:38 AM
    > To: corpora@hd.uib.no
    > Subject: [Corpora-List] Call for Interest in Participation for SemEval
    > Preposition WSD
    >
    > If you're interested in participating in the Word-Sense Disambiguation of
    > Prepositions as part of SemEval-2007, please fill in the following form
    > before December 1, 2006.
    >
    > http://nlp.cs.swarthmore.edu/semeval/interest.shtml
    >
    > Task #6: Word-Sense Disambiguation of Prepositions
    >
    > http://nlp.cs.swarthmore.edu/semeval/tasks/task06/description.shtml
    >
    > Abstract
    >
    > A major research topic in computational linguistics is semantic role
    > labeling. Prepositions are important mediators of semantic roles,
    > particularly oblique arguments and adjuncts. As part of The Preposition
    > Project, preposition instances from FrameNet have been tagged with
    > senses from the Oxford Dictionary of English. All major English
    > prepositions are included, with the number of instances ranging from 100
    > to over 4000 for the preposition "of". Additional data on each instance
    > (e.g., FrameNet frame element) has been generated and is available for
    > use in this disambiguation task. Since prepositions are a closed class,
    > this SemEval task has the potential for creating fundamental
    > characterizations of their behavior.
    >
    > Organizer
    > Ken Litkowski
    > CL Research
    >
    > --
    > Ken Litkowski TEL.: 301-482-0237
    > CL Research EMAIL: ken@clres.com
    > 9208 Gue Road
    > Damascus, MD 20872-1025 USA Home Page: http://www.clres.com


