[Corpora-List] Summary of responses: Pragmatic annotations

From: Victoria López (mavilos@terra.es)
Date: Wed Sep 14 2005 - 10:36:27 MET DST

    Hello,

    Two weeks ago I posted a question about pragmatic annotations. Thanks to all
    of those who responded. Here's a brief summary.

    'Further levels of annotation' by Geoffrey Leech, Tony McEnery and
    Martin Wynne, in Corpus Annotation, edited by Roger Garside, Geoffrey
    Leech and Anthony McEnery, Longman, Harlow, 1997.

    -----------------------------------------------
    ACL workshop on discourse annotation?

    http://www.cllt.osu.edu/dbyron/acl04/

    ------------------------------------------------
    Some exploratory experiments regarding
    general-knowledge-based cohesion in texts:

    Beigman Klebanov, B., 2005.
    "Using Readers to Identify Lexical Cohesive Structures in Texts"
    In Proceedings of ACL-2005 Student Session, Ann Arbor, USA, June 2005,
    pp. 55-60.

    The annotation guidelines we've given to the subjects can be found on my
    webpage: http://www.cs.huji.ac.il/~beata

    ---------------------------------------------------
    The work of Samuel et al. at COLING in Montreal (1998?). It has come quite a
    way since then, with lots of people joining in. Below are a few references to
    work at Sheffield, which gets good results from rather simpler classifier
    training than is usual:

    Webb, N., M. Hepple and Y. Wilks (2005)
    Error Analysis of Dialogue Act Classification, in Proceedings of the 8th
    International Conference on Text, Speech and Dialogue, Carlsbad, Czech
    Republic.

    Webb, N., M. Hepple and Y. Wilks (2005)
    Empirical determination of thresholds for optimal dialogue act
    classification, in Proceedings of the Ninth Workshop on the Semantics and
    Pragmatics of Dialogue (SemDial), Nancy.

    Webb, N., M. Hepple and Y. Wilks (2005)
    Dialogue Act Classification using Intra-Utterance Features, in Proceedings
    of the AAAI Workshop on Spoken Language Understanding, Pittsburgh.

    Webb, N., H. Hardy, C. Ursu, M. Wu, T. Strzalkowski and Y. Wilks (2005)
    Data-Driven Language Understanding for Spoken Language Dialogue, in
    Proceedings of the AAAI Workshop on Spoken Language Understanding,
    Pittsburgh, 2005.
    -----------------------------------------------------------
    I don't know if you count coreference as pragmatics, but you could look at
    Aone and Bennett's (1994) Discourse Tagging Tool, the Alembic Workbench, and
    CLinkA. There was also a workshop at ACL on frontiers in annotation
    (http://nlp.cs.nyu.edu/meyers/frontiers/2005.html), which might have some
    useful pointers.

    ------------------------------------------------------------
    Popescu-Belis et al., 2003, A Thematic Bibliography on Dialogue Processing.
    Section 3.4 on Dialogue data and annotation.
    http://www.issco.unige.ch/projects/im2/mdm/docs/biblio/mdm-biblio.html

    Dhillon et al., 2004, Meeting Recorder Project: Dialogue Act Labeling Guide
    http://www.icsi.berkeley.edu/ftp/global/pub/speech/papers/MRDA-manual.pdf

    Stolcke et al., 2000, Dialogue Act Modeling for Automatic Tagging and
    Recognition of Conversational Speech. Computational Linguistics 26(3),
    339-373.

    Jurafsky et al., 1997, Switchboard SWBD-DAMSL Shallow-Discourse-Function
    Annotation
    http://www.colorado.edu/ling/jurafsky/manual.august1.html

    Carletta et al., 1996, HCRC Dialogue Structure Coding Manual
    http://www.hcrc.ed.ac.uk/publications/tr-82.ps.gz

    ---------------------------------------------------------------

    May I call your attention to work we at Tilburg University have done on
    classifying dialogue acts in spoken dialogues.
    We applied machine learning to a Dutch corpus of human-machine dialogues
    conducted with a spoken dialogue system.
    We used a small, domain-specific tagset that covered different aspects
    of pragmatic and semantic phenomena.

    You may find our related publications at
    http://ilk.kub.nl/~piroska/research.htm, such as:

    # P. Lendvai, A. van den Bosch: /Robust ASR lattice representation types
    in pragma-semantic processing of spoken input./ In: Proc. of the AAAI
    Spoken Language Understanding Workshop, SLU-2005, Pittsburgh, PA, 2005,
    pages 15-22.

    # P. Lendvai: /Extracting Information from Spoken User Input. A Machine
    Learning Approach./ Ph.D. thesis, Tilburg University, Netherlands, 2004.

    # P. Lendvai, A. van den Bosch, E. Krahmer: /Machine Learning for Shallow
    Interpretation of User Utterances in Spoken Dialogue Systems./ In: Proc.
    of EACL-03 Workshop on Dialogue Systems: interaction, adaptation and
    styles of management. Budapest, Hungary, 2003, pages 69-78.

    # P. Lendvai, A. van den Bosch, E. Krahmer, M. Swerts: /Multi-feature
    error detection./ In: Theune, M., Nijholt, A. & Hondorp, H. (Eds.),
    Language and Computers: Studies in Practical Linguistics (pp. 163-178).
    Amsterdam: Rodopi. 2002.

    # P. Lendvai, A. van den Bosch, E. Krahmer, M. Swerts:
    /Improving machine-learned detection of miscommunications in
    human-machine dialogues through informed data splitting./ In: Kuebler,
    S. & Hinrichs, E. (Eds.), Machine Learning Approaches in Computational
    Linguistics (pp. 1-15). Trento, Italy: ESSLLI. 2002.

    -----------------------------------------------------------------------------

    There's an article by Lampert and Ervin-Tripp in _Talking Data:
    Transcription and Coding for Discourse Research_, 1993 (edited by Martin
    Lampert and me), which describes principles for designing, implementing and
    evaluating a system of codes (including intercoder reliability). It is
    illustrated by examples of coding of control acts in children.

    For an array of different types of coding,
    I'd recommend the deliverables from the MATE project, which are
    available online.


