Corpora: CFP: Layout seminar

Adam Kilgarriff (
Mon, 25 Jan 1999 10:48:37 +0000 (GMT)

(Apologies to those of you who receive multiple copies of this posting.)


"Using Layout for the Generation, Understanding or Retrieval
of Documents"

AAAI 1999 Fall Symposium
November 5-7, 1999
Sea Crest Conference Center on Cape Cod
North Falmouth, Massachusetts

Layout clearly plays a role in text comprehension and, concomitantly,
in the way in which text ought to be generated. It also contributes to
the identification of classes of documents (e.g., business letters
versus journal articles versus user manuals), parts of documents
(e.g., the sports pages versus the classified advertisements of a
newspaper) and types of information contained in a document (e.g.,
subsidiary information in footnotes versus primary information in
titles; paragraph breaks as topic breaks). Nevertheless, the issue of
layout has been largely ignored in computational linguistics and
information retrieval: few, if any, natural language generation
systems produce (except in the most rudimentary way) laid-out text;
probably no natural language understanding system includes layout as
an input feature; possibly no information retrieval or information
extraction system makes more than cursory use of layout. Furthermore,
of the growing corpora of on-line texts, none to our knowledge makes
more than a passing stab at including the layout of their source
documents.

The rapidly expanding use of SGML and HTML in source documents is,
however, now making layout a much more accessible feature for study
and for computational treatment than it has been previously, and
therefore increasingly available for use in natural language
processing and information retrieval.

The symposium will provide a discussion forum for emerging work on the
following issues:

- the parameters of layout

- the interactions between layout and
  - information structure
  - discourse structure
  - document structure
  - genre
  - grammaticality
  - punctuation
  - referring expressions
  - linguistic style

- the influence of layout on text comprehension

- corpus annotation schemes for document layout

- integrating text and graphics in documents

- implementation issues (e.g., treatment of local vs global layout
features; relation between layout realisation and syntactic


We invite applications for participation in one of two formats:

- An extended abstract of up to 5 pages describing completed work or
work in progress.
- A statement of 1 page describing your interest in this area and
including, where possible, any relevant publications.

Submissions should be made electronically in one of the following
forms: ASCII, PostScript, self-contained LaTeX, HTML or RTF. They are
to be sent to

The symposium will be limited to between forty and sixty
participants. Working notes will be prepared and distributed to
participants in each symposium. In addition to invited participants, a
limited number of interested parties will be able to register in each
symposium on a first-come, first-served basis. Registration
information will be available in early July at

Submission Dates

Submissions for the symposia are due March 31, 1999.

Notification of acceptance will be given by May 7, 1999.

Material to be included in the working notes of the symposium must be
received by August 27, 1999.

Organising Committee

John Carroll, University of Sussex;
Robert Dale, Microsoft Research Institute;
Winfried Graf, Kienbaum Management Consultants GmbH;
Matthew Hurst, University of Edinburgh;
Geoff Nunberg, Xerox PARC;
Richard Power (co-chair), University of Brighton;
Donia Scott (co-chair), University of Brighton;
Karen Sparck Jones, Cambridge University.