SIGIR'04 Workshop
Call for Participation
July 29, 2004, Sheffield, UK
For registration details see
IMPORTANT: You do not have to register for the full SIGIR conference
to register and attend the IR4QA workshop.
Open domain question answering has become a very active research area
over the past few years, due in large measure to the stimulus of the
TREC Question Answering track. This track addresses the task of
finding *answers* to natural language (NL) questions (e.g. ``How
tall is the Eiffel Tower?'', ``Who is Aaron Copland?'') from large text
collections. This task stands in contrast to the more conventional IR
task of retrieving *documents* relevant to a query, where the
query may be simply a collection of keywords (e.g. ``Eiffel Tower'',
``American composer, born Brooklyn NY 1900, ...'').
Finding answers requires processing texts at a level of detail that
cannot be carried out at retrieval time for very large text
collections. This limitation has led many researchers to propose,
broadly, a two-stage approach to the QA task. In stage one a subset of
query-relevant texts is selected from the whole collection. In stage
two this subset is subjected to detailed processing for answer
extraction. To date stage one has received limited explicit attention,
despite its obvious importance -- performance at stage two is bounded
by performance at stage one. The goal of this workshop is to correct
this situation and, hopefully, to draw the attention of IR researchers to
the specific challenges raised by QA.
A straightforward approach to stage one is to employ a conventional IR
engine, using the NL question as the query and with the collection
indexed in the standard manner, to retrieve the initial set of
candidate answer-bearing documents for stage two. However, a number
of possibilities arise to optimise this set-up for QA, including:
o preprocessing the question in creating the IR query;
o preprocessing the collection to identify significant information that
can be included in the indexation for retrieval;
o adapting the similarity metric used in selecting documents;
o modifying the form of retrieval return, e.g. to deliver passages
rather than whole documents.
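As a rough illustration only, the stage-one options listed above
(question preprocessing, a similarity metric, and passage-level return)
might be sketched as follows. This is a toy under stated assumptions, not
the method of any system or paper mentioned here: the stopword list, the
sentence-window passages, the term-overlap score, and all function names
are hypothetical.

```python
import re

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"a", "an", "the", "is", "of", "to", "in",
             "how", "who", "what", "where", "when"}

def tokenize(text):
    """Lowercase the text and return its alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def question_to_query(question):
    """Question preprocessing: drop stopwords to form a keyword query."""
    return [t for t in tokenize(question) if t not in STOPWORDS]

def split_passages(document, size=2):
    """Passage return: split a document into windows of `size` sentences."""
    sentences = [s.strip()
                 for s in re.split(r"(?<=[.!?])\s+", document)
                 if s.strip()]
    return [" ".join(sentences[i:i + size])
            for i in range(0, len(sentences), size)]

def score(query_terms, passage):
    """Toy similarity: count of distinct query terms found in the passage."""
    passage_terms = set(tokenize(passage))
    return sum(1 for t in set(query_terms) if t in passage_terms)

def stage_one(question, collection, k=2):
    """Return up to k passages that share at least one term with the query."""
    query = question_to_query(question)
    passages = [p for doc in collection for p in split_passages(doc)]
    ranked = sorted(passages, key=lambda p: score(query, p), reverse=True)
    return [p for p in ranked[:k] if score(query, p) > 0]

collection = [
    "The Eiffel Tower is in Paris. It is 324 metres tall.",
    "Aaron Copland was an American composer. He was born in Brooklyn.",
]
candidates = stage_one("How tall is the Eiffel Tower?", collection)
```

The passages in `candidates` are what stage two would then process in
detail for answer extraction. An answer that is missing from them cannot
be recovered later, which is the sense in which stage one bounds stage two.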
The workshop will consist of presentations of the following
accepted papers:
o What Works Better for Question Answering: Stemming or
Morphological Query Expansion
Matthew W. Bilotti, Boris Katz and Jimmy Lin
o A Comparative Study on Sentence Retrieval for Definitional
Question Answering
Hang Cui, Min-Yen Kan, Tat-Seng Chua and Jing Xiao
o Using Pertainyms to Improve Passage Retrieval for Questions
Requesting Information About a Location
Mark A. Greenwood
o Minimal Span Weighting Retrieval for Question Answering
Christof Monz
o Simple Translation Models for Passage Retrieval for QA
Vanessa Murdock and W. Bruce Croft
o Sense-Based Blind Relevance Feedback for Question Answering
Matteo Negri
o Exploring the Performance of Boolean Retrieval Strategies
For Open Domain Question Answering
Horacio Saggion, Rob Gaizauskas, Mark Hepple,
Ian Roberts and Mark A. Greenwood
o Boosting Weak Ranking Functions to Enhance Passage Retrieval
For Question Answering
Nicolas Usunier, Massih R. Amini and Patrick Gallinari
o Seeking an Upper Bound to Sentence Level Retrieval in
Question Answering
Kieran White and Richard F. E. Sutcliffe
o Domain-Specific QA for the Construction Sector
Zhuo Zhang, Lyne Da Sylva, Colin Davidson, Gonzalo Lizarralde
and Jian-Yun Nie
Workshop Organizers
Rob Gaizauskas (University of Sheffield)
Mark Hepple (University of Sheffield)
Mark Greenwood (University of Sheffield)
Programme Committee
Shannon Bradshaw (University of Iowa)
Charles Clarke (University of Waterloo)
Sanda Harabagiu (University of Texas at Dallas)
Eduard Hovy (University of Southern California)
Jimmy Lin (Massachusetts Institute of Technology)
Christof Monz (University of Maryland)
John Prager (IBM)
Dragomir Radev (University of Michigan)
Maarten de Rijke (University of Amsterdam)
Horacio Saggion (University of Sheffield)
Karen Sparck-Jones (University of Cambridge)
Tomek Strzalkowski (State University of New York, Albany)
Ellen Voorhees (NIST)