Call for participation: 3rd Speak! workshop on Dialogue Systems

Dr. John Bateman (bateman@darmstadt.gmd.de)
Tue, 2 Jul 1996 18:33:53 +0200

3RD `SPEAK!' WORKSHOP: SPEECH GENERATION IN MULTIMODAL INFORMATION
SYSTEMS AND PRACTICAL APPLICATIONS

12 August, 1996

Budapest, Hungary

In parallel with ECAI '96,
preceding the ECAI '96 satellite workshop on Dialogue Processing
in Spoken Language Systems

******************** CALL FOR CONTRIBUTIONS ********************

This workshop aims to bring together researchers, developers, and
potential producers and marketers of multimodal information systems in
order to consider the role of *spoken language synthesis* in such
systems. Not only do we need to be able to produce spoken language
appropriately - including effective control of intonation - but we
also need to know in which practical contexts spoken language is most
beneficial. This requires a dialogue between those providing spoken
natural language technology and those considering the practical use of
multimodal information systems.

The workshop will consist of paper presentations and practical
demonstrations, as well as a roundtable discussion on the best
strategies for pursuing the practical application of spoken language
technology in information systems.

Suggested Topic Areas/Themes include, but are not limited to:

* functional control of intonation in synthesized speech

* use of speech in intelligent interfaces for information systems

* integration of speech into automatic query systems

* telecommunications applications

* cooperative integration of speech with text generation for
information systems

* evaluation strategies for information systems involving speech
synthesis

* applications for information systems with spoken language output
capabilities

* practical requirements for information systems with spoken language
capabilities.

Potential participants are invited to submit short statements of
interest indicating whether they would be interested in presenting a
paper, offering a system demonstration, participating in the round
table discussion, or simply attending.

Statements of interest should be sent as soon as possible, followed,
where appropriate, by extended abstracts (max. 7 pages) by
1st August by e-mail to: `nemeth@ttt.bme.hu' or by post to: Géza
Németh, Dept. of Telecommunications and Telematics, TU Budapest,
Sztoczek u. 2, Budapest, Hungary H-1111.

Extended abstracts will be made available at the workshop.

During the workshop current results and demonstrations of the EU
Copernicus Programme Project `Speak!' will also be given (see
attachment).

The workshop will be held in a historic building in Buda castle,
housing the Phonetic Laboratory of the Hungarian Academy of Sciences.
Participation is free of charge.

Contact:

Géza Németh
Dept. of Telecommunications and Telematics
TU Budapest
Sztoczek u. 2.
Budapest Hungary H-1111
E-mail: NEMETH@ttt.bme.hu
Fax: +36/1-463-3107
Phone: +36/1-463 2401

-----------------------------------------------------------------

Project Information:

The `SPEAK!' Project:
Speech Generation in Multimodal Information Systems

`SPEAK!' is a European Union funded project (COPERNICUS '93 Project
No. 10393) whose aim is to integrate spoken natural language synthesis
technology into sophisticated user interfaces in order to improve
access to information systems.

Multimedia technology and knowledge-based text processing enable the
development of new types of information systems which not only offer
references or full-text documents to the user but also provide access
to images, graphics, audio, and video documents. This diversification
of the information offered has to be supported by easy-to-use
multimodal user interfaces, which are capable of presenting each type
of information item in a way that can be perceived and processed
effectively by the user.

Users can easily process the graphical and linguistic media of
information presentation simultaneously. This separation of modes is
also well suited to the different functionalities of the main
graphical interaction and the supportive meta-dialogue carried out
linguistically. We believe, therefore, that a substantial improvement
in both functionality and user acceptance can be achieved by the
integration of spoken language capabilities.

However, text-to-speech devices commercially available today produce
speech that sounds unnatural and that is hard to listen to. High
quality synthesized speech that sounds acceptable to humans demands
appropriate intonation patterns. The effective control of intonation
requires synthesizing from meanings, rather than word sequences, and
requires understanding of the functions of intonation. In the domain of
sophisticated human-machine interfaces, we can make use of the
increasing tendency to design such interfaces as independent agents
that themselves engage in an interactive dialogue (both graphical and
linguistic) with their users. Such agents need to maintain models of
their discourses, their users, and their communicative goals.

The `SPEAK!' project, which was launched as a cooperation between the
Speech Research Technology Laboratory of the TECHNICAL UNIVERSITY OF
BUDAPEST and the UNIVERSITY OF DARMSTADT (in cooperation with
GMD-IPSI), is developing such an interface for a multimedia information
retrieval system. The speech synthesizer used is the MULTIVOX TTS
developed by the TU Budapest. At GMD-IPSI, the departments KOMET
(natural language generation) and MIND (information retrieval
dialogues) contribute to this project.

A proof-of-concept prototype of a multimodal information system is
being implemented, which combines graphical input and spoken language
output in a variety of languages. The work involves three supporting
goals: first, to advance the state of the art in the domains of speech
synthesis, spoken text generation, and graphical interface design;
second, to provide enabling technology for higher-functionality
information systems that are more appropriate for general public use;
third, to significantly improve the public and industrial acceptance of
speech synthesis in general and the Hungarian text-to-speech technology
elaborated within the project in particular.