New Release from the LDC

LDC Office (ldc@unagi.cis.upenn.edu)
Fri, 31 May 1996 12:30:51 EDT

Announcing a NEW RELEASE from the
LINGUISTIC DATA CONSORTIUM

Acoustic-Phonetic Continuous Speech Corpus
Far Field Microphone Recordings

FFMTIMIT

The FFMTIMIT corpus contains the previously-unreleased secondary
microphone waveforms for the TIMIT Acoustic-Phonetic Continuous Speech
corpus. The primary microphone waveforms, which were recorded using a
close-talking noise-cancelling head-mounted Sennheiser microphone
(model HMD-414), are available from the LDC on NIST Speech Disc 1-1.1
(LDC93S1). The secondary microphone used in the recording of the
TIMIT corpus was a Bruel & Kjaer 1/2" free-field microphone (model
4165).

While the Sennheiser microphone recordings are relatively "clean" with
respect to non-speech noise, the FFMTIMIT recordings include
significant low frequency noise, which was due to the HVAC system and
mechanical vibration transmitted through the floor of the
double-walled sound booth used in recording. Because it is noisier
than its TIMIT counterpart, the data of FFMTIMIT may be used in the
development of more noise-robust speech recognition systems. In
addition, this data may be of value to researchers involved in vocal
tract modeling because the B&K microphone has extremely flat
free-field frequency response and calibration tones are provided.

Note that the B&K TIMIT data contained in this release has not been
processed through any highpass filter (e.g., the 1581-point filter
described in the paper "The DARPA Speech Recognition Research
Database" by Fisher, Doddington and Goudie-Marshall in "DARPA TIMIT
Acoustic-Phonetic Continuous Speech Corpus CD-ROM," NISTIR 4930 / NTIS
Order No. PB93-173938).
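
For users who want to attenuate the low-frequency noise themselves, the
following is a minimal sketch of a linear-phase highpass FIR filter built
with the windowed-sinc method. It is NOT the filter described in the
paper cited above: the coefficients are not taken from that paper, and
the 16 kHz sample rate, the 70 Hz cutoff, the Hamming window, and the
function names highpass_fir and apply_fir are all assumptions made for
illustration. Only the tap count of 1581 echoes the filter length
mentioned above.

```python
import math


def highpass_fir(num_taps, cutoff_hz, sample_rate_hz):
    """Design a linear-phase highpass FIR filter (windowed-sinc, Hamming).

    A lowpass prototype is designed first and then turned into a highpass
    filter by spectral inversion (unit impulse minus the lowpass).
    """
    if num_taps < 3 or num_taps % 2 == 0:
        raise ValueError("num_taps must be odd and at least 3")
    fc = cutoff_hz / sample_rate_hz  # cutoff in cycles per sample
    mid = (num_taps - 1) // 2

    lowpass = []
    for n in range(num_taps):
        k = n - mid
        if k == 0:
            ideal = 2.0 * fc
        else:
            ideal = math.sin(2.0 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        lowpass.append(ideal * window)

    # Normalize the prototype to unity gain at 0 Hz so that the inverted
    # filter has zero gain at 0 Hz.
    gain = sum(lowpass)
    lowpass = [c / gain for c in lowpass]

    highpass = [-c for c in lowpass]
    highpass[mid] += 1.0
    return highpass


def apply_fir(samples, taps):
    """Filter samples with taps; output has the same length as the input.

    The output is aligned with the input (the filter delay is removed)
    and the signal is treated as zero outside its boundaries.
    """
    mid = (len(taps) - 1) // 2
    count = len(samples)
    out = []
    for i in range(count):
        acc = 0.0
        for j, tap in enumerate(taps):
            idx = i + mid - j
            if 0 <= idx < count:
                acc += tap * samples[idx]
        out.append(acc)
    return out


# Assumed values, for illustration only: 16 kHz audio, 70 Hz cutoff.
example_taps = highpass_fir(1581, 70.0, 16000.0)
```

The pure-Python convolution in apply_fir is slow for full-length
waveforms; it is written this way only to keep the sketch free of
external dependencies. An array library would be the practical choice
for processing the whole corpus.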

Institutions that have membership in the LDC during the 1996
Membership Year will be able to receive FFMTIMIT at no additional
charge, in the same manner as all other text and speech corpora
published by the LDC.

Nonmembers can receive a copy of FFMTIMIT for research purposes only
for a fee of $100. If you would like to order a copy of this corpus,
please email your request to ldc@unagi.cis.upenn.edu. If you need
additional information before placing your order, or would like to
inquire about membership in the LDC, please send email or call (215)
898-0464.

Further information about the LDC and its available corpora can be
accessed on the Linguistic Data Consortium WWW Home Page at URL
http://www.cis.upenn.edu/~ldc. Information is also available via ftp
at ftp.cis.upenn.edu under pub/ldc; for ftp access, please use
"anonymous" as your login name, and give your email address when asked
for a password.