Re: [Corpora-List] Question concerning audio file search

From: Doug Cooper (doug@th.net)
Date: Thu Dec 21 2006 - 09:50:19 MET

    You might want to check the DAISY Consortium site, especially
    the tools area (http://www.daisy.org/tools/). The consortium
    produces both open tools and standards for digital talking book
    data (especially for blind readers), including recorded speech.

       On a related topic, I recently built an audio corpus tool to locate
    single words in Thai texts that were recorded by many speakers and
    transcribed, with audio and text aligned at either the sentence or
    the short-paragraph level.

        It turned out that the naive approach -- using the relative
    character-count position of a search string within the larger
    transcription to locate the corresponding spoken word within
    the recording of that segment -- worked reasonably well, given
    a +/- 1.25-second window. One critical requirement was getting
    rid of pauses in the sound files. For my data, applying SOX
    "silence" with these parameters worked pretty well at normalizing
    speaking rates without introducing artifacts:

       sox -V a.wav out.wav silence 1 0:0:0.1 -55d -1 0:0:0.1 -55d
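
       For anyone who wants to try the character-count idea, here is a
    minimal Python sketch of how the estimate might be computed. It is
    not the code from my tool: the function name, its parameters, and
    the choice to centre the window on the midpoint of the match are
    all assumptions made for illustration. It also assumes the pauses
    have already been stripped from the segment, so that time is roughly
    proportional to character position.

```python
def estimate_window(transcript, query, duration_s, half_window_s=1.25):
    """Guess where `query` is spoken inside one aligned audio segment.

    Returns (start, end) in seconds, or None if the query is not found.
    Hypothetical helper: assumes speech time is roughly proportional
    to character position within the transcript of the segment.
    """
    if not transcript or not query:
        return None
    idx = transcript.find(query)
    if idx < 0:
        return None
    # Assumption: aim at the midpoint of the matched string.
    centre_chars = idx + len(query) / 2
    centre_s = duration_s * centre_chars / len(transcript)
    # Clamp the +/- window to the bounds of the recording.
    start = max(0.0, centre_s - half_window_s)
    end = min(duration_s, centre_s + half_window_s)
    return start, end


if __name__ == "__main__":
    # Toy example: a 19-character "transcript" and a 19-second segment.
    print(estimate_window("aaaa bbbb cccc dddd", "cccc", 19.0))
```

       The returned pair can then be used to cut or play the matching
    slice of the pause-stripped sound file.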

       Doug Cooper
    _______________________________________
    Center for Research in Computational Linguistics
    http://sealang.net http://crcl.th.net
    CRCL Inc. is a US 501(c)3 nonprofit organization


