Re: [Corpora-List] fast string replacement

From: Rob Malouf (rmalouf@mail.sdsu.edu)
Date: Fri Mar 11 2005 - 18:32:21 MET

    On Fri, 2005-03-11 at 07:28, Stefan Evert wrote:
    > If you're really interested in string replacement (probably with some
    > additional code to identify word boundaries), you should be looking at
    > finite-state transducers. Two open-source solutions I know are Helmut
    > Schmid's FST toolkit (see http://www.ims.uni-stuttgart.de/~schmid) and
    > Steve Abney's cascaded parser CASS (you'll have to search Google for
    > the source code).

    You should also consider Gertjan van Noord's FSA Utilities:

    http://grid.let.rug.nl/~vannoord/Fsa/fsa.html

    It can compile your transducers into Java or C code for portable and/or
    efficient execution.
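
    For anyone who wants to see the basic idea before picking a toolkit,
    here is a minimal sketch in plain Python. It is not how FSA Utilities
    or any of the tools above work internally, and it does not use their
    APIs. It only approximates a deterministic transducer with a character
    trie that does longest-match replacement at word boundaries. The names
    (Node, build_trie, replace_all) and the rule table are made up for
    illustration, and "word character" is assumed to mean alphanumeric or
    underscore.

```python
class Node:
    """One trie state: outgoing transitions plus an optional output string."""
    def __init__(self):
        self.next = {}
        self.out = None


def build_trie(rules):
    """Compile a {source: target} rule table (hypothetical format) into a trie."""
    root = Node()
    for source, target in rules.items():
        node = root
        for ch in source:
            node = node.next.setdefault(ch, Node())
        node.out = target
    return root


def is_word_char(ch):
    # Assumption: words consist of alphanumerics and underscores.
    return ch.isalnum() or ch == "_"


def replace_all(text, root):
    """Scan left to right, replacing the longest rule match that starts
    and ends at a word boundary; all other characters are copied through."""
    result = []
    i, n = 0, len(text)
    while i < n:
        best_end, best_out = -1, None
        # Only try to match if the previous character is not part of a word.
        if i == 0 or not is_word_char(text[i - 1]):
            node, j = root, i
            while j < n and text[j] in node.next:
                node = node.next[text[j]]
                j += 1
                # Accept only if the match also ends at a word boundary.
                if node.out is not None and (j == n or not is_word_char(text[j])):
                    best_end, best_out = j, node.out
        if best_out is not None:
            result.append(best_out)
            i = best_end
        else:
            result.append(text[i])
            i += 1
    return "".join(result)


rules = {"colour": "color", "colour scheme": "palette", "cat": "dog"}
trie = build_trie(rules)
print(replace_all("The colour scheme of the cat concatenates colours.", trie))
```

    The scan touches each input character a bounded number of times per
    start position, so the cost depends on the text and the longest rule
    rather than on the number of rules. A real transducer toolkit gives
    you the same property plus composition, context rules and compiled
    output.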

    -- 
    Rob Malouf <rmalouf@mail.sdsu.edu>
    Department of Linguistics and Oriental Languages
    San Diego State University
    


