RE: [Corpora-List] fast string replacement

From: Piao, Songlin (s.piao@lancaster.ac.uk)
Date: Fri Mar 11 2005 - 17:44:05 MET

    Hi Jörg,
     
    I have put a freely downloadable Java tool on my webpage that includes a function for this purpose:
    http://www.lancs.ac.uk/staff/piaosl/research/download/download.htm
     
    You can use it for your purpose as follows:
     
    1) Replace the commas in your rules with tabs (the program uses tabs as the separator).
    2) List your rules, with each rule on a separate line, as shown below:
      books books/v:3:pres;n:plur
      nice nice/adj
     
    3) Go to the menu "Tools" --> "Convert Codes" and click on it to open a file chooser.
    4) Choose one or more files that you want to convert.
     
    The program will then replace all matching items in the files with their corresponding substitutes.
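
    For readers who would rather script this step, here is a minimal Python sketch of the same idea. It is not the code of the Java tool: the function names (parse_rules, build_rewriter) are hypothetical, and matching on whole words only is an assumption on my part, since the tool's exact matching behaviour is not described here. The sketch reads tab-separated rules and applies them all in a single pass, so already-replaced text is not rewritten again.

```python
import re


def parse_rules(lines):
    """Parse rules of the form 'match<TAB>replacement', one rule per line."""
    rules = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        source, target = line.split("\t", 1)
        rules[source] = target
    return rules


def build_rewriter(rules):
    """Return a function that applies every rule to a string in one pass."""
    if not rules:
        return lambda text: text
    # Longest keys first, so a longer match wins over a shorter prefix.
    keys = sorted(rules, key=len, reverse=True)
    # Assumption: rules match whole words only (\b word boundaries).
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(k) for k in keys) + r")\b"
    )
    return lambda text: pattern.sub(lambda m: rules[m.group(0)], text)


rules = parse_rules(["books\tbooks/v:3:pres;n:plur", "nice\tnice/adj"])
rewrite = build_rewriter(rules)
print(rewrite("John reads nice books."))
# prints: John reads nice/adj books/v:3:pres;n:plur.
```

    Compiling all keys into one alternation keeps the cost per line roughly independent of how the rules are ordered; for very large dictionaries a dedicated multi-pattern matcher would probably be a better fit. Note that this sketch keeps the final full stop of the sentence.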
     
    Since it is a Java program, it should run under Linux.
     
    I tried it with your sample rules and sentence, and I got exactly the result you hoped for.
     
    Scott Piao

     

    ________________________________

    From: owner-corpora@lists.uib.no on behalf of js@cis.uni-muenchen.de
    Sent: Fri 11/03/2005 14:43
    To: CORPORA@hd.uib.no
    Subject: [Corpora-List] fast string replacement

    Hello,

    I am looking for a program that

    - takes as input a string (!) rewriting dictionary and a corpus
    - applies all rewriting rules to all lines of the corpus
    - is fast, stable and free
    - works under Linux

    Example:

    Some rewriting rules:

     books, books/v:3:pres;n:plur
     nice, nice/adj

    A "corpus" before transduction:

     John reads nice books.

    The same corpus after transduction:

     John reads nice/adj books/v:3:pres;n:plur

    Does anyone know such a program?

    Jörg Schuster



    This archive was generated by hypermail 2b29 : Fri Mar 11 2005 - 18:11:46 MET