Hi Marijke,
Here is a Perl module that can tell which letters need to be
removed/inserted/substituted in one word to get the other:
http://cs.haifa.ac.il/~shlomo/talks/edit_distance/slides/Brew.pm.html
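To give a rough idea of how the counting could work, here is a minimal sketch in Python. It is not the Brew module and does not use its algorithm: it assumes that the alignment produced by the standard library's difflib.SequenceMatcher is good enough, and the function names (transformations, transformation_counts) are made up for this illustration. It reports the raw edit operations in the correct -> wrong direction, so a dropped double letter appears as "r -> ()" rather than "rr -> r"; getting the latter would need extra context handling.

```python
from collections import Counter
from difflib import SequenceMatcher

# (wrong, correct) pairs taken from the example list in the question.
PAIRS = [
    ("occurence", "occurrence"),
    ("occosion", "occasion"),
    ("commputer", "computer"),
    ("live", "life"),
    ("heavie", "heavy"),
    ("geat", "great"),
    ("save", "safe"),
]


def transformations(correct, wrong):
    """Yield (correct_part, wrong_part) for every span where the words differ.

    An empty string on either side stands for an insertion or a deletion.
    Note: difflib aligns on longest matching blocks, which is not
    guaranteed to be a minimal edit-distance alignment.
    """
    matcher = SequenceMatcher(None, correct, wrong, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            yield correct[i1:i2], wrong[j1:j2]


def transformation_counts(pairs):
    """Count transformations over an iterable of (wrong, correct) pairs."""
    counts = Counter()
    for wrong, correct in pairs:
        counts.update(transformations(correct, wrong))
    return counts


if __name__ == "__main__":
    for (src, dst), n in transformation_counts(PAIRS).most_common():
        # Show empty sides as "()", as in the question's example output.
        print(f"{n}\t{src or '()'} -> {dst or '()'}")
```

For a real error list, the pairs would be read from the tab-separated file instead of being hard-coded, and a proper edit-distance alignment (as in the module above) may give cleaner results on longer words.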
Viktor
----- Original Message -----
From: "Marijke Koster" <marijke@polderland.nl>
To: <CORPORA@UIB.NO>
Sent: Friday, January 21, 2005 8:44 AM
Subject: [Corpora-List] Frequency list of transformations
Dear corpora list members,
Does anyone have a suggestion for a simple method / a script to extract
a frequency list of transformations from a list of spelling errors and
corrections?
For example, take this tab-separated list:
wrong       correct
-----       -------
occurence   occurrence
occosion    occasion
commputer   computer
live        life
heavie      heavy
geat        great
save        safe
After applying the method, the result should look something like this:
1 rr -> r
1 a -> o
1 m -> mm
2 f -> v
1 y -> ie
1 r -> ()
Thanks in advance,
Marijke Koster
______________________________________
Marijke Koster, linguistic engineer
Polderland Language & Speech Technology BV
The Netherlands
http://www.polderland.nl
Phone: +31.24.352 28 66
Fax: +31.24.352 28 60