Re: Corpora: approximations (bounds) for edit distance

From: COMP staff (csrluk@comp.polyu.edu.hk)
Date: Sat Dec 01 2001 - 00:22:23 MET

  • Next message: Bruce L. Lambert: "Re: Corpora: approximations (bounds) for edit distance"

    Hi Bruce,

    > Maybe I'm missing something, but the upper bound on edit distance between
    > two strings is always the length of the longer string, and the lower bound
    > is always zero (when the strings are identical).

    I guess it depends on the edit cost of the operations.
    For most applications, we do expect all edit costs to
    be larger than or equals to zero and the lb edit
    distance between any 2 strings is 0. However, the ub
    of any 2 strings is the length of the longer string
    is true (I guess) if insert, delete and substitution
    edit costs are 1 and 0 for correct substitution.
    If I remember correctly, that's the Levenstein
    metric but one needs to check the details for
    correctness. In the applications that I have worked,
    it is not uncommon to use edit costs are than
    the Levenstein.

    Cheers,

    Robert Luk



    This archive was generated by hypermail 2b29 : Sat Dec 01 2001 - 00:34:53 MET