Fw: Re: [Corpora-List] computing semantic word similarity

From: Сергей Крылов (krylov-58@mail.ru)
Date: Tue Nov 15 2005 - 01:50:30 MET

    -----Original Message-----
    From: Сергей Крылов <krylov-58@mail.ru>
    To: Dimitar Blagoev <gefix@pu.acad.bg>
    Date: Tue, 15 Nov 2005 03:48:23 +0300
    Subject: Re: [Corpora-List] computing semantic word similarity

    >
    >
    > I advise using STARLING.EXE.
    > See http://starling.rinet.ru
    > Let me quote a short piece from the STARHELP.DBF:
    >
    > _______________________
    >
    > SEMANTIC FUNCTIONS.
    >
    > All the semantic functions deal only with the English
    > language and are of interest mostly for comparative
    > linguists. They require the semantic database SENSE.DBF
    > (plus SENSE.VAR) which must be located together with
    > STARLING.EXE.
    > SENSE.DBF is a collection of about 7000 English
    > headwords described in terms of their semantic "attributes"
    > or "constituents" (all in all around 400). All the data was
    > extracted from existing etymological computer databases. A
    > record like
    >
    > (HEADWORD) require (V) (ITEMS) to want;to search;to be;able
    >
    > means that in several cases the meaning "require" was
    > associated with semantic "primitives" "to want", "to
    > search", "to be" and "able".
    >
    > The functions now available are the following:
    >
    > SENSE(par_C, par_L)
    >
    > This function returns the common semantic constituent(s)
    > of all the words in par_C - if any. Thus,
    >
    > SENSE("tree; bush") = "grass;root;tree"
    >
    > Note that SENSE("tree") is returned as "grass;leaf;root;
    > tree;stick;forest" and SENSE("bush") is returned as "root;
    > tree;thorn;grass;fruit".
    >
    > If a second logical parameter is passed as .T., the
    > function SENSE returns all the semantic constituents of all
    > the words constituting par_C (excluding articles,
    > prepositions and some other "empty" words). Thus,
    >
    > SENSE("tree; bush", .T.) will return
    >
    > "grass;leaf;root;tree;stick;forest;thorn;grass;fruit".
    >
    > COMMON(par_C1,par_C2)
    >
    > This function is for convenience only and is fully
    > equivalent to SENSE(par_C1+par_C2).
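
The SENSE and COMMON behaviour described above can be approximated in a few lines of Python. This is only a sketch, not STARLING's implementation: the dictionary TOY_SENSE is a hypothetical stand-in for SENSE.DBF holding just the two entries quoted in the help text, the removal of "empty" words is not modelled, and joining the two arguments of COMMON with ";" is an assumption. The all-constituents mode here de-duplicates, so its result differs from the quoted string, which lists "grass" twice.

```python
# Hypothetical stand-in for SENSE.DBF: headword -> semantic constituents.
# Only the two entries quoted in the help text are included.
TOY_SENSE = {
    "tree": ["grass", "leaf", "root", "tree", "stick", "forest"],
    "bush": ["root", "tree", "thorn", "grass", "fruit"],
}


def sense(words, all_constituents=False, db=TOY_SENSE):
    """Rough analogue of SENSE(par_C, par_L) for a ';'-separated word list."""
    headwords = [w.strip() for w in words.split(";") if w.strip()]
    lists = [db.get(w, []) for w in headwords]
    if not lists:
        return ""
    if all_constituents:
        # Union of all constituents, de-duplicated, in first-seen order.
        seen = []
        for items in lists:
            for item in items:
                if item not in seen:
                    seen.append(item)
        return ";".join(seen)
    # Constituents shared by every word, sorted alphabetically.
    shared = set(lists[0]).intersection(*lists[1:])
    return ";".join(sorted(shared))


def common(words1, words2):
    """Rough analogue of COMMON(par_C1, par_C2); the ';' join is an assumption."""
    return sense(words1 + ";" + words2)


print(sense("tree; bush"))        # grass;root;tree
print(sense("tree; bush", True))  # grass;leaf;root;tree;stick;forest;thorn;fruit
```

With a real lexical resource for French, German or Spanish in place of TOY_SENSE, the same intersection would give a crude overlap-based similarity signal.
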
    >
    > SIMILAR(par_C1, par_C2, par_C3, par_C4)
    >
    > This is a complex function taking up to four character
    > arguments. The first two are compared on the basis of the
    > SOUND function, while the last two are compared on the
    > basis of the SENSE function. The parameter par_C3 is
    > supposed to be the meaning of par_C1, and the parameter
    > par_C4 the meaning of par_C2. Thus,
    >
    > SIMILAR("hound","Hund","hound","dog") = .T.
    >
    > SIMILAR("dog","Hund","dog","dog") = .F.
    >
    > If only the first two parameters are passed, they are
    > compared merely by sound; if the first two parameters are
    > empty, the last two are compared merely by meaning. Thus:
    >
    > SIMILAR("hound", "Hand") = .T.
    > (while SIMILAR("hound","Hand","hound","hand") is of course
    > .F.)
    >
    > SIMILAR("","","hound","dog") = .T.
    > (while SIMILAR("","","hound","hand") is .F.)
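
The way SIMILAR switches between sound-only, meaning-only and combined comparison can be sketched as follows. The help text does not say how the two tests are combined when all four arguments are given; requiring both to succeed is an inference from the quoted examples. The two toy predicates are hypothetical placeholders (a consonant-skeleton match and a tiny hand-made constituent table), not STARLING's SOUND and SENSE.

```python
def toy_sound_alike(form1, form2):
    """Placeholder for a SOUND-based test: compare consonant skeletons."""
    def skeleton(word):
        return "".join(c for c in word.lower() if c.isalpha() and c not in "aeiou")
    return skeleton(form1) == skeleton(form2)


# Hypothetical constituent sets, for illustration only.
TOY_MEANINGS = {"hound": {"dog"}, "dog": {"dog"}, "hand": {"hand"}}


def toy_sense_alike(meaning1, meaning2):
    """Placeholder for a SENSE-based test: any shared constituent."""
    return bool(TOY_MEANINGS.get(meaning1, set()) & TOY_MEANINGS.get(meaning2, set()))


def similar(form1="", form2="", meaning1="", meaning2="",
            sound_alike=toy_sound_alike, sense_alike=toy_sense_alike):
    """Rough analogue of SIMILAR(par_C1, par_C2, par_C3, par_C4)."""
    have_forms = bool(form1 and form2)
    have_meanings = bool(meaning1 and meaning2)
    if have_forms and have_meanings:
        # Assumed: both the sound test and the meaning test must pass.
        return sound_alike(form1, form2) and sense_alike(meaning1, meaning2)
    if have_forms:
        return sound_alike(form1, form2)
    if have_meanings:
        return sense_alike(meaning1, meaning2)
    return False


print(similar("hound", "Hund", "hound", "dog"))  # True
print(similar("dog", "Hund", "dog", "dog"))      # False
print(similar("hound", "Hand"))                  # True
print(similar("", "", "hound", "hand"))          # False
```
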
    >
    > The function SIMILAR can now be invoked automatically
    > while EDITING 2 FILES. The files are presently supposed to
    > be standard etymological files with the fields PROTO and
    > MEANING. Pressing Left Shift + F11 while editing one file
    > will automatically issue a Locate procedure
    > equivalent to
    >
    > LOCATE FOR SIMILAR(FILE1->PROTO,FILE2->PROTO)
    >
    > Pressing Right Shift + F11 will result in issuing the
    > Locate procedure equivalent to
    >
    > LOCATE FOR SIMILAR("","",FILE1->MEANING,FILE2->MEANING)
    >
    > Note that in this case only semantic matches are searched
    > and the performance is generally slow.
    >
    > Finally, pressing Shift + F12 will invoke the Locate
    > procedure equivalent to
    >
    > LOCATE FOR SIMILAR(FILE1->PROTO, FILE2->PROTO,
    > FILE1->MEANING, FILE2->MEANING)
    >
    > By pressing F4 you can continue the search and browse
    > through the whole second file looking for possible
    > similarities.
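
The Locate-and-continue workflow described above amounts to scanning the second file for records that satisfy a predicate against the current record of the first file. Below is a minimal sketch, assuming records are plain dictionaries with PROTO and MEANING keys; the function locate and the placeholder predicate same_meaning are hypothetical names, and a SIMILAR-style test would take the predicate's place in practice.

```python
from typing import Callable, Dict, Iterable, Iterator

Record = Dict[str, str]


def locate(current: Record, second_file: Iterable[Record],
           is_similar: Callable[[Record, Record], bool]) -> Iterator[Record]:
    """Yield each record of the second file that matches the current record."""
    for candidate in second_file:
        if is_similar(current, candidate):
            yield candidate


def same_meaning(a: Record, b: Record) -> bool:
    """Placeholder predicate: exact match on the MEANING field."""
    return a["MEANING"] == b["MEANING"]


current = {"PROTO": "dog", "MEANING": "dog"}
file2 = [
    {"PROTO": "Hund", "MEANING": "dog"},
    {"PROTO": "Hand", "MEANING": "hand"},
]

hits = locate(current, file2, same_meaning)
print(next(hits)["PROTO"])       # Hund
print(next(hits, None) is None)  # True: no further match
```

Each call to next() plays the role of pressing F4: it resumes the scan from where the previous match was found.
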
    >
    >
    > ________________
    >
    > Sincerely yours,
    > Sergej A. Krylov
    >
    >
    > -----Original Message-----
    > From: "Dimitar Blagoev" <gefix@pu.acad.bg>
    > To: "CORPORA" <CORPORA@UIB.NO>
    > Date: Fri, 11 Nov 2005 22:28:43 +0200
    > Subject: [Corpora-List] computing semantic word similarity
    >
    > >
    > > Hello,
    > >
    > > Could you tell me of any methods or programs (besides distributional similarity) for computing the semantic similarity between two words in one language? I am interested not only in English but also in whether this can be done for French, German, Spanish, etc.
    > >
    > >
    > > Best regards.
    > >
    > > Dimitar Blagoev
    > > gefix@pu.acad.bg
    > > 2005-11-11
    > >
    > >
    >
    >



    This archive was generated by hypermail 2b29 : Tue Nov 15 2005 - 02:04:47 MET