Re: [Corpora-List] Request for help concerning a LSA problem

From: Petr Šimo (petr.simon@gmail.com)
Date: Fri May 05 2006 - 16:18:54 MET DST

    Cecilie,
    I suggest you try, e.g., the notorious "Human machine interface..." corpus
    from Landauer's paper An Introduction to Latent Semantic Analysis. I
    have 'tested' the tools I use, scipy (Python) and svdlibc (C), against
    it. I have also tried to reproduce the results from Ch. 15 with scipy and
    svdlibc, but both give the same results, i.e. .75 .28 ... the figures in
    the book seem strange... but I only gave it a quick look.
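
    A likely reason for the sign difference is that singular vectors are
    only determined up to sign: flipping a left singular vector together
    with its matching right singular vector leaves the product unchanged.
    The following is a minimal sketch of that check, assuming NumPy is
    available (numpy.linalg.svd rather than the scipy or svdlibc setup
    mentioned above). The matrix A is made-up illustrative data, not the
    corpus from the book or from Landauer's paper.

```python
import numpy as np

# Small made-up term-by-document matrix (illustrative counts only).
A = np.array([
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0, 1.0],
])

# Thin SVD: A = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Flip the first left singular vector AND the matching right singular vector.
U_flipped = U.copy()
Vt_flipped = Vt.copy()
U_flipped[:, 0] *= -1
Vt_flipped[0, :] *= -1

# Both factorizations reproduce A, so both are valid SVDs.
reconstructs = np.allclose(U @ np.diag(s) @ Vt, A)
same_after_flip = np.allclose(U_flipped @ np.diag(s) @ Vt_flipped, A)

# Flipping only one side of the pair does NOT reproduce A.
one_sided = np.allclose(U_flipped @ np.diag(s) @ Vt, A)

print(reconstructs, same_after_flip, one_sided)
```

    If the first two checks hold and the third fails, two tools that
    disagree only in the sign of a paired column are both giving a
    correct decomposition.
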
    Good luck
    Petr

    Cecilie Desiree Widsteen wrote:
    > Hello all,
    >
    > I'm currently trying to implement Latent Semantic Analysis, as part of
    > an automatic classification system. I'm programming in Java, and using
    > the Jama Matrix package for the matrix stuff. I have stumbled over some
    > strange problems, and would be grateful if anyone on this list could
    > offer some help.
    > My problem is: I have implemented a class which takes care of building a
    > matrix representation of a corpus, and performs SVD over the
    > term-by-document matrix. Most of the operations are done by the Jama
    > class "Matrix". This works fine, except for the fact that when I ran
    > the program over various small test corpora (like, for instance, the one
    > from Chapter 15 in Schütze and Manning's book Foundations of Statistical
    > NLP) most of the right and left singular vectors contained the correct
    > values but with wrong/reversed sign?! E.g. a vector that should have the
    > values [-0.75,-0.28,-0.20, ...] is assigned the values [0.75,0.28,
    > ...]. Unfortunately, I have limited experience with linear algebra and
    > the like, so now I find myself completely at a loss in debugging this...
    > As far as I can understand, this means that my vectors are pointing in
    > the opposite direction from the one they should, but why this is so
    > escapes my understanding :)
    > Any help, hints, tricks and the like are extremely welcome! I can also
    > send over the source code on request.
    >
    > Regards,
    > --
    > Cecilie D. Widsteen
    > Department of Linguistics
    > University of Oslo
    >
    >
    >
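
    When comparing computed singular vectors against published figures,
    it can help to compare them up to sign rather than entry by entry.
    The sketch below is plain Python with two hypothetical helper
    functions (same_up_to_sign and canonical_sign are names invented
    here, not part of Jama or any other library). The vector `expected`
    holds the leading figures quoted in the message above; `computed` is
    simply their negation, standing in for a sign-flipped result.

```python
def same_up_to_sign(a, b, tol=1e-2):
    """Return True if a and b are equal, or exact negatives, within tol."""
    if len(a) != len(b):
        return False
    equal = all(abs(x - y) <= tol for x, y in zip(a, b))
    negated = all(abs(x + y) <= tol for x, y in zip(a, b))
    return equal or negated


def canonical_sign(v):
    """Flip v, if needed, so that its largest-magnitude entry is positive."""
    pivot = max(v, key=abs)
    return [-x for x in v] if pivot < 0 else list(v)


expected = [-0.75, -0.28, -0.20]   # leading figures quoted in the message
computed = [-x for x in expected]  # stand-in for a sign-flipped result

matches = same_up_to_sign(expected, computed)
canonical_equal = canonical_sign(expected) == canonical_sign(computed)
print(matches, canonical_equal)
```

    Note that in a real decomposition a flip chosen for a left singular
    vector has to be applied to the matching right singular vector as
    well, otherwise the product no longer reproduces the original matrix.
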


