Re: Corpora: Automatic word categorisation

From: Alexander Clark
Date: Mon Nov 13 2000 - 12:39:22 MET

    At 11:18 13/11/00 +0100, Klas wrote:
    >Dear list members,
    >I am working on a dissertation in corpus linguistics and my primary field
    >of research is automatic word categorisation and classification. I have
    >conducted a search for other works in this field. I am aware of the works
    >by John Hughes and Steven Finch as well as those of H. Schutze. Do you know
    >about others interested in the same area? Any references would be
    >Yours sincerely
    >Klas Prytz

    Dear Klas,

    Brown et al. (92) and Ney et al. (94) present similar approaches based on
    maximum likelihood.
    You are familiar with Chater and Finch's work. There is also Pereira et
    al.'s work on clustering of word senses.
    I gave a paper on this topic at CoNLL '00 this year, available on-line.


    Alexander Clark

    Bibtex entries:

      AUTHOR = {Finch, S. and Chater, N.},
      TITLE = {Bootstrapping syntactic categories},
      YEAR = {1992},
      BOOKTITLE = {Proceedings of the 14th Annual Meeting of the
                      Cognitive Science Society},
      PAGES = {820-825},

      AUTHOR = {Finch, S. and Chater, N.},
      TITLE = {Bootstrapping syntactic categories using statistical
      YEAR = {1992},
      BOOKTITLE = {Background and Experiments in Machine Learning of
                      Natural Language},
      PAGES = {229-235},
      EDITOR = {Daelemans, W. and Powers, D.},
      PUBLISHER = {Tilburg University: Institute for Language
                      Technology and AI}

      AUTHOR = {Finch, S. and Chater, N. and Redington, M.},
      TITLE = {Acquiring syntactic information from distributional
      YEAR = {1995},
      EDITOR = {Levy, Joseph P. and Bairaktaris, Dimitrios and
                      Bullinaria, John A. and Cairns, Paul},
      BOOKTITLE = {Connectionist Models of Memory and Language},
      PUBLISHER = {UCL Press}

      AUTHOR = {Brown, Peter F. and Della Pietra, Vincent J. and de
                      Souza, Peter V. and Lai, Jenifer C. and Mercer,
      TITLE = {Class-based n-gram models of natural language},
      YEAR = {1992},
      VOLUME = {18},
      PAGES = {467-479},
      JOURNAL = {Computational Linguistics}

      AUTHOR = {Ney, Hermann and Essen, Ute and Kneser, Reinhard},
      TITLE = {On Structuring Probabilistic dependencies in
                      stochastic language modelling},
      JOURNAL = {Computer Speech and Language},
      YEAR = {1994},
      VOLUME = {8},
      PAGES = {1-28}

      AUTHOR = {Pereira, Fernando and Tishby, Naftali and Lee,
      TITLE = {Distributional Clustering of {English} words},
      YEAR = {1993},
      BOOKTITLE = {Proceedings of the 31st annual meeting of the
                      {Association for Computational Linguistics}}

      AUTHOR = {Clark, Alexander},
      TITLE = {Inducing Syntactic Categories by Context
                      Distribution Clustering},
      PAGES = {91-94},
      YEAR = {2000},
      BOOKTITLE = {Proceedings of CoNLL-2000 and LLL-2000},
      ADDRESS = {Lisbon, Portugal}
