[Corpora-List] Natural Language Toolkit: NLTK-Lite version 0.6.5 released

From: Steven Bird (sb@csse.unimelb.edu.au)
Date: Mon Jul 10 2006 - 13:35:48 MET DST

    NLTK, the Natural Language Toolkit, is a suite of Python libraries and
    programs for natural language processing. NLTK-Lite version 0.6.5 has
    been released and can be downloaded from http://nltk.sourceforge.net/

    CONTENTS

    Software Modules: corpus readers, tokenizers & stemmers, taggers
    (regexp, n-gram, backoff, Brill, HMM), parsers (recursive descent,
    shift-reduce, chart, probabilistic, ...), clusterers (EM, k-means,
    ...), probability distributions, chatbots, demonstrations, ...

    Corpora and Corpus Samples: Brown Corpus, CMU Pronunciation
    Dictionary, CoNLL-2000, Genesis, Gutenberg, IEER, Presidential
    Addresses, Names, PP-Attachment, Senseval 2, TIMIT, Treebank, Words

    Documentation: Tutorials and exercises (190pp), API documentation for
    all software modules, installation instructions for Windows, Mac,
    Unix.

    ChangeLog for Version 0.6.5 2006-07-09

    * Code:
      - improvements to shoebox module (Stuart Robinson, Greg Aumann)
      - incorporated feature-based parsing into core NLTK-Lite
      - corpus reader for Sinica treebank sample
      - new stemmer package
    * Contrib:
      - hole semantics implementation (Peter Wang)
      - incorporated YAML
      - new work on feature structures, unification, lambda calculus
      - new work on shoebox package (Stuart Robinson, Greg Aumann)
    * Corpora:
      - Sinica treebank sample
    * Tutorials:
      - expanded discussion throughout, incl: left-recursion, trees, grammars,
        feature-based grammar, agreement, unification, PCFGs,
        baseline performance, exercises, improved display of trees

    -Steven Bird


