[Corpora-List] interactive alignment interfaces

From: Joerg Tiedemann (tiedeman@let.rug.nl)
Date: Thu Jul 28 2005 - 12:40:13 MET DST

    I have a prototype of an interactive sentence aligner and another one
    for interactive word alignment using web-based interfaces. They are
    part of Uplug (http://sourceforge.net/projects/uplug/) and you may try
    them with small sample texts at

    http://www.let.rug.nl/~tiedeman/__align_test/isa.php
    (interactive sentence aligner)
    http://www.let.rug.nl/~tiedeman/__align_test/ica.php
    (interactive word aligner)

    (Please be nice to our server! I may take the scripts down if it
    gets too busy, but I thought I would risk putting them online for
    a short while. Alignment runs are limited to 5 seconds via ulimit,
    so don't be confused if one times out.)

    Current features of the sentence aligner:

    - automatic alignment with Gale & Church's approach
      (click on 'align')
    - a simple cognate filter to add hard boundaries
      (press the 'cognates' button)
    - adding hard boundaries using structural (XML) tags
      (there are no useful ones in the test corpus)
    - manually adding/removing hard boundaries
      (click on corresponding sentences)
    - saving/sending results in 3 different formats
      (XCES with external pointers, simple TMX, plain text)
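
    The automatic step and the role of hard boundaries can be
    illustrated with a minimal Python sketch. It is not the Uplug
    code: the function names (bead_cost, align,
    align_with_hard_boundaries) are made up for illustration, the
    parameter values are the ones commonly quoted for Gale & Church's
    length-based method and should be treated as assumptions, and hard
    boundaries are modelled simply as points at which both texts are
    split and the pieces aligned separately.

```python
import math

# Bead types (n source sentences, m target sentences) and their priors.
# Assumed values, as commonly quoted for Gale & Church's method.
PRIORS = {(1, 1): 0.89, (1, 0): 0.0099, (0, 1): 0.0099,
          (2, 1): 0.089, (1, 2): 0.089, (2, 2): 0.011}
C = 1.0   # assumed expected number of target characters per source character
S2 = 6.8  # assumed variance of that ratio


def bead_cost(len_src, len_tgt, prior):
    """Negative log probability of pairing len_src with len_tgt characters."""
    mean = (len_src + len_tgt / C) / 2.0
    if mean == 0:
        return -math.log(prior)
    delta = (len_tgt - len_src * C) / math.sqrt(mean * S2)
    # Two-sided tail probability of a standard normal at |delta|.
    tail = math.erfc(abs(delta) / math.sqrt(2.0))
    return -math.log(max(prior * tail, 1e-300))


def align(src, tgt):
    """Dynamic programming over bead types; returns (src_idx, tgt_idx) beads."""
    n, m = len(src), len(tgt)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for (di, dj), prior in PRIORS.items():
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                len_src = sum(len(s) for s in src[i:ni])
                len_tgt = sum(len(t) for t in tgt[j:nj])
                c = cost[i][j] + bead_cost(len_src, len_tgt, prior)
                if c < cost[ni][nj]:
                    cost[ni][nj] = c
                    back[ni][nj] = (i, j)
    # Follow the back pointers from the end of both texts to the start.
    beads = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        beads.append((list(range(pi, i)), list(range(pj, j))))
        i, j = pi, pj
    beads.reverse()
    return beads


def align_with_hard_boundaries(src, tgt, boundaries):
    """Align the stretches between hard boundaries independently.

    `boundaries` is a sorted list of (i, j) pairs meaning that src[i] and
    tgt[j] are known to start a new segment (hypothetical representation).
    """
    points = [(0, 0)] + list(boundaries) + [(len(src), len(tgt))]
    beads = []
    for (i0, j0), (i1, j1) in zip(points, points[1:]):
        for s_idx, t_idx in align(src[i0:i1], tgt[j0:j1]):
            beads.append(([i0 + k for k in s_idx], [j0 + k for k in t_idx]))
    return beads
```

    Each hard boundary cuts the search into smaller independent
    problems, so a boundary that is known to be correct both speeds up
    the alignment and stops errors from spreading past it.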

    Current features of the word aligner:

    - uses the Uplug clue aligner (with existing clue collections)
    - alignment visualization (matrix style)
    - select clue types and weights
    - add/remove links (click on corresponding cells in the matrix)
    - open/search the clue collections
    - save alignment results (disabled in the demo)
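
    The matrix view and the add/remove interaction can be pictured
    with a small Python sketch. It only illustrates the idea and is
    not taken from the web interface: toggle_link and render_matrix
    are hypothetical names, links are assumed to be stored as
    (source index, target index) pairs, and the matrix is drawn as
    plain text rather than as clickable cells.

```python
def toggle_link(links, i, j):
    """Return a new link set with (i, j) added if absent, removed if present.

    This mimics clicking on one cell of the alignment matrix.
    """
    return set(links) ^ {(i, j)}


def render_matrix(src_tokens, tgt_tokens, links):
    """Draw the alignment as text: one row per source token, one column
    per target token, 'x' for a link and '.' for no link."""
    width = max((len(tok) for tok in src_tokens), default=0)
    rows = []
    for i, tok in enumerate(src_tokens):
        cells = " ".join("x" if (i, j) in links else "."
                         for j in range(len(tgt_tokens)))
        rows.append(tok.ljust(width) + " " + cells)
    return "\n".join(rows)


if __name__ == "__main__":
    src = ["the", "house"]
    tgt = ["das", "Haus"]
    links = {(0, 0)}
    links = toggle_link(links, 1, 1)  # add the link house-Haus
    print(render_matrix(src, tgt, links))
```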

    More information can be found at
    http://www.let.rug.nl/~tiedeman/__align_test/doc/isa.html
    http://www.let.rug.nl/~tiedeman/__align_test/doc/ica.html

    I'm looking forward to getting your feedback!

    best,

    Jörg Tiedemann

    ***********/\/\/\/\/\/\/\/\/\/\/\************************************
    ** Jörg Tiedemann tiedeman@let.rug.nl **
    ** Alfa-Informatica http://www.let.rug.nl/~tiedeman **
    ** Rijksuniversiteit Groningen Harmoniegebouw, room 1311-429 **
    ** Oude Kijk in 't Jatstraat 26 phone: +31 (0)50-363 5935 **
    ** 9712 EK Groningen fax: +31 (0)50-363 6855 **
    *************************************/\/\/\/\/\/\/\/\/\/\/\**********


