Re: [Corpora-List] Custom tagging validator

From: Jin-Dong Kim (jdkim@is.s.u-tokyo.ac.jp)
Date: Mon Nov 07 2005 - 03:24:57 MET


    Dear Przemek,

    If you only want to check that the file conforms to XML syntax (e.g.,
    that every tag is closed and no tag boundaries cross), you can run a
    well-formedness test, which needs no DTD.
    My favorite tool for well-formedness testing is xmlwf.
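
    For readers without xmlwf at hand, the same kind of check can be
    sketched with Python's standard library. This is only an illustration,
    not how xmlwf works: the function name check_well_formed and the
    wrapper element "doc" are assumptions introduced here. The wrapper is
    there because XML requires a single root element, while a student file
    may be a flat run of tagged spans. It also assumes the file carries no
    XML declaration of its own.

```python
import xml.etree.ElementTree as ET


def check_well_formed(text, root="doc"):
    """Return (True, None) if text parses as XML, else (False, message).

    `root` is a hypothetical wrapper element name, added so that a flat
    sequence of tagged spans still has the single root XML requires.
    """
    wrapped = "<{0}>{1}</{0}>".format(root, text)
    try:
        ET.fromstring(wrapped)
    except ET.ParseError as err:
        # The message includes a line and column in the wrapped text.
        return False, str(err)
    return True, None


if __name__ == "__main__":
    # An unclosed tag is reported as not well-formed.
    print(check_well_formed("<np>the cat sat"))
```

    A mistyped closing tag, a crossed boundary, or a stray "<" or "&" in
    the text would all be reported as parse errors by this check.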

    If you want to check that the XML file contains only valid tag names,
    attributes, or values, you need to define a DTD for the file and
    validate the file against it.
    My favorite tool for validation is xmllint.
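
    Where writing a DTD feels like too much for a flat scheme of 10-12
    tags, a much simpler stand-in is to compare the tag names against an
    allowed list. The sketch below does only that; it is not DTD
    validation and checks neither attributes nor values. The tag names in
    ALLOWED_TAGS, the function name find_unknown_tags, and the wrapper
    element "doc" are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag set; replace with the names from the real scheme.
ALLOWED_TAGS = {"np", "vp", "err"}


def find_unknown_tags(text, allowed=ALLOWED_TAGS, root="doc"):
    """Return a sorted list of tag names in text that are not in `allowed`.

    Raises xml.etree.ElementTree.ParseError if text is not well-formed,
    so a well-formedness check should pass first.
    """
    tree = ET.fromstring("<{0}>{1}</{0}>".format(root, text))
    unknown = set()
    for element in tree.iter():
        # Skip the wrapper element itself; it is not part of the file.
        if element is not tree and element.tag not in allowed:
            unknown.add(element.tag)
    return sorted(unknown)


if __name__ == "__main__":
    # "vpp" is a misspelling of the allowed tag "vp".
    print(find_unknown_tags("<np>the cat</np> <vpp>sat</vpp>"))
```

    XML tag names are case-sensitive, so "NP" would be reported as unknown
    when only "np" is allowed.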

    I believe open source implementations of both tools are available.
    I use the ones that ship with Cygwin.

    Best Regards,

    Jin-Dong

    On 11/6/05, Przemek Kaszubski <przemka@amu.edu.pl> wrote:
    > Dear Members,
    >
    > I'm looking for a flexible tool that would validate files tagged by my
    > students. The tags follow the <tag>tagged_text</tag> convention but are
    > not linked to any DTD, and entirely my own. I'd like to be able to test
    > quickly if my students spelled the tag names correctly, closed the tags,
    > applied the < and > symbols, etc. The tagging scheme is simple (something like
    > 10-12 tags in all), with no embedding or special properties.
    >
    > Does anyone know of a tool or script of this kind, or perhaps developed one?
    >
    > Thank you for any help,
    >
    > Przemek
    >
    >
    > --
    > Dr Przemyslaw Kaszubski
    > +48 61 8293515
    > http://elex.amu.edu.pl/ifa/staff/kaszubski.html
    >
    > PICLE LEARNER CORPUS ONLINE:
    > http://www.staff.amu.edu.pl/~przemka/picle.html
    >
    > COMPREHENSIVE CORPORA BIBLIOGRAPHY:
    > http://www.staff.amu.edu.pl/~przemka
    >
    > MY SEMINARS:
    > http://www.staff.amu.edu.pl/~przemka/seminars.htm
    >
    > ACADEMIC WRITING PAGE (FULL-TIME PROGRAMME):
    > http://www.staff.amu.edu.pl/~przemka/IFA_writing
    >
    > =======================================
    > School of English (IFA)
    > Adam Mickiewicz University
    > http://elex.amu.edu.pl/ifa
    > =======================================
    >
    >
    >
    >


