Re: [Corpora-List] European Constitution in parallel

From: Joerg Tiedemann (tiedeman@let.rug.nl)
Date: Thu Apr 28 2005 - 19:08:00 MET DST

    Thanks for your reply.
    I changed the encoding for the CWB indices to ISO-8859-13. I hope it
    worked. Maybe you could take a quick look at the OPUS query interface
    if you have some time.
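
    A quick way to check a candidate 8-bit encoding against the UTF-8
    source is to list the characters it cannot represent. The following is
    only a minimal Python sketch, not what was actually run for OPUS: the
    function name `unencodable` and the sample string are made up for
    illustration, and `iso8859_4` / `iso8859_13` are Python's standard
    codec names for the two encodings discussed in this thread.

```python
def unencodable(text: str, codec: str) -> list[str]:
    """Return the distinct characters of `text` that `codec` cannot represent."""
    missing = []
    for ch in text:
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            # Record each problematic character once, in order of first appearance.
            if ch not in missing:
                missing.append(ch)
    return missing


if __name__ == "__main__":
    # Hypothetical sample; replace it with text read from the actual corpus files.
    sample = "„Lietuvos Respublika“ ąčęėįšųūž āčēģīķļņšūž"
    for codec in ("iso8859_4", "iso8859_13"):
        print(codec, unencodable(sample, codec))
```

    An empty list for a codec means the whole sample survives the
    conversion. Anything listed would have to be replaced or dropped
    before building the index.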

    Thanks for your help!

    Jörg

    ***********/\/\/\/\/\/\/\/\/\/\/\************************************
    ** Jörg Tiedemann tiedeman@let.rug.nl **
    ** Alfa-Informatica http://www.let.rug.nl/~tiedeman **
    ** Rijksuniversiteit Groningen Harmoniegebouw, room 1311-429 **
    ** Oude Kijk in 't Jatstraat 26 phone: +31 (0)50-363 5935 **
    ** 9712 EK Groningen fax: +31 (0)50-363 6855 **
    *************************************/\/\/\/\/\/\/\/\/\/\/\**********

    On Mon, 25 Apr 2005, Andrius Utka wrote:

    > Dear Joerg,
    > As far as I know Lithuanian uses ISO 8859-13. Not sure about Latvian.
    > Best,
    > Andrius
    >
    > >
    > >follow-up ....
    > >
    > >I just realized that there are some additional problems with character
    > >encodings. Latvian and Lithuanian should be supported by
    > >ISO-8859-4 according to information I found. However, I got serious
    > >trouble when converting from UTF-8 to ISO for these languages. Did the
    > >alphabet change recently or is the ISO standard just useless?
    > >
    > >Now, I changed the Latvian and Lithuanian texts from the EUconst corpus
    > >to UTF-8 in the CWB index. Looks good but is difficult to query for
    > >diacritics. Check:
    > >http://logos.uio.no/cgi-bin/opus/opuscqp.pl?corpus=EUconst;lang=lt
    > >http://logos.uio.no/cgi-bin/opus/opuscqp.pl?corpus=EUconst;lang=lv
    > >
    > >Let me know if there is an 8-bit code that can be (is) used for these
    > >two languages.
    > >
    > >
    > >Jörg



    This archive was generated by hypermail 2b29 : Thu Apr 28 2005 - 19:36:34 MET DST