[Corpora-List] Codings for corpus files to be used in ParaConc

From: José Manuel Martínez Martínez (pitragoras@yahoo.es)
Date: Mon Jun 19 2006 - 17:44:12 MET DST


    Dear colleagues,

    I'm compiling a corpus of European Parliament speeches, and I have
    found that the names of MEPs from Eastern European countries are
    displayed incorrectly when we load the texts in ParaConc. The same
    happens with accented characters in Spanish texts. We saved our files
    as .txt using the Unicode (UTF-8) encoding. When we instead save the
    texts using the Western European (ISO) encoding, the problems with
    Spanish disappear, but not those with Eastern European languages.
    Does anybody know of a single encoding we can use for every language
    spoken in the European Parliament?
    Thank you very much.
    Best regards,

    José Manuel
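
The symptoms described above can be reproduced outside ParaConc with a minimal Python sketch. It assumes, without confirmation from the message, that the tool reads the files as a single-byte code page; the sample names and the helper `encodable` are hypothetical and chosen only for illustration. The sketch checks which encodings can represent each name and shows what UTF-8 bytes look like when misread as ISO-8859-1.

```python
# Hypothetical sample names, chosen only for illustration.
samples = ["Martínez", "Muñoz", "Wałęsa"]

# Assumed candidate encodings: Western European (ISO-8859-1),
# Central/Eastern European (ISO-8859-2), and UTF-8.
encodings = ("iso-8859-1", "iso-8859-2", "utf-8")


def encodable(text, encoding):
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Report, for each name, which encodings can store it without loss.
for name in samples:
    print(name, {enc: encodable(name, enc) for enc in encodings})

# UTF-8 bytes misread as ISO-8859-1: each accented letter turns into
# two unrelated characters, which is one way such display errors arise.
garbled = "Martínez".encode("utf-8").decode("iso-8859-1")
print(repr(garbled))
```

In this sketch neither single-byte ISO encoding covers all three names, while UTF-8 covers every one of them. Whether a given concordancer can read UTF-8 input is a separate question that depends on the tool and its version.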

                    


