[Corpora-List] Summary: Sorting upper-ASCII chars in Unix

From: William Fletcher (fletcher@usna.edu)
Date: Sun Nov 30 2003 - 17:52:44 MET


    Last week I posted a query on how to sort the full range of 8-bit
    character codes (0 through 255, including "upper-ASCII") under Unix.

    Many thanks to Marco Baroni, Vlado Keselj, Arne Fitschen, Serge Heiden,
    Gertjan van Noord and Lisa Becktel for helping solve this. Serge's
    explanation of the issues was very enlightening, and Arne's man page
    (much more explicit than the one on our system) clarifies why the best
    solution is to set LC_ALL to C or POSIX as follows:

        setenv LC_ALL C

    OR

        setenv LC_ALL POSIX

    Both yield a straight byte-value sort for all codes 0 through 255.
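
    The setenv lines above are csh/tcsh syntax. For Bourne-style shells
    (sh, bash, ksh), a minimal sketch of the equivalent setting, plus a
    quick check of the resulting order, might look like the following.
    The three sample lines and the variable name sorted_hex are made up
    for illustration; the output is dumped as hex bytes so the result
    does not depend on how the terminal displays code 233.

```shell
# Bourne-shell equivalent of `setenv LC_ALL C`
LC_ALL=C
export LC_ALL

# Illustrative sample input, one character per line:
#   'a'  (byte 0x61), 'B' (byte 0x42), and byte 0xE9 (octal 351, above 127)
# Sort it, then dump the sorted bytes as hex with spaces and newlines removed.
sorted_hex=$(printf 'a\nB\n\351\n' | sort | od -An -tx1 | tr -d ' \n')

# With LC_ALL=C the lines come out in byte-value order: 42, 61, e9
# (each followed by the newline byte 0a).
echo "$sorted_hex"
```

    To affect a single command only, the variable can also be set on the
    command line itself, as in "LC_ALL=C sort file", without changing the
    rest of the session.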

    Again thanks to all for preventing further frustration with Unix!

    Bill Fletcher


