Re: [Corpora-List] WordSmith

From: Klaus Guenther (klaus@capitalfocus.org)
Date: Sat May 28 2005 - 22:54:37 MET DST

    Hi Li-chin,

    As you probably know, WordSmith accepts UTF-8 encoded texts, so would
    that be an option for you? Then you shouldn't have trouble with odd
    characters. Otherwise you could open your texts in Microsoft Word or
    a text editor (e.g., vim or emacs) and convert them to UTF-8 or
    ASCII. It would also be possible to write a Perl or PHP script
    that quickly converts all your texts. (The relevant PHP function is
    utf8_decode($string), provided by the xml extension; note that it
    converts UTF-8 to ISO-8859-1 rather than to strict ASCII.)
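
    As a rough sketch of such a conversion script (in Python here rather
    than Perl or PHP), the following re-encodes a file as UTF-8. It
    assumes the files were saved by Notepad in Windows-1252, which is
    only a guess about your data; the function names and the file names
    in the usage note are made up for illustration. The second helper
    replaces curly quotes with plain ASCII ones, in case those are what
    show up as " ? ".

```python
from pathlib import Path

# Typographic quotes that often turn into "?" when the encoding is lost.
CURLY_TO_STRAIGHT = {
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark / apostrophe
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
}


def convert_to_utf8(src, dst, source_encoding="cp1252"):
    """Read src in source_encoding (assumed, not detected) and write dst as UTF-8.

    Raises UnicodeDecodeError if the file is not valid in source_encoding.
    """
    text = Path(src).read_bytes().decode(source_encoding)
    Path(dst).write_bytes(text.encode("utf-8"))
    return text


def straighten_quotes(text):
    """Replace curly quotes with their plain ASCII equivalents."""
    return text.translate(str.maketrans(CURLY_TO_STRAIGHT))
```

    For example, convert_to_utf8("interview.txt", "interview_utf8.txt")
    would write a UTF-8 copy next to the original. If the real encoding
    is something else, pass it as source_encoding.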

    Hth,

    Klaus

    Klaus Guenther
    University of Freiburg, Germany

    On 5/28/2005 10:23 PM, sara chen spake the following words:

    > Hi Klaus,
    >
    > Thank you very much for your prompt reply!
    >
    > Unfortunately, my data are not tagged and I'm not planning to tag
    > them unless it's necessary. If it is, is there any way I can tag my
    > data automatically in order to locate those " ? "?
    >
    > OK, some concordance lines don't actually include any " ? ": the " '
    > " in the original data becomes " ? ", so those lines appear in my
    > search results. Of course, there are accurate concordance lines too,
    > but I need to pick them out manually. Weird!
    >
    > Li-chin
    >
    >
    >
    > Klaus Guenther <klaus@capitalfocus.org> wrote:
    >
    > Hi Li-chin,
    >
    > What would the character encoding be? And what special characters?
    >
    > If your original data is tagged, you should be able to search for
    > the question mark tag. I've not had that much trouble doing what I
    > needed to do. (The only exception was that it doesn't offer
    > regular expression support...)
    >
    > Klaus
    >
    > On 5/28/2005 8:59 PM, sara chen spake the following words:
    >
    >> Hi everyone,
    >>
    >> I wonder if any WordSmith expert can help me with a few
    >> questions.
    >>
    >> 1) How can I avoid producing strange codes when transferring
    >> my original text data into the .txt files that WordSmith
    >> recognizes? I copied the original text into Notepad and then
    >> saved it as a .txt file.
    >>
    >> 2) How can I search for the question mark "?" in my data with
    >> WordSmith? I used Concord to search for it, but some strange codes
    >> came out.
    >>
    >> Many thanks
    >>
    >> Li-chin


