Scanning

j.b.johannessen@ilf.uio.no
Wed, 10 Jul 1996 13:29:52 +0100

I would like to know whether anybody has experience with Omnipage Pro or
can recommend other programs for scanning which might be useful for
languages using versions of the Latin alphabet with diacritics.

I am engaged in a project that aims at putting together a Bosnian
electronic text corpus. Most of the texts will be scanned. We have tried
Omnipage Pro, which can be trained to a certain extent, but it nevertheless
has problems recognizing the characters that we have tried to predefine.

1. It often interprets diacritics as characters separate from the letters
they belong to. E.g., even if we have predefined c-with-a-hook, the program
will often interpret the hook as a separate symbol from c during scanning.

2. It seems impossible to use diacritics as part of a resulting font. This
means that e.g. c-with-a-hook, to the extent that the program recognizes it
at all, must be transliterated as something like c-dollar-sign, which in
turn must be replaced by the proper alphabet character c-with-a-hook in a
separate word processing program.
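
The replacement step described in point 2 could also be scripted rather than
done by hand in a word processor. The following is only a minimal sketch under
stated assumptions: the placeholder pairs (such as "c$" for c-with-a-hook), the
function name restore_diacritics, and the idea that a detached hook appears in
the output as a spacing caron right after its letter are all hypothetical
illustrations, not a description of what Omnipage Pro actually produces.

```python
import unicodedata

# Hypothetical placeholder scheme (an assumption for illustration):
# a base letter followed by "$" stands for that letter with a hook (caron).
PLACEHOLDERS = {
    "c$": "č", "C$": "Č",
    "s$": "š", "S$": "Š",
    "z$": "ž", "Z$": "Ž",
}

SPACING_CARON = "\u02c7"    # standalone hook character
COMBINING_CARON = "\u030c"  # hook that attaches to the preceding letter


def restore_diacritics(text: str) -> str:
    """Turn placeholder sequences and detached hooks into single letters."""
    # Replace each placeholder pair with the proper letter.
    for placeholder, letter in PLACEHOLDERS.items():
        text = text.replace(placeholder, letter)
    # Assumption: a detached hook directly follows the letter it belongs to.
    text = text.replace(SPACING_CARON, COMBINING_CARON)
    # NFC composes letter + combining hook into one precomposed character.
    return unicodedata.normalize("NFC", text)


if __name__ == "__main__":
    print(restore_diacritics("c$as$a"))
```

A placeholder such as "$" is only safe if it never occurs in the source texts
themselves, so the mapping would need to be checked against the corpus first.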

Does anybody have any experience using Omnipage or other scanning programs
for texts having this type of alphabet? We would be very pleased to hear
about it.

Janne Bondi Johannessen.

---------------------------------------------------------------------------
Janne Bondi Johannessen        Tel: +47-22 85 68 14
The Text Laboratory            E-mail: jannebj@hedda.uio.no
Department of Linguistics      Fax: +47-22 85 69 19
University of Oslo
P.O. Box 1102 Blindern
N-0317 Oslo, Norway
---------------------------------------------------------------------------