RE : Windows concordance programs

Lisa Biagini (lisa@ilc.pi.cnr.it)
Mon, 15 Jul 1996 10:20:00 +0200

In Pisa, at the Istituto di Linguistica Computazionale, a query system, DBT,
has been developed specifically for linguistic and literary text
processing and analysis tasks.

It is now a complex system consisting of many separate modules, some of very
general purpose, others designed to handle the particular requirements of
specific applications. Some of these modules are now fully tested and
industrialised, whereas others are still in the development and testing stage.

One of these components is the program DBTConco, which produces concordances
of a text or of selected words in it. More information (and I think also
the program) can be obtained by writing to Dott. Eugenio Picchi
(picchi@ilc.pi.cnr.it) or visiting the web site of the Istituto di
Linguistica Computazionale (http://www.ilc.pi.cnr.it).
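
For readers unfamiliar with concordances, here is a minimal keyword-in-context
sketch in Python. It is not DBTConco's code or algorithm, which I cannot show
here; the function name, the context width and the sort order are all
assumptions made for illustration only.

```python
import re

def concordance(text, keyword, width=30):
    """Return (left, match, right) contexts for each whole-word hit.

    Hypothetical helper, not part of DBT. Hits are sorted by their
    right-hand context, one common ordering for printed concordances.
    """
    # Match the keyword as a whole word, ignoring case.
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    hits = []
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        hits.append((left, m.group(0), right))
    hits.sort(key=lambda h: h[2].lower())
    return hits

sample = ("Nel mezzo del cammin di nostra vita "
          "mi ritrovai per una selva oscura, "
          "che la diritta via era smarrita.")

# Print each hit with the keyword aligned in a central column.
for left, word, right in concordance(sample, "via", width=20):
    print(f"{left:>20} [{word}] {right}")
```

Only the whole word "via" is matched here, so "vita" is not reported as a hit.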

The main features of the core system are: total respect for the integrity
of the source text; management of different character sets (Latin and non-Latin
alphabets) and code sets; real-time, interactive execution of all the
typical functions of a text retrieval system; high performance in terms of
flexibility and speed; optimisation of storage and memory requirements;
management of very large text corpora; management and analysis of images in
a text or associated with it (manuscripts, icons, etc.); management of
structured text, in particular dictionaries; management of tagged and
lemmatized text.

Other components have been designed to run on top of the core system for the
management of annotated texts and data for the POS tagging of Italian
texts, and for the creation, management and interrogation of bilingual
parallel and comparable text archives.

When using the system, the first step is to format the texts using the DBT
procedures. DBT has its own encoding system, and this stage is simple, rapid
and, once a few preliminary instructions have been given, fully automatic.
The system also has an interface which permits it to acquire textual
material already encoded in SGML-TEI and in HTML.

Once the texts are in DBT format, users can use the system's query
procedures, which provide a series of search functions that can be
used to access a single text or a text corpus and retrieve various elements
or combinations of elements. They can display all or part of the text(s) on
which they are operating, search for given word forms, search for words containing
one or a combination of given character strings or using wild cards, compute
frequencies, define search functions in which words are associated in
different ways and retrieve all the contexts satisfying these search
conditions in the text(s), generate ordered text concordances, impose
particular conditions on concordance generation, select the language of
interest when several languages or language types are present in the text(s)
and have been classified as such, obtain statistical data on the
cooccurrences of selected words (e.g. Mutual Information Index), execute
queries on annotated and lemmatised material, etc.
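
As an illustration of the co-occurrence statistics mentioned above, the
following Python sketch computes a mutual information score for a word pair.
It assumes the common pointwise definition, log2(f(x,y) * N / (f(x) * f(y))),
counted over a fixed window of following tokens. The exact formula and window
used by DBT are not given here, so the function name and parameters are
hypothetical.

```python
import math
from collections import Counter

def mutual_information(tokens, x, y, window=5):
    """Pointwise mutual information of y occurring within `window` tokens after x.

    Hypothetical helper, not DBT's implementation. Returns None when the
    pair never co-occurs, since the logarithm is undefined in that case.
    """
    n = len(tokens)
    freq = Counter(tokens)
    # Count how often y appears in the window that follows each x.
    pair = 0
    for i, tok in enumerate(tokens):
        if tok == x:
            pair += tokens[i + 1:i + 1 + window].count(y)
    if pair == 0:
        return None
    return math.log2(pair * n / (freq[x] * freq[y]))

tokens = "selva oscura e selva oscura e via smarrita".split()
score = mutual_information(tokens, "selva", "oscura", window=1)
print(score)
```

A positive score suggests that the two words occur together more often than
their individual frequencies would predict; on very small samples such as this
one the value is not statistically meaningful.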

Another important feature of the system is that it provides procedures to
manage images included in a text. An image database is associated with the
text; the user can access this database dynamically and view the images. The
way in which the results of the queries are displayed on the screen or
printed out can be defined by the user, according to their own particular
needs and preferences. The system is extremely versatile and easy to use.

You can download a demo version of the DBT core system that runs on the
"Divina Commedia" from our Web site:
http://www.ilc.pi.cnr.it/dbt/pisystem.html

Sorry, but it is all in Italian. We are in the process of rewriting all the
documentation in English.

On our home page, choose "Alcune installazioni
DEMO del sistema di interrogazione DBT" (Some DEMO programs of the DBT query
system).
Then click on "WINDBT & 'La Divina Commedia'": the
demo program will be downloaded to your computer. At this point you just have
to run the file (dbtw_div.exe): the DBT demo program will be automatically
installed on your system.

Please do not hesitate to contact me if you require any further information.

Lisa