News archive

Fall 2006

New fellowship:
Mojca Stritar (University of Ljubljana): KUST: A Slovene learner's corpus. (October 2006 - January 2007). REPORT
The overall aim of my project is the theoretical foundation of KUST, the Slovene Learner Corpus (SLC). During the Marie Curie fellowship at BATMULT, two major scientific challenges have been faced: the development of a reasonable set of criteria for the collection and selection of learner material to be included in KUST, and the development of an appropriate error tagging system. The main aim was to digitize and tag the material to compile a pilot learner corpus of Slovene based on texts written by learners at different levels of competence and with different first languages. The purpose of the pilot corpus was to check, and if necessary redefine, the criteria for the collection, selection and documentation of learner materials, to develop and test mark-up conventions and the error tagging principles, and finally, to show some possibilities for the use of such corpora for language description, analysis and teaching. BATMULT thus offered me a unique insight into the Norwegian learner corpus ASK and the application of some of its solutions to the Slovene language situation.

Spring 2006

New fellowship:
Pavel Vondricka (Charles University, Prague): An object-relational dictionary model and its application to Norwegian nouns. (January - October 2006). REPORT
The research project aims at a lexical description of Norwegian nouns, both monolingually and in a contrastive analysis with other languages. The background of this research is the desirability of reusable lexical resources supporting natural language processing (NLP). While traditional printed dictionaries are large and detailed, a lot of the information contained in them is implicit, cannot easily be digested by computers, and does not reflect all the contextual information of the words. In the context of the project, a system was designed and implemented which is more general than language-specific dictionary editors, but less general than formal systems used in NLP. This system has been tested on the Norwegian nouns. The goal of the system is to provide a description of the morphological structure (and syntactic structure in the case of multi-word entries) of the lemmas together with information about their syntactic behavior, collocability and lexical semantic relations of their senses. In addition, the system was designed to handle wide language variability and detailed usage attributes for different variants, important factors that are often underestimated in human-oriented dictionaries and almost completely ignored by most NLP projects.

Fall 2005

New fellowship:
Ron van Kesteren (University of Nijmegen): Prelexical language cues in bilingual visual word recognition. (September 2005 - January 2006). REPORT
The project aims at clarifying the role of sublexical language cues in the word recognition process of bilinguals. It investigates an issue that currently receives a lot of international attention, namely whether bilingual readers are able to modify their word identification process on the basis of their expectations or the characteristics of the words they read. Currently available studies with respect to this issue are ambiguous as to whether this is the case.
If the reading process of bilinguals is sensitive to language-specific markers in the input, the presence of a letter such as 'å', which only occurs in Norwegian, might help Norwegian-English bilinguals to speed up their recognition process by limiting their word search to Norwegian words. The presence of bigrams that are normal in Norwegian, but very infrequent in English, such as 'hv' in 'hvit', might have the same effect. During the stay in Bergen, two kinds of experiments are conducted: a visual language decision task and a lexical decision task with Norwegian and English words, performed by Norwegian-English bilinguals. These experiments are designed to answer the question whether the presence of language-specific letters and/or bigrams influences reaction times and whether this effect is caused by modifications to the word recognition process.

Fall 2004

New fellowship:
Jana Zemljaric (University of Ljubljana): Building a Slovenian spoken corpus. (September - December, 2004). REPORT
(photograph by Koenraad de Smedt)
There are two main challenges in this PhD project aimed at building a Slovenian spoken corpus. The first is the development of a set of criteria for the collection and selection of spoken material to be included in a balanced non-opportunistic corpus. The second is developing guidelines for transcription of the spoken materials. These problems will be studied in a cross-linguistic comparison with materials and methods developed in Bergen and elsewhere, and accessible at BATMULT. About 100 minutes of recordings of Slovene will be used as test materials. Transcription aids such as Praat will be tested for the purpose. Finally, experiments will be done in parallelizing transcriptions with the original sound clips, using synchronization tools available at BATMULT.

Spring 2004

New countries allowed to BATMULT:
On May 1, 2004, ten new countries joined the European Union: Cyprus, the Czech Republic, Estonia, Hungary, Latvia, Lithuania, Malta, Poland, Slovakia and Slovenia. Candidates from these countries can now be selected for BATMULT fellowships!

New administrator at Batmult:
Gisle Andersen replaces Kristin Bech as Batmult's
administrator and manager.
From January 1, 2004, Dr. Gisle Andersen replaces Kristin Bech as coordinator of the portfolio of language technology activities at Aksis. Andersen's responsibilities include managing the administration and practical running of the Bergen Advanced Training Site in Multilingual Tools (BATMULT), among several other tasks. Hence, Gisle is now the person to contact for inquiries about BATMULT. Like his predecessor, Gisle Andersen has a doctoral degree in English corpus linguistics from the University of Bergen. He has also worked in the language technology industry for three years, as a product developer for Norwegian concatenative speech synthesis.

Fall 2003

New fellowship:
Céline Poudat (University of Orléans):
Contrastive analysis of scientific genres: Toward a
characterization of French and English linguistics research
papers.
(October - December, 2003).
REPORT
This PhD project is related to the KIAP project, which is aimed at a comparative study of academic texts in different domains and written in different languages. The research stay focuses on the morphosyntactic level because of the large development of morphosyntactic annotation and the availability of many taggers for French and English. The research aims at comparing and assessing several French and English taggers in order to efficiently obtain morphosyntactic tags that will characterize the texts in a way that is relevant for the comparative study. Different parameters are taken into account in this perspective: contrastive relevance of the variables in the two languages, XML/TEI compatible encoding, and merging of the output tags. This research stay benefits from BATMULT expertise and tools in the areas of corpus tagging and text encoding.

Cristiano Furiassi (University of Torino):
False anglicisms in Italian: Retrieval of examples in large corpora of written texts.
(August - October, 2003).
REPORT
The average Italian speaker does not seem to be aware of the fact
that many English-sounding or English-looking words are not at all
English; instead they are autonomous coinages which are usually
referred to as 'false anglicisms'. The aim of this research stay
at BATMULT is to identify authentic examples of false anglicisms
and subject these to a contrastive linguistic analysis. The
ultimate goals of this PhD student are to arrive at a detailed
typology and compile a dictionary of false anglicisms. The project
will benefit from hands-on training in corpus linguistic tools
such as those available in Bergen, and will also use the English
language resources available in Bergen. The method will mainly
consist of automated searching in monolingual Italian and English
corpora and of automated string comparisons in Italian-English
parallel corpora.
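
The automated search step described above can be pictured with a minimal sketch. It is not the method used in the project: the surface pattern `ENGLISH_LOOKING`, the function name `candidate_false_anglicisms`, and the tiny word lists are all assumptions made for illustration. The idea is only to flag tokens in Italian text that look English but are absent from an English word list, so that they can be passed on to manual contrastive analysis.

```python
import re

# Hypothetical surface cues for "English-looking" words in Italian text:
# letters that are rare in native Italian spelling, or typical English endings.
ENGLISH_LOOKING = re.compile(r"[kwyj]|(?:ing|man)$")


def candidate_false_anglicisms(italian_tokens, english_lexicon):
    """Return English-looking Italian tokens that are absent from an English word list."""
    candidates = set()
    for token in italian_tokens:
        word = token.lower()
        if ENGLISH_LOOKING.search(word) and word not in english_lexicon:
            candidates.add(word)
    return sorted(candidates)


# Toy data, invented for this example; a real run would read corpus files.
italian_tokens = "il recordman fa footing e poi prende il weekend libero".split()
english_lexicon = {"footing", "weekend", "record", "man"}

print(candidate_false_anglicisms(italian_tokens, english_lexicon))
```

A filter of this kind only catches forms that do not exist in English at all. Words that exist in English but carry a different meaning in Italian would pass unnoticed, which is where comparison against parallel corpora and manual analysis remain necessary.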
New administrative unit at the
University of Bergen:
Until now, the administration of BATMULT has been taken care of by
the Humanities Information Technologies centre (HIT) at the
University of Bergen. This centre has now become a part of the
Department of Culture, Language, and Information Technology
(AKSIS). The administration of BATMULT will be continued by
AKSIS.

Spring 2003

Natascia Leonardi (University of Macerata):
An electronic edition of John Wilkins' Conceptual and Alphabetic Dictionary.
(April - July, 2003)
REPORT
This PhD research project studies John Wilkins' Essay Towards
a Real Character and a Philosophical Language (London 1668).
This book elaborates a universal language intended as an
instrument for precise and unambiguous communication. Several
aspects of Wilkins' ambitious publication are interesting from a
scientific viewpoint: not only its use of linguistic and cognitive
terminology, but also its textual structure, consisting of two
interrelated parts: a hierarchical taxonomy is presented in the
form of Tables of the Universal Philosophy, while
the lexical units are listed in the Alphabetical
Dictionary. The connection between the different parts of
Wilkins' Essay reveals a complex network of definitions and
semantic relations, unparalleled in the epoch when this work was
written and exhibiting modern features well ahead of its time.
Natascia Leonardi's research is complemented by the development of
a digitized version of Wilkins' work that is intended as a
faithful reproduction of the original defining architectures. An
XML encoding is applied to the tables, while the alphabetical
dictionary makes use of lexicographic tools. The digitized
integration of the two defining sections of Wilkins' work intends
to fully reveal the potential of its articulated defining scheme
and facilitates the reader's access to the different parts of
Wilkins' Essay. Natascia Leonardi benefits from BATMULT's
extensive expertise in text coding of complex documents.

Continued fellowship:
Luis Serrano Fernández (see below) obtained an extension allowing him to continue his fellowship in Bergen until February 22, 2003.

December 2-3, 2002

Event:
With the financial support of the University of Bergen, BATMULT organizes an:
This seminar, with the participation of prominent researchers from Norway and Sweden and two BATMULT fellows from Spain, aims at strengthening the BATMULT site and the cooperation between research groups.

Fall 2002

Luis Serrano Fernández (University of León):
Translation, Film and Censorship: The Translation of Film Texts from English into Spanish 1975-1985.
(Aug. 2002 - Feb. 2003)
REPORT
Ignacio Pérez Álvarez (University of León):
Translation, Prose and Censorship: The Translation of Narrative Texts from English into Spanish 1936-1962.
(Aug. - Dec. 2002)
REPORT
From left: Luis Serrano Fernández,
Ignacio Pérez Álvarez (photograph by
Koenraad de Smedt)
Last updated March 22, 2007 by Koenraad de Smedt