Re: [Corpora-List] offer of research resource

From: Martin Wynne (martin.wynne@oucs.ox.ac.uk)
Date: Wed Jun 28 2006 - 11:04:27 MET DST

    Dear Geoffrey and everyone,

    I've had several messages offline asking why the OTA doesn't offer to
    take this resource, so before anyone else asks, I should point out that
    the Oxford Text Archive and the Arts and Humanities Data Service only
    archive electronic resources, and so, unfortunately, would not be able
    to offer a home for this valuable data in its current state. As I
    understand it, what is needed is a traditional archive for paper
    documents and magnetic media, or a project to digitise the data. (But
    please correct me if I'm wrong, Geoffrey.)

    If anyone out there is in a position to consider undertaking a project
    to digitise it, then I understand that Professor Sampson already has a
    detailed workplan. To make life even easier, the AHDS would be very
    happy to offer a free service to archive, catalogue, preserve and
    distribute the electronic data, on a non-exclusive basis. We could also
    give advice on digitisation, if needed.

    Best wishes,
    Martin

    -- 
    Martin Wynne
    Head of the Oxford Text Archive and
    AHDS Literature, Languages and Linguistics
    

    Oxford University Computing Services
    13 Banbury Road
    Oxford
    UK - OX2 6NN
    Tel: +44 1865 283299
    Fax: +44 1865 273275
    martin.wynne@oucs.ox.ac.uk

    Geoffrey Sampson wrote:
    > Dear Colleagues,
    >
    > I am looking for someone who would be interested in taking over
    > responsibility for a valuable research resource I have been in charge of
    > in recent years.
    >
    > During the 1960s, a team of linguists sponsored by the Nuffield
    > Foundation assembled a collection of the spontaneous spoken and written
    > English of children and young people aged between 8+ and 15+ attending a
    > variety of schools of diverse types in different urban and rural English
    > regions: the "Child Language Survey". (This was initially intended as
    > part of a multinational effort directed at improving foreign-language
    > teaching in Europe, but I understand that parallel efforts in other
    > countries fell through; the material has essentially been gathering dust
    > more or less ever since it was compiled.) The leading member of the
    > team was Richard Handscombe, now long since retired from a Canadian
    > university and in indifferent health. After I used a small portion of
    > the Survey for my LUCY treebank (www.grsampson.net/RLucy.html), Richard
    > generously suggested that I should take charge of the entire Survey
    > material, and arranged for it to be transported to my workplace in
    > Sussex, where it now is.
    >
    > Since then, I have made repeated attempts to get funding to computerize
    > this material, clearly a necessary first step to unlocking the research
    > potential it contains. Although referees' reports on my various grant
    > applications have been outstandingly positive, unfortunately no
    > application has finally succeeded. I now find myself too close to
    > retirement for a further application to be worth making; even if I
    > secured funding now, I would not have time to see the work through to
    > completion. Hence I would be interested in hearing from anyone younger
    > who might succeed where I have failed.
    >
    > In my view the collection has unparalleled potential scientific value.
    > In the first place, it creates a possibility (which otherwise scarcely
    > exists) of comparing spontaneous English usage across several decades of
    > time -- children of the 1960s with children now, and/or the usage of a
    > generation in childhood with the usage of the same generation now it is
    > middle-aged. One can envisage many significant applications to the
    > study of language-skills education, for instance. One anonymous grant
    > referee in 2005 commented:
    >
    > "there is a yawning gap where there should be a research literature
    > on grammatical development at school age (contrasting with a rich supply
    > of research on both pre-school children and adults). What is needed
    > more than anything else is precisely what this project offers:
    > age-related data on speech and writing from the same children ..."
    >
    > The written portion of the material represents children's spontaneous
    > writing abilities in a way which in my experience is hard to match even
    > for present-day children. Collections of child writing often turn out to
    > be heavily influenced by the adult prose they have consulted, but the
    > Child Language Survey compilers found clever ways to get at what the
    > children could do under their own steam. And the quality of the
    > collection is extremely high. The spoken material has been transcribed
    > with an accuracy that compares very favourably with the speech
    > transcriptions in the British National Corpus (and I have the original
    > tape-recordings as well as the transcriptions). The written material
    > has been converted from the children's handwriting into typescript with
    > astonishing care, so that for instance every crossed-out letter is
    > identified. As a very rough estimate, the whole might comprise about
    > 800,000 words of speech and about 200,000 words of writing.
    >
    > It will be a minor scientific tragedy, to my mind, if this material is
    > lost to scholarship. Yet, if I cannot find a suitable home for it
    > fairly soon, that fate looks unavoidable.
    >
    > Accordingly, I should be very happy to hear from anyone who feels able
    > to rescue the Child Language Survey from oblivion. After handing it
    > over, I would be willing, indeed eager, to retain an involvement, to the
    > extent of advising on what I know about it, etc., but decisions would be
    > for the new owner to make: I have no wish to be a back-seat driver. I
    > would be quite willing to transfer the collection out of Britain -- I
    > have the impression that scholarly values may be in a better state in
    > some Continental European countries, for instance, than they are in
    > British universities nowadays. (And I would be glad to supply
    > documentation on my grant applications, referee reports, etc., if they
    > would help someone else construct a case for support.)
    >
    > Anyone who would like to be considered is invited to contact me,
    > commenting briefly on how he or she would hope to publish and/or exploit
    > the material, and we can take it from there.
    >
    > Geoffrey Sampson
    >
    >
    > ............................................................
    > Prof. Geoffrey Sampson MA PhD MBCS CITP ILTM
    >
    > author of "The 'Language Instinct' Debate"
    >
    > Department of Informatics, University of Sussex
    > Falmer, Brighton BN1 9QH, England
    >
    > www.grsampson.net +44 1273 678525
    > ............................................................



    This archive was generated by hypermail 2b29 : Wed Jun 28 2006 - 11:05:49 MET DST