The books mentioned below:

Gerhard Leitner, Corpus design - problems and suggested solutions. In
Leitner (ed.), <italic>New directions in English Language
Corpora</italic>, Berlin & New York 1992

Douglas Biber, Representativeness in corpus design. <italic>Literary and
Linguistic Computing</italic> 8(4): 243-257, 1993

Graeme Kennedy, <italic>An Introduction to Corpus Linguistics</italic>,
London & New York: Longman 1998, pp 70-5.



I'm not quite sure what exactly you're looking for, but I can give you three links. The first is about
corpus linguistics in general

The other two are more specific: the Text Encoding Initiative (TEI)

and the Corpus Encoding Standard (CES)

