Ad-hoc committee on encoding of bibliographic citations

= Charge =

At its meeting on 2010-02-07, the TEI Council discussed bug 2714682 and reached consensus that biblScope should not be allowed as a child of imprint since the prose definition of the imprint element does not include such content as might be included in one or more biblScope elements. Laurent Romary, Martin Holmes, and Kevin Hawkins were charged with writing a proposal to make it clear how to use biblScope for various types of bibliographic citations. The proposal should include corrected examples for the Guidelines, annotated with XML comments, showing how to encode various types of citations in biblStruct (and maybe also bibl and biblFull). It was suggested that the ad-hoc committee also look at how citations are handled in other encoding schemes.

= Background =

The TEI Guidelines offer three elements for bibliographic descriptions:


 * bibl -- no enforced content structure
 * biblStruct -- structured citation follows a TEI content model
 * biblFull -- structured citation with a content model similar to fileDesc
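
As a rough illustration of the difference, the same citation might be encoded with each of the three elements as sketched below. The citation itself (author, title, publisher, place, and date) is an invented placeholder, not a real publication, and the encodings are only one plausible reading of each content model.

```xml
<listBibl xmlns="http://www.tei-c.org/ns/1.0">

  <!-- bibl: components may appear in any order, mixed with plain text -->
  <bibl>Jane Doe, <title>An Example Monograph</title> (Exampleton: Example Press, 2001).</bibl>

  <!-- biblStruct: components are grouped and ordered by the content model -->
  <biblStruct>
    <monogr>
      <author>Jane Doe</author>
      <title level="m">An Example Monograph</title>
      <imprint>
        <pubPlace>Exampleton</pubPlace>
        <publisher>Example Press</publisher>
        <date when="2001">2001</date>
      </imprint>
    </monogr>
  </biblStruct>

  <!-- biblFull: grouped into statements, much like fileDesc in the header -->
  <biblFull>
    <titleStmt>
      <title>An Example Monograph</title>
      <author>Jane Doe</author>
    </titleStmt>
    <publicationStmt>
      <publisher>Example Press</publisher>
      <pubPlace>Exampleton</pubPlace>
      <date when="2001">2001</date>
    </publicationStmt>
  </biblFull>

</listBibl>
```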

These elements may be used for a number of different purposes in a TEI document. They may occur within the TEI header to describe the source of a TEI document (as part of the metadata) or within the text to describe citations appearing as content. As with other elements used in the body, any of these three elements could be used either:


 * to represent citations in a source document (such as a pre-existing print document)
 * to encode citations as part of a publishing process (such as a manuscript of a book to be published), whether destined for print publication, online publication, or both.

Some people distinguish these as two different purposes for markup (which Wendell Piez called "retrospective markup" versus "encoding done for the sake of fitting data to a particular application"). In discussing the elements for bibliographic purposes, one's opinion might be affected by whether one assumes the markup will be used for retrospective digitization or for a publishing workflow.

== Encoding citations as represented in a source document (such as a pre-existing print document) ==

When encoding a citation as it appears in a source document, most users prefer to leave the transcribed text in the order it is read on the page when tagging; therefore, any elements used for bibliographic descriptions need to be flexible in their internal structure. bibl will clearly work for this purpose, and biblFull will not (unless the source document contains only ISBD-compatible citations without certain errors). biblStruct has a complicated content model whose allowed order of elements seems to map well onto common citation formats (at least those used in European languages). However, when including page numbers, it's not clear when to include a biblScope as a child or sibling of imprint.
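
One possible reading of the Council's consensus is sketched below for a journal article: the volume and page range are given in biblScope elements that follow imprint as siblings within monogr, rather than sitting inside imprint. The citation data is an invented placeholder. The unit, from, and to attributes follow later releases of the Guidelines; other releases have used a type attribute on biblScope instead, so the attribute names here should be treated as an assumption.

```xml
<biblStruct xmlns="http://www.tei-c.org/ns/1.0">
  <analytic>
    <author>Jane Doe</author>
    <title level="a">An Example Article</title>
  </analytic>
  <monogr>
    <title level="j">Journal of Examples</title>
    <imprint>
      <!-- imprint holds publication details only: place, publisher, date -->
      <date when="2001">2001</date>
    </imprint>
    <!-- biblScope is a sibling of imprint, not a child -->
    <biblScope unit="volume">12</biblScope>
    <biblScope unit="page" from="34" to="56">34-56</biblScope>
  </monogr>
</biblStruct>
```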

Examples in the Guidelines sometimes include delimiting punctuation in the elements, which is sensible when encoding citations as represented in a source document. Users often wish to represent the original document as faithfully as possible, including inconsistent punctuation, so they don't want to insert any punctuation automatically. Leaving in punctuation is also simpler when converting from print or electronic source files.
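
A minimal sketch of this retrospective approach, using bibl and an invented placeholder citation: the text is tagged in reading order, and the punctuation found in the source, including an inconsistent semicolon before the date, is left exactly where it stands. The unit attribute on biblScope is an assumption about the release of the Guidelines in use.

```xml
<!-- Punctuation between and inside the elements is transcribed, not generated -->
<bibl xmlns="http://www.tei-c.org/ns/1.0"><author>Doe, Jane</author>. <title level="m">An Example Monograph</title>, <pubPlace>Exampleton</pubPlace>: <publisher>Example Press</publisher>; <date>2001</date>, <biblScope unit="page">pp. 34-56</biblScope>.</bibl>
```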

== Encoding citations as part of a publishing process (such as a manuscript of a book to be published) ==

When encoding a citation as part of a publishing process, the usual desire is to enforce uniformity of citation style through use of markup. If a particular citation format (MLA, APA, GOST 7.1, etc.) is to be used, it is likely that the possible combinations of citation components in that format do not match perfectly with the content model of any of the three elements for bibliographic descriptions. biblFull likely requires too much detail for most purposes, bibl would allow for more flexibility of encoding structure than is required for the citation format being used, and biblStruct is already too constrained for certain use cases (such as a work without an author, editor, or other statement of responsibility, or a monograph that is part of a larger monograph series). Constraining the content model of bibl would break TEI Conformance (?), so a user might instead maintain the TEI content model for the element but use a Schematron schema or other outside tool to check that the citation is encoded as desired.
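
A Schematron check of this kind might look roughly like the sketch below, which leaves the TEI content model of bibl untouched and layers project rules on top of it. The rules themselves (every bibl in a listBibl needs a title and a date, and a journal title needs a page range) are hypothetical house rules invented for illustration, and the unit attribute on biblScope is an assumption about the release of the Guidelines in use.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <sch:ns prefix="tei" uri="http://www.tei-c.org/ns/1.0"/>

  <sch:pattern>
    <!-- Hypothetical house rules for citations in a bibliography list -->
    <sch:rule context="tei:listBibl/tei:bibl">
      <sch:assert test="tei:title">
        Each citation must contain a title.
      </sch:assert>
      <sch:assert test="tei:date">
        Each citation must contain a date.
      </sch:assert>
      <!-- Report journal citations that lack a page range -->
      <sch:report test="tei:title[@level='j'] and not(tei:biblScope[@unit='page'])">
        A citation with a journal title should give a page range.
      </sch:report>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

Because the constraints live outside the TEI schema, the document can still be validated against an unmodified TEI schema while the project applies its own stricter checks.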

It is advised not to include delimiting punctuation in citations encoded as part of a publishing process but rather to insert it through a stylesheet, thereby not only guaranteeing consistent use of punctuation but also making it easier to adapt to a different citation style in the future.
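
As a sketch of how a stylesheet could supply the punctuation, the XSLT 1.0 fragment below renders a punctuation-free biblStruct for a monograph as plain text. The output pattern ("Author. Title. Place: Publisher, Date.") is a hypothetical house style, not any published citation format, and the stylesheet assumes each citation has exactly one author, title, place, publisher, and date.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0">

  <xsl:output method="text"/>

  <!-- Hypothetical house style: Author. Title. Place: Publisher, Date. -->
  <xsl:template match="tei:biblStruct">
    <xsl:value-of select="normalize-space(tei:monogr/tei:author)"/>
    <xsl:text>. </xsl:text>
    <xsl:value-of select="normalize-space(tei:monogr/tei:title)"/>
    <xsl:text>. </xsl:text>
    <xsl:value-of select="normalize-space(tei:monogr/tei:imprint/tei:pubPlace)"/>
    <xsl:text>: </xsl:text>
    <xsl:value-of select="normalize-space(tei:monogr/tei:imprint/tei:publisher)"/>
    <xsl:text>, </xsl:text>
    <xsl:value-of select="normalize-space(tei:monogr/tei:imprint/tei:date)"/>
    <xsl:text>.&#10;</xsl:text>
  </xsl:template>

  <!-- Suppress all other text so only formatted citations are output -->
  <xsl:template match="text()"/>

</xsl:stylesheet>
```

Switching to a different citation style would then mean editing the literal text in the template rather than re-editing every encoded citation.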