Serving "application/tei+xml" from Cocoon

From TEIWiki

Most websites which publish TEI XML documents describe them with the content type "application/xml" or "text/xml". These are two labels for XML files in general, and don't identify the files as specifically TEI XML. By using the content type "application/tei+xml", a webserver can make the specific claim that the XML file is TEI rather than any other kind of XML. This has the advantage that browsers can be configured to handle TEI files distinctly from the way they handle other XML files (for instance, by opening them in an XML editor).

This page describes how to configure Apache Cocoon to serve TEI content as "application/tei+xml".

Simple configuration

To specify the content type, simply configure a TEI XML serializer in your sitemap.xmap file, and then invoke the serializer in your TEI pipelines.

To declare the serializer:

<serializer mime-type="application/tei+xml" name="tei" src="org.apache.cocoon.serialization.XMLSerializer"/>

To invoke the serializer:

<serialize type="tei"/> 

This configuration will always send the TEI using the "application/tei+xml" content type. However, this may not be ideal for browsers which are not configured to handle it.
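
As a rough sketch of where these two fragments might sit, a minimal sitemap.xmap could look like the following. The match pattern and source path are hypothetical placeholders. The sketch also assumes that a default generator is declared in this sitemap or inherited from a parent sitemap, and that the sitemap uses the Cocoon sitemap namespace as its default namespace.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sitemap xmlns="http://apache.org/cocoon/sitemap/1.0">
	<components>
		<serializers>
			<!-- the TEI serializer declared above -->
			<serializer mime-type="application/tei+xml" name="tei" src="org.apache.cocoon.serialization.XMLSerializer"/>
		</serializers>
	</components>
	<pipelines>
		<pipeline>
			<!-- hypothetical URL pattern and source path, for illustration only -->
			<match pattern="transcript/*.xml">
				<generate src="tei/{1}.xml"/>
				<!-- always serialize as application/tei+xml -->
				<serialize type="tei"/>
			</match>
		</pipeline>
	</pipelines>
</sitemap>
```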

Preferred configuration

Most browsers will adequately display generic XML but will not be configured to handle the specific content type "application/tei+xml". For this reason, it's usually better to serve the TEI as "application/tei+xml" only if the browser specifically asks for it, and otherwise to serve it as the generic "application/xml".

In Cocoon, you can use a RegexpHeaderSelector to determine if a browser is prepared to handle the specific content type. The following Cocoon sitemap file shows:

  • A plain XML serializer
  • A TEI XML serializer
  • A selector for determining if a browser will accept "application/tei+xml"
  • A pipeline fragment ("resource") which uses the selector to determine if a browser will accept "application/tei+xml", and uses the appropriate XML serializer
  • A pipeline which invokes the above resource instead of just invoking a serializer directly
<?xml version="1.0" encoding="UTF-8"?>
<sitemap xmlns="http://apache.org/cocoon/sitemap/1.0">
	<components>
		<!-- declare two serializers: one for tei-xml, and one for generic xml -->
		<serializers>
			<serializer mime-type="application/xml" name="xml" src="org.apache.cocoon.serialization.XMLSerializer"/>
			<serializer mime-type="application/tei+xml" name="tei" src="org.apache.cocoon.serialization.XMLSerializer"/>
		</serializers>
		<!-- declare a selector for determining if a browser will accept tei-xml -->
		<selectors>
			<selector name="accept-content-type" src="org.apache.cocoon.selection.RegexpHeaderSelector">
				<pattern xmlns="" name="tei">application/tei\+xml</pattern>
				<header-name xmlns="">accept</header-name>
			</selector>
		</selectors>
	</components>
	<resources>
		<!-- declare a pipeline fragment for serializing TEI with the most appropriate content type; -->
		<!-- tei-xml if the browser will accept it, or plain generic xml otherwise -->
		<resource name="serialize-tei">
			<select type="accept-content-type">
				<when test="tei">
					<serialize type="tei"/>
				</when>
				<otherwise>
					<serialize type="xml"/>
				</otherwise>
			</select>
		</resource>
	</resources>
	<pipelines>
		<pipeline>
			<match pattern="transcript/*.xml">
				<generate src="tei/{1}.xml"/>
				<!-- serialize the TEI using the appropriate serializer -->
				<call resource="serialize-tei"/>
			</match>
		</pipeline>
	</pipelines>
</sitemap>


Configuring your browser to request "application/tei+xml"

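With this configuration, the server sends "application/tei+xml" only to browsers whose "Accept" request header lists that content type. You will therefore need to configure your browser to declare that it's prepared to accept "application/tei+xml".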

To configure Firefox, enter "about:config" into the address bar, find the key called "network.http.accept.default" and change it to include "application/tei+xml" in the list of acceptable content types, e.g.:
