SyntaxHighlightingWithegXML

Quick-and-Dirty Syntax Highlighting with <egXML>
Using <egXML> to document encoding practices

The problem:

You want to document your encoding practices in a TEI document with lots of example code, and you want to render it into HTML for display on the web, with all the code nicely indented and syntax-highlighted.

The solution:

Mark up your code with the TEI element <egXML>, and use the fragments of XSLT and CSS code below to render the embedded XML into attractively indented and coloured output.

What is <egXML> and how do I use it?

Many ordinary TEI users will never have come across the <egXML> element, but whenever you look at the TEI Guidelines, you're seeing the results of it. Every time a piece of TEI code is shown in the Guidelines, it's embedded in an <egXML> element.

<egXML> is special in that it's not in the normal TEI namespace; it's always in its own namespace, which is http://www.tei-c.org/ns/Examples. Inside <egXML>, you can place any well-formed fragment of TEI code you like, like this:

<egXML xmlns="http://www.tei-c.org/ns/Examples">
  <p>This is a paragraph.</p>
</egXML>

Now, you would think that the <p> tag inside the <egXML> is a TEI <p>, but it's not. It's in the http://www.tei-c.org/ns/Examples namespace. That means two things:


 * 1) When you validate your XML file against a TEI P5 schema, the example will not cause problems, because its content is outside the TEI namespace.
 * 2) When you write XSLT to process your documents, the example TEI code is completely distinct from your regular P5 markup, because it's in a different namespace.
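
To make the second point concrete, here is a minimal sketch of two templates living side by side in one stylesheet. The tei and teix prefixes and the output markup are assumptions chosen for illustration; only the two namespace URIs are fixed.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0"
    xmlns:teix="http://www.tei-c.org/ns/Examples"
    exclude-result-prefixes="tei teix">

  <!-- A regular paragraph in your document: TEI namespace. -->
  <xsl:template match="tei:p">
    <p><xsl:apply-templates/></p>
  </xsl:template>

  <!-- A paragraph shown as example code inside egXML: Examples namespace.
       The tei:p template above never matches it. -->
  <xsl:template match="teix:p">
    <code>&lt;p&gt;<xsl:apply-templates/>&lt;/p&gt;</code>
  </xsl:template>

</xsl:stylesheet>
```

Because the two match patterns name different namespaces, neither template can accidentally pick up the other's elements.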

Now shut up explaining things, and give me the code already.

Point taken, here it is:

First, there's some CSS. Put this in your site stylesheet, or in a separate stylesheet if you're afraid it might be infectious. It was written as CSS3 for an XHTML5 web page, but it should work fine with earlier versions of XHTML.

/* Handling of example XML code embedded in pages. */
pre.teiCode{
  white-space: pre-wrap;
}

/* We want our XML code to look like code. */
.xmlTag, .xmlAttName, .xmlAttVal, .teiCode{
  font-family: monospace;
}

/* We want our XML code text to be bold. */
.xmlTag, .xmlAttName, .xmlAttVal{
  font-weight: bold;
}

/* We want syntax highlighting. */
/* I think I stole these colour values from oXygen. Sorry George! */
.xmlTag{
  color: #000099;
}

.xmlAttName{
  color: #f5844c;
}

.xmlAttVal{
  color: #993300;
}
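
For orientation, this is roughly the kind of markup those rules are meant to style. It is a hand-written sketch, not captured output: the class names come from the CSS above, while the sample element, attribute and value are made up for illustration.

```xml
<pre class="teiCode"><span class="xmlTag">&lt;name</span><span class="xmlAttName"> type=</span><span class="xmlAttVal">"person"</span><span class="xmlTag">&gt;</span>Ada<span class="xmlTag">&lt;/name&gt;</span></pre>
```

Everything sits on one line because whitespace inside the pre element is shown as-is on the page.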

Next, there's some XSLT.

The first thing you need to do is to put two things inside the root element of your XSLT file:

xmlns:teix="http://www.tei-c.org/ns/Examples"

(That's the TEI examples namespace.)

exclude-result-prefixes="xs xd xhtml hcmc exist teix"

(You're going to have to customize that a bit for yourself. What that's saying is: don't output xmlns nodes for these namespaces. In this example from my project, I'm suppressing unwanted namespaces from a range of different domains. You'll probably want to do something similar, but your list of namespace prefixes will be different.)
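
Put together, the root element might look something like the sketch below. The prefixes other than teix, the choice of version="2.0" and the default XHTML namespace are assumptions; keep whatever your own stylesheet already declares.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:tei="http://www.tei-c.org/ns/1.0"
    xmlns:teix="http://www.tei-c.org/ns/Examples"
    xmlns="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="xs tei teix">

  <!-- Your templates go here. -->

</xsl:stylesheet>
```

Every prefix listed in exclude-result-prefixes has to be declared on the stylesheet, otherwise the processor reports an error, so trim the list to the namespaces you actually use.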

Now you need to add a couple of templates to the XSLT file(s) that process your TEI XML.

I always use XSLT 2.0 with Saxon 9+. Most of this would also work in XSLT 1.0, but the replace() function used in the text-node template is XPath 2.0 only, so a 1.0 processor would need a substitute for that step. Add the following to your stylesheet. Bear in mind that most of the output ends up inside an XHTML5 <pre> element, where whitespace matters, so please forgive the absence of human-readable whitespace. More competent XSLT coders will certainly find ways of making the code more readable without screwing up the whitespace. A lot will depend on the settings that are already in your stylesheet for xsl:preserve-space and xsl:strip-space.

Note: the key here is that all the TEI elements found inside an <egXML> element are in the Examples namespace.

   



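<xsl:template match="teix:egXML"><pre class="teiCode"><xsl:apply-templates/></pre></xsl:template>

<xsl:template match="teix:*">
<!-- One indent step per ancestor in the Examples namespace; adjust the width to taste. -->
<xsl:variable name="indent"><xsl:for-each select="ancestor::teix:*"><xsl:text>  </xsl:text></xsl:for-each></xsl:variable>

<!-- Opening tag and attributes. -->
<xsl:if test="not(ancestor::teix:p)"><xsl:value-of select="$indent"/></xsl:if>

<span class="xmlTag">&lt;<xsl:value-of select="local-name()"/></span><xsl:for-each select="@*"><span class="xmlAttName"><xsl:text> </xsl:text><xsl:value-of select="name()"/>=</span><span class="xmlAttVal">"<xsl:value-of select="."/>"</span></xsl:for-each><span class="xmlTag">&gt;</span>

<!-- Line break (unless inside a paragraph), then the content. -->
<xsl:if test="not(ancestor::teix:p)"><xsl:text>&#x0a;</xsl:text></xsl:if><xsl:apply-templates select="* | text() | comment()"/>

<!-- Closing tag; local-name() in both tags keeps them matching. -->
<xsl:if test="not(ancestor::teix:p)"><xsl:value-of select="$indent"/></xsl:if><span class="xmlTag">&lt;/<xsl:value-of select="local-name()"/>&gt;</span>

<xsl:if test="not(ancestor::teix:p)"><xsl:text>&#x0a;</xsl:text></xsl:if>
</xsl:template>

<!-- Text nodes: indent and break lines outside paragraphs; show ampersands in escaped form. -->
<xsl:template match="teix:*/text()">
<xsl:if test="not(ancestor::teix:p)"><xsl:for-each select="ancestor::teix:*"><xsl:text>  </xsl:text></xsl:for-each></xsl:if><xsl:value-of select="replace(., '&amp;', '&amp;amp;')"/><xsl:if test="not(ancestor::teix:p) or not(following-sibling::* or following-sibling::text())"><xsl:text>&#x0a;</xsl:text></xsl:if>
</xsl:template>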

Further customization
This is a relatively simple hack, and it's crude in its assumptions. For instance, I assume that content inside a TEI <p> tag is "inline" content, and other tag content is not. Obviously, in the context of a real project, you would want to make such assumptions more explicit and elaborate.

Also note that if you use the <gi>, <att> and <val> tags in your documentation, you can hook them up with the same class attributes in your CSS, so that XML fragments in your prose are also styled and coloured.
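
A minimal sketch of that hook-up follows. It assumes the inline tags in question are TEI's gi, att and val, and that they are in the regular TEI namespace; the tei prefix and the added angle brackets, @ sign and quotation marks are illustrative choices, not part of the recipe above.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0"
    exclude-result-prefixes="tei">

  <!-- Element names: same class as tags in the highlighted code. -->
  <xsl:template match="tei:gi">
    <span class="xmlTag">&lt;<xsl:apply-templates/>&gt;</span>
  </xsl:template>

  <!-- Attribute names. -->
  <xsl:template match="tei:att">
    <span class="xmlAttName">@<xsl:apply-templates/></span>
  </xsl:template>

  <!-- Attribute values. -->
  <xsl:template match="tei:val">
    <span class="xmlAttVal">"<xsl:apply-templates/>"</span>
  </xsl:template>

</xsl:stylesheet>
```

These templates match elements in the TEI namespace, not the Examples namespace, since they appear in your running prose and not inside an example.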

More sophisticated solutions
Sebastian "Stormageddon" Rahtz's excellent TEI stylesheets include a much more extensive and intelligent module for processing <egXML>. It doesn't include syntax highlighting, but it handles a wider range of conditions and contingencies. You can see his code here:



The TEI By Example project also has nice syntax-highlighted rendering of TEI code. Their XSLT is not currently public, but I'm sure they'd share it if asked: