Extract-svn-id.xslt

From TEIWiki

If you use Subversion, you may have set the svn:keywords property to include “Id” for some of your files. (This is done with svn propset svn:keywords "Id" file …, then committing the file(s).) This allows you to insert the keyword “$Id$” into your file, and Subversion will replace it with a string that includes some useful information (like the file name, revision number, timestamp, and committer). (The Subversion client performs the replacement in your working copy, so the expanded string shows up after a commit, update, or checkout.)

This XSLT 1.0 stylesheet will read in an XML file and look through the <editionStmt> and comments for such a string of useful information, and return only the string (or nothing, if one wasn’t found).
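As an illustration of the kind of input this is aimed at, here is a minimal sample TEI file carrying an expanded Id keyword in both places the stylesheet looks. The file name, revision number, timestamp, and user name are invented for the example; a real file would carry whatever Subversion substituted.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- $Id: sample.xml 123 2010-01-31 12:00:00Z jdoe $ -->
<!-- All values in the Id strings here are made up for illustration -->
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Sample document</title>
      </titleStmt>
      <editionStmt>
        <edition>Working draft, $Id: sample.xml 123 2010-01-31 12:00:00Z jdoe $</edition>
      </editionStmt>
      <publicationStmt>
        <p>Unpublished sample</p>
      </publicationStmt>
      <sourceDesc>
        <p>Born digital</p>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <p>Some text.</p>
    </body>
  </text>
</TEI>
```

For this sample the stylesheet should find the string in the <editionStmt> first and return just the Id string, starting at “$Id: ” and ending at the closing “$”, without the surrounding “Working draft,” text.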

Note that this stylesheet:

  1. looks like a big, complicated stylesheet. But in fact it is quite small and not that complex. It looks large because it is verbosely documented. It has only ~25 xsl: elements, but ~41 documentation elements (xhtml and oXygen’s xd:).
  2. is not even close to foolproof. It just looks for the strings that Subversion uses to delimit the substituted Id keyword. It will mess up if those strings occur in comments other than those in the desired format (e.g., as you might well find in a book about Subversion, or you will find in this stylesheet itself, since a portion of code is commented out to demonstrate how to do something differently).
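  3. is trying to be semi-intelligent about where it looks. But in truth, it would probably be fine to just look at normalize-space(/) for the strings of interest (although that would miss an Id that occurs only in a comment, because the string value of the root node does not include comment text).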
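The simpler approach mentioned in item 3 could look roughly like the following. This is only a sketch, not part of the stylesheet below: the variable names all and rest are made up for the example, and it searches element text only.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Sketch only: search the text of the whole document rather than
       just the editionStmt. Comment text is not part of the string
       value of the root node, so comments are not searched here. -->
  <xsl:template match="/">
    <xsl:variable name="all" select="normalize-space(/)"/>
    <xsl:if test="contains($all,'$Id: ')">
      <!-- Everything after the opening delimiter -->
      <xsl:variable name="rest" select="substring-after($all,'$Id: ')"/>
      <xsl:if test="contains($rest,' $')">
        <!-- Trim at the closing delimiter, then put both delimiters back -->
        <xsl:value-of select="concat('$Id: ',substring-before($rest,' $'),' $&#10;')"/>
      </xsl:if>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

Like the full stylesheet, this outputs nothing when no Id string is found, and it is just as easily fooled by stray occurrences of the delimiter strings.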
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:tei="http://www.tei-c.org/ns/1.0"
  xmlns:xd="http://www.oxygenxml.com/ns/doc/xsl"
  xmlns="http://www.w3.org/1999/xhtml">

  <xd:doc>
    <xd:desc>Output is text; we use the XHTML namespace as the default namespace not for the output,
      but so that I can easily use the XHTML <tt>&lt;tt></tt> element in the documentation.
      :-)</xd:desc>
  </xd:doc>
  <xsl:output method="text"/>

  <xd:doc scope="stylesheet">
    <xd:desc>
      <xd:p><xd:b>extract-svn-id.xslt</xd:b> — a routine to read in an XML file and write out the
        Subversion <xd:b>Id</xd:b> substituted keyword string, if present</xd:p>
      <xd:p>The <xd:b>Id</xd:b> is extracted from the <tt>&lt;editionStmt></tt> of the TEI header if
        one is present, or the first comment that has one.</xd:p>
      <xd:p>If no Subversion <xd:b>Id</xd:b> substituted keyword strings are found, nothing is
        returned.</xd:p>
      <xd:p><xd:b>written</xd:b> 2010-01-31 by Syd Bauman</xd:p>
      <xd:p>Copyright 2010 Syd Bauman and the Brown University Women Writers Project, some rights
        reserved</xd:p>
      <xd:p>Available for download, copy, distribution, modification, distribution of modified
        versions, and use in other people’s products under the GNU General Public License, version
        3. (If that’s too restrictive for you, write.)</xd:p>
    </xd:desc>
  </xd:doc>

  <xd:doc>
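    <xd:desc>Generate a key of comments that contain the string <xd:i>$Id:</xd:i> followed by a
      blank. The index into the key is the number of comments that precede the comment among its
      siblings, so <tt>key('SVN-Ids','0')</tt> returns every such comment that is the first comment
      child of its parent (in Perlese, roughly <tt>$SVN-Ids{'0'}</tt>, except that '-' is not a
      valid character in a Perl variable name). A matching comment that follows another comment
      under the same parent is filed under a different index and is not found by that lookup.</xd:desc>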
  </xd:doc>
  <xsl:key name="SVN-Ids" match="//comment()[contains(normalize-space(.),'$Id: ')]"
    use="count( preceding-sibling::comment() )"/>

  <xd:doc>
    <xd:desc>On matching the root, get the Subversion Id string and hand it to <xd:ref
        name="get_SVN-Id_itself" type="template"/> to be parsed, using the output thereof as the
      output of the template (which, in turn, is the output of the entire stylesheet).</xd:desc>
  </xd:doc>
  <xsl:template match="/">
    <xsl:call-template name="get_SVN-Id_itself">
      <xsl:with-param name="node-string">
        <xsl:call-template name="get_SVN-Id_string"/>
      </xsl:with-param>
    </xsl:call-template>
  </xsl:template>

  <xd:doc>
    <xd:desc>
      <xd:p>look through the input file, and find a suitable substituted Subversion Id keyword
        string</xd:p>
      <xd:p>First, try in the <tt>&lt;editionStmt></tt> with <tt>&lt;teiCorpus></tt> as root; then
        in the <tt>&lt;editionStmt></tt> with <tt>&lt;TEI></tt> as root. Last, look in
        comments.</xd:p>
    </xd:desc>
    <xd:return>The entire text node containing the substituted Subversion Id keyword string, or (if
      none was found) the keyword <xd:i>IDUNNO</xd:i>, as an xs:string</xd:return>
  </xd:doc>
  <xsl:template name="get_SVN-Id_string">
    <xsl:choose>
      <xsl:when
        test="/tei:teiCorpus/tei:TEI/tei:teiHeader/tei:fileDesc/tei:editionStmt[contains(normalize-space(.),'$Id: ')]">
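        <!-- A corpus may hold several TEI children: take the editionStmt that
             actually contains the Id, not merely the first one in the corpus -->
        <xsl:value-of
          select="normalize-space(/tei:teiCorpus/tei:TEI/tei:teiHeader/tei:fileDesc/tei:editionStmt[contains(normalize-space(.),'$Id: ')])"
        />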
      </xsl:when>
      <xsl:when
        test="/tei:TEI/tei:teiHeader/tei:fileDesc/tei:editionStmt[contains(normalize-space(.),'$Id: ')]">
        <xsl:value-of
          select="normalize-space(/tei:TEI/tei:teiHeader/tei:fileDesc/tei:editionStmt)"/>
      </xsl:when>
      <!--
        This is how you would look in comments *without* using a key: 
      <xsl:when test="//comment()[contains(normalize-space(.),'$Id: ')]">
        <xsl:value-of select="//comment()[contains(normalize-space(.),'$Id: ')][1]"/>
      </xsl:when>
        I have no idea which way is better. -->
      <xsl:when test="key('SVN-Ids','0')">
        <xsl:value-of select="key('SVN-Ids','0')"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:text>IDUNNO</xsl:text>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <xd:doc>
    <xd:desc>Take the string that contains the Subversion substituted Id keyword, and parse the
      Subversion substituted Id keyword out of it. This is done by trimming off everything before
        <code>$Id: </code> and after <code> $</code>, and then tacking those strings back on the
      front and end of the remaining string.</xd:desc>
    <xd:param><xd:i>node-string</xd:i> is the entire textual content of either the comment or
      element that was found by <xd:ref name="get_SVN-Id_string" type="template"/> as an
      xs:string</xd:param>
    <xd:return>The Subversion substituted Id keyword extracted from <xd:i>node-string</xd:i> (which
      may include other things besides the Subversion substituted Id keyword)</xd:return>
  </xd:doc>
  <xsl:template name="get_SVN-Id_itself">
    <xsl:param name="node-string"/>
    <xsl:choose>
      <xsl:when test="$node-string='IDUNNO'"/>
      <xsl:otherwise>
        <xsl:variable name="node-stringTrimStart" select="substring-after($node-string,'$Id: ')"/>
        <xsl:variable name="node-stringTrimEnd"
          select="substring-before($node-stringTrimStart,' $')"/>
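        <!-- Use a character reference for the trailing newline: a literal line break
             inside an attribute value is normalized to a space by the XML parser -->
        <xsl:value-of select="concat('$Id: ',$node-stringTrimEnd,' $&#10;')"/>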
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
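
The SVN-Ids key above files each matching comment under the number of comments that precede it among its siblings, so looking up '0' only finds a matching comment that is the first comment child of its parent. If what you want is the first matching comment anywhere in the document, one possible variant is sketched below. It is an assumption about what is wanted, not part of the stylesheet above: the key name SVN-Id-comments and the constant key value 'Id' are invented for the example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical key: every comment containing the opening delimiter
       is filed under the same constant value -->
  <xsl:key name="SVN-Id-comments" match="comment()[contains(normalize-space(.),'$Id: ')]"
    use="'Id'"/>

  <xsl:template match="/">
    <xsl:if test="key('SVN-Id-comments','Id')">
      <!-- [1] keeps the first matching comment in document order -->
      <xsl:value-of select="normalize-space(key('SVN-Id-comments','Id')[1])"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

This writes out the whole text of that comment; to reduce it to the Id string alone, it would still need the same trimming that get_SVN-Id_itself performs.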