PlaceName2IndexCSV.xsl

From TEIWiki

Revision as of 10:48, 27 June 2016

Summary

This is an XSLT script that extracts "placeName" elements from any TEI XML document and builds a table index of placeName values with selected associated data: title, publisher ...

Add any comments to the 'discussion' tab.

Required Input

This stylesheet takes a TEI XML file whose root element is TEI. An alternative that handles a different root element with two, three, or more TEI elements as children is also proposed.

The stylesheet uses XPath 1.0 and XSLT 1.0, so common processors (xsltproc, Saxon ...) in most environments (XML editors, UNIX/Linux libraries) can run it.
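
To make the expected structure concrete, here is a minimal sketch of an input file. It is a hypothetical document, not a real source and not validated against a TEI schema: the element paths are the ones the stylesheet reads, the header values are taken from the example rows under Expected Output, and the sentence in the body is invented for illustration. The elements are in no namespace, which is an assumption that matches the unprefixed names used in the stylesheet.

<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical input for illustration only; elements are in no namespace -->
<TEI>
    <teiHeader>
        <fileDesc>
            <titleStmt>
                <!-- read as title[@type='main'] -->
                <title type="main">The Quality of Care Under a Managed-Care Program (...)</title>
            </titleStmt>
            <publicationStmt>
                <publisher>Oxford University Press</publisher>
                <date>2005</date>
            </publicationStmt>
            <sourceDesc>
                <biblStruct>
                    <!-- read as idno[@type='DOI'] -->
                    <idno type="DOI">10.1093/geront/45.4.496</idno>
                </biblStruct>
            </sourceDesc>
        </fileDesc>
    </teiHeader>
    <text>
        <body>
            <!-- invented sentence; each placeName becomes one output line -->
            <p>A study conducted in <placeName>Minneapolis</placeName>, <placeName>MN</placeName>.</p>
        </body>
    </text>
</TEI>

With an input shaped like this, the output would have one line per placeName, each followed by the same header values.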

Expected Output

A CSV (tab-separated) table containing:

  • a line for each "placeName", with its value in the first cell
  • associated data found in the TEI source in the following cells: publisher, date, main title, DOI id
  • a tab character (written "\t" below) to separate cells (columns) and a newline character (written "\n" below) to end each line


Example :


America\tOxford University Press\t2005\tThe Quality of Care Under a Managed-Care Program (...)\t10.1093/geront/45.4.496\n
Minneapolis\tOxford University Press\t2005\tThe Quality of Care Under a Managed-Care Program (...)\t10.1093/geront/45.4.496\n
MN\tOxford University Press\t2005\tThe Quality of Care Under a Managed-Care Program (...)\t10.1093/geront/45.4.496\n

(...)
Line model: placeName value\tpublisher value\tdate value\ttitle[@type='main'] value\tidno[@type='DOI'] value\n

Stylesheet


<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:output encoding="UTF-8" method="text"/>

<!-- Matches a TEI root element (unprefixed name, so no namespace) -->
<xsl:template match="/TEI">

    <!-- One output line per placeName found anywhere in the document -->
    <xsl:for-each select="//placeName">
        <xsl:value-of select="normalize-space(.)"/>
        <!-- &#9; is a tab character; XSLT would copy a literal "\t" as-is -->
        <xsl:text>&#9;</xsl:text>
        <xsl:value-of select="normalize-space(//teiHeader/fileDesc/publicationStmt/publisher)"/>
        <xsl:text>&#9;</xsl:text>
        <xsl:value-of select="normalize-space(//teiHeader/fileDesc/publicationStmt/date)"/>
        <xsl:text>&#9;</xsl:text>
        <xsl:value-of select="normalize-space(//teiHeader/fileDesc/titleStmt/title[@type='main'])"/>
        <xsl:text>&#9;</xsl:text>
        <xsl:value-of select="normalize-space(//teiHeader/fileDesc/sourceDesc/biblStruct/idno[@type='DOI'])"/>
        <!-- &#10; is a newline character; XSLT would copy a literal "\n" as-is -->
        <xsl:text>&#10;</xsl:text>
    </xsl:for-each>

</xsl:template>

</xsl:stylesheet>
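
TEI P5 documents normally place their elements in the http://www.tei-c.org/ns/1.0 namespace, and in XPath 1.0 an unprefixed name such as TEI only matches elements in no namespace. For such sources, a namespace-aware variant may be needed. The following is a minimal sketch under that assumption; the prefix "tei" is an arbitrary choice made here, and only the first two cells are written out.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:tei="http://www.tei-c.org/ns/1.0"
                version="1.0">

<xsl:output encoding="UTF-8" method="text"/>

<!-- Assumption: the source elements are in the TEI P5 namespace -->
<xsl:template match="/tei:TEI">
    <xsl:for-each select="//tei:placeName">
        <xsl:value-of select="normalize-space(.)"/>
        <!-- &#9; is a tab character -->
        <xsl:text>&#9;</xsl:text>
        <xsl:value-of select="normalize-space(//tei:teiHeader/tei:fileDesc/tei:publicationStmt/tei:publisher)"/>
        <xsl:text>&#9;</xsl:text>
        <!-- date, main title and DOI follow the same pattern, with each step prefixed by tei: -->
        <!-- &#10; is a newline character -->
        <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
</xsl:template>

</xsl:stylesheet>

Every step of every path needs the prefix; a path that mixes prefixed and unprefixed names selects nothing and produces an empty cell.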


Alternative XPath expressions for source variants: a root element other than TEI (that is, a parent of the TEI element) with one or more "TEI" children

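<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:output encoding="UTF-8" method="text"/>

<!-- Matches every TEI element, whatever the root element is -->
<xsl:template match="TEI">
    <xsl:for-each select=".//placeName">

        <xsl:value-of select="normalize-space(.)"/>
        <!-- &#9; is a tab character -->
        <xsl:text>&#9;</xsl:text>
        <!-- ancestor::TEI[1] is the nearest enclosing TEI element, so each line uses its own header -->
        <xsl:value-of select="normalize-space(ancestor::TEI[1]/teiHeader/fileDesc/publicationStmt/publisher)"/>
        <xsl:text>&#9;</xsl:text>
        <!-- and so on for the other "value-of select" expressions -->
        <!-- &#10; is a newline character -->
        <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
</xsl:template>

<!-- Suppress text outside TEI elements, which the built-in rules would otherwise copy to the output -->
<xsl:template match="text()"/>

</xsl:stylesheet>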