PlaceName2IndexCSV.xsl

From TEIWiki

Summary

This is an XSLT script that extracts "placeName" elements from any TEI XML document and builds a table index of the placeName values, each followed by selected associated data: title, publisher ...

It is easy to modify in order to index another element (persName, orgName, category ...) or to output other associated data.
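
One possible way to switch the indexed element without editing the XPath each time is a stylesheet parameter. The following is only a sketch under that assumption: the parameter name indexElement is hypothetical and not part of the original script, and only the main title is output to keep the example short.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:output encoding="UTF-8" method="text"/>

<!-- Hypothetical parameter (not in the original script): name of the element to index -->
<xsl:param name="indexElement" select="'placeName'"/>

<xsl:template match="/TEI">
    <!-- XSLT 1.0 cannot use a variable as a name test, so local-name() is compared instead -->
    <xsl:for-each select="//*[local-name() = $indexElement]">
        <xsl:value-of select="normalize-space(.)"/>
        <!-- &#9; is a tab character -->
        <xsl:text>&#9;</xsl:text>
        <xsl:value-of select="normalize-space(//teiHeader/fileDesc/titleStmt/title[@type='main'])"/>
        <!-- &#10; is a newline character -->
        <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```

With xsltproc, for instance, the value can be set on the command line with --stringparam indexElement persName. When the parameter is not set, the default 'placeName' applies.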

Add any comments to the 'discussion' tab.

Required Input

The stylesheet expects a TEI XML file whose root element is TEI. An alternative that handles another root element with two, three, or more TEI elements as children is also given.

The stylesheet uses XSLT 1.0 and XPath 1.0 only, so any XSLT processor (xsltproc, Saxon ...) in any environment (XML editor, UNIX/Linux libraries) can run it.

Expected Output

A CSV table (tab-separated) containing:

  • a line for each "placeName", with its value in the first cell
  • associated data found in the TEI source in the following cells: publisher, date, main title, DOI
  • a tab ("\t") to separate cells (columns) and a newline ("\n") to end each line


Example ("\t" stands for a tab character and "\n" for a newline):


America\tOxford University Press\t2005\tThe Quality of Care Under a Managed-Care Program (...)\t10.1093/geront/45.4.496\n
Minneapolis\tOxford University Press\t2005\tThe Quality of Care Under a Managed-Care Program (...)\t10.1093/geront/45.4.496\n
MN\tOxford University Press\t2005\tThe Quality of Care Under a Managed-Care Program (...)\t10.1093/geront/45.4.496\n
(...)

Line model: placeName value\tpublisher value\tdate value\ttitle[@type='main'] value\tidno[@type='DOI'] value\n

Stylesheet


<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:output encoding="UTF-8" method="text"/>

<!-- Matches a TEI root element that is in no namespace -->
<xsl:template match="/TEI">

    <!-- One output line per placeName -->
    <xsl:for-each select="//placeName">
        <!-- normalize-space() keeps line breaks inside a name from splitting the row -->
        <xsl:value-of select="normalize-space(.)"/>
        <!-- XSLT has no "\t" or "\n" escapes: &#9; is a tab, &#10; is a newline -->
        <xsl:text>&#9;</xsl:text>
        <xsl:value-of select="normalize-space(//teiHeader/fileDesc/publicationStmt/publisher)"/>
        <xsl:text>&#9;</xsl:text>
        <xsl:value-of select="normalize-space(//teiHeader/fileDesc/publicationStmt/date)"/>
        <xsl:text>&#9;</xsl:text>
        <xsl:value-of select="normalize-space(//teiHeader/fileDesc/titleStmt/title[@type='main'])"/>
        <xsl:text>&#9;</xsl:text>
        <xsl:value-of select="normalize-space(//teiHeader/fileDesc/sourceDesc/biblStruct/idno[@type='DOI'])"/>
        <xsl:text>&#10;</xsl:text>
    </xsl:for-each>

</xsl:template>

</xsl:stylesheet>


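An alternative, for a document with another root element and possibly several "TEI" trees below it. The XPath expressions must then be more precise, so that each placeName is paired with the header of its own TEI element.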


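<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:output encoding="UTF-8" method="text"/>

<!-- Drop text nodes outside the TEI elements, which the built-in rules would otherwise output -->
<xsl:template match="text()"/>

<!-- Matches every TEI element, whatever the root element is -->
<xsl:template match="TEI">
    <xsl:for-each select=".//placeName">

        <xsl:value-of select="normalize-space(.)"/>
        <!-- &#9; is a tab, &#10; is a newline -->
        <xsl:text>&#9;</xsl:text>
        <xsl:value-of select="normalize-space(ancestor::TEI/teiHeader/fileDesc/publicationStmt/publisher)"/>
        <xsl:text>&#9;</xsl:text>
        <!-- and so on for the other xsl:value-of selects -->
        <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
</xsl:template>

</xsl:stylesheet>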
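
A note on namespaces: the templates on this page use unprefixed element names, which in XSLT 1.0 match only elements that are in no namespace. TEI P5 documents normally declare the namespace http://www.tei-c.org/ns/1.0, and for such files the templates would match nothing. The following is a sketch of a namespace-aware variant, not a tested replacement. The prefix tei and the variable name fileDesc are arbitrary choices made for this example; the header paths are the same as above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:tei="http://www.tei-c.org/ns/1.0"
                version="1.0">

<xsl:output encoding="UTF-8" method="text"/>

<!-- Drop text nodes outside tei:TEI, which the built-in rules would otherwise output -->
<xsl:template match="text()"/>

<xsl:template match="tei:TEI">
    <!-- fileDesc of the current TEI element, selected once (variable name chosen for this sketch) -->
    <xsl:variable name="fileDesc" select="tei:teiHeader/tei:fileDesc"/>
    <xsl:for-each select=".//tei:placeName">
        <xsl:value-of select="normalize-space(.)"/>
        <xsl:text>&#9;</xsl:text>
        <xsl:value-of select="normalize-space($fileDesc/tei:publicationStmt/tei:publisher)"/>
        <xsl:text>&#9;</xsl:text>
        <xsl:value-of select="normalize-space($fileDesc/tei:publicationStmt/tei:date)"/>
        <xsl:text>&#9;</xsl:text>
        <xsl:value-of select="normalize-space($fileDesc/tei:titleStmt/tei:title[@type='main'])"/>
        <xsl:text>&#9;</xsl:text>
        <xsl:value-of select="normalize-space($fileDesc/tei:sourceDesc/tei:biblStruct/tei:idno[@type='DOI'])"/>
        <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```

Because the header is read relative to each matched tei:TEI element, this variant works both for a single TEI root and for several TEI elements under another root.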