TEI to SVG
A Quick Look Through the TEI Figures Module
Module: figures (Tables, Formulae, and Graphics, P5 chapter 22)
Elements Defined: table row cell formula figure figDesc graphic binaryObject
The "figures" module really covers three related - but very different - areas.
- Tables
Elements: table row cell
I don't immediately see anything corresponding to this in SVG. It wouldn't make sense, anyway - tables are really for organizing textual material. Anything else?
- Formulae
Elements: formula
Formula already allows for the inclusion of elements from outside the TEI, although the examples in the guidelines do not use namespaces, and namespaces are not mentioned in the text. Should we recommend that <formula> require namespaces for elements pulled in from elsewhere? If so, what about (as in the first two examples) when the notation is non-XML? Is @notation really enough?
"By default, a <formula> is assumed to contain character data which is not validated in any way"
Is this really a good idea? This means that, if you have a formula that contains non-standard characters, you would be unable to use gaiji without customization. That seems strange.
- How will this section generally be used? Will the most common usages be MathML or CML (Chemical Markup Language), or something similar from the sciences, or perhaps simpler formulas expressed in available Unicode characters?
- Graphic Images
Elements: figure graphic binaryObject figDesc
The <figure> element is used to contain images, captions, and textual descriptions of the pictures.
The images themselves are specified using the <graphic> element, whose url attribute provides the location of an image.
<figure>
  <graphic url="Fig1.pdf"/>
</figure>
<figure> may also contain <head> (providing a title or heading for the image), <figDesc> (a description of the image), and <p> (commentary or caption, not a description of the image).
"Where the graphic itself contains large amounts of text, perhaps with a complex structure, and perhaps difficult to distinguish from the graphic, the encoder should choose whether to regard the graphic as containing the text (in which case, a nested <text> element may be included within the <figure> element) or to regard the enclosed text as being a separate division of the <text> element in which the graphic appears. In this latter case, an appropriate divn class element may be used for the text represented within the graphic, and the <figure> element embedded within it. The choice will depend to a large degree on the encoder's understanding of the relationship between the graphic and the surrounding text."
So <figure> may also contain <text>.
TEI Elements and their SVG Equivalents (approx.)
Quick Links:
TEI Chapter 22 Tables, Formulae, and Graphics
Scalable Vector Graphics (SVG) 1.1 Specification
tei:graphic and svg:image
tei:graphic
tei:graphic "indicates the location of an inline graphic, illustration, or figure."
attributes: (In addition to global attributes)
- width: The display width of the image
  Status: Mandatory when applicable
  Datatype: data.outputMeasurement
- height: The display height of the image
  Status: Mandatory when applicable
  Datatype: data.outputMeasurement
- scale: A scale factor to be applied to the image to make it the desired display size
  Status: Mandatory when applicable
  Datatype: data.probability
- url: The target URL
  Status: Mandatory when applicable
  Datatype: data.pointer
  Values: The name of a URL which provides the image.
- mimeType: The MIME type
  Status: Mandatory when applicable
  Datatype: data.word
  Values: The MIME type to be used for the object when it is decoded
<figure>
  <graphic url="fig1.png"/>
  <head>Figure One: The View from the Bridge</head>
  <figDesc>A Whistleresque view showing four or five sailing boats in the foreground, and a series of buoys strung out between them.</figDesc>
</figure>
svg:image
svg:image "indicates that the contents of a complete file are to be rendered into a given rectangle within the current user coordinate system. The 'image' element can refer to raster image files such as PNG or JPEG or to files with MIME type of "image/svg+xml""
attributes:
- x = "<coordinate>": The x-axis coordinate of one corner of the rectangular region into which the referenced document is placed. If the attribute is not specified, the effect is as if a value of "0" were specified. Animatable: yes.
- y = "<coordinate>": The y-axis coordinate of one corner of the rectangular region into which the referenced document is placed. If the attribute is not specified, the effect is as if a value of "0" were specified. Animatable: yes.
- width = "<length>": The width of the rectangular region into which the referenced document is placed. A negative value is an error (see Error processing). A value of zero disables rendering of the element. Animatable: yes.
- height = "<length>": The height of the rectangular region into which the referenced document is placed. A negative value is an error (see Error processing). A value of zero disables rendering of the element. Animatable: yes.
- xlink:href = "<uri>": A URI reference. Animatable: yes.
- Are x and y necessary?
- svg:image uses xlink:href where tei:graphic uses url. This is nice.
<figure>
  <svg:image xlink:href="fig1.png"/>
  <head>Figure One: The View from the Bridge</head>
  <figDesc>A Whistleresque view showing four or five sailing boats in the foreground, and a series of buoys strung out between them.</figDesc>
</figure>
Thoughts
If we were to recommend a module to import all of SVG, it would be preferable to use only svg:image and to drop tei:graphic entirely (if svg:image does indeed do everything we would need it to do in TEI). But if we don't want to "modulate" SVG (if we just say that we will refer to external SVG files if we need them), do we still want to maintain a separate tei:graphic element?
The nested grouping of TEI image elements
tei:figure/tei:graphic|tei:head|tei:figDesc
- figure "contains a block containing graphics, illustrations, or figures."
- graphic "indicates the location of an inline graphic, illustration, or figure."
- head "contains any type of heading, for example the title of a section, or the heading of a list, glossary, manuscript description, etc."
- figDesc "(Description of Figure) contains a brief prose description of the appearance or content of a graphic figure, for use when documenting an image without displaying it."
May map to the SVG:
svg:g/svg:image|svg:title|svg:desc
- "The 'g' element is a container element for grouping together related graphics elements."
- "The 'image' element indicates that the contents of a complete file are to be rendered into a given rectangle within the current user coordinate system. The 'image' element can refer to raster image files such as PNG or JPEG or to files with MIME type of "image/svg+xml""
- "Each container element or graphics element in an SVG drawing can supply a 'desc' and/or a 'title' description string where the description is text-only."
- "The 'title' child element to an ... element serves the purposes of identifying the content of the given SVG document fragment."
Examples
<figure xml:id="id1">
  <graphic width="4" height="2" url="fig1.png" mimeType="image/png"/>
  <head type="image title">Figure One: The View from the Bridge</head>
  <figDesc>A Whistleresque view showing four or five sailing boats in the foreground, and a series of buoys strung out between them.</figDesc>
</figure>
<svg:g id="id1">
  <svg:image width="4" height="2" xlink:href="fig1.png"/>
  <svg:title>Figure One: The View from the Bridge</svg:title>
  <svg:desc>A Whistleresque view showing four or five sailing boats in the foreground, and a series of buoys strung out between them.</svg:desc>
</svg:g>
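The mapping in this example could be applied mechanically. The following is only a minimal Python sketch of that idea, not a proposed implementation: the function name figure_to_svg_group is hypothetical, the sample figDesc is shortened, and it assumes the TEI elements are in the namespace http://www.tei-c.org/ns/1.0 (the examples on this page do not declare one). It drops @mimeType and @scale, since no SVG counterpart has been settled on here.

```python
import xml.etree.ElementTree as ET

# Namespace URIs. The TEI one is an assumption: the examples on this page
# do not declare any namespace for the TEI elements.
TEI_NS = "http://www.tei-c.org/ns/1.0"
SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"
XML_NS = "http://www.w3.org/XML/1998/namespace"

ET.register_namespace("svg", SVG_NS)
ET.register_namespace("xlink", XLINK_NS)


def figure_to_svg_group(figure):
    """Map a tei:figure element to an svg:g element (hypothetical helper)."""
    group = ET.Element(f"{{{SVG_NS}}}g")
    xml_id = figure.get(f"{{{XML_NS}}}id")
    if xml_id is not None:
        group.set("id", xml_id)  # svg:g carries @id rather than @xml:id

    graphic = figure.find(f"{{{TEI_NS}}}graphic")
    if graphic is not None:
        image = ET.SubElement(group, f"{{{SVG_NS}}}image")
        for name in ("width", "height"):
            if graphic.get(name) is not None:
                image.set(name, graphic.get(name))
        if graphic.get("url") is not None:
            image.set(f"{{{XLINK_NS}}}href", graphic.get("url"))
        # @mimeType and @scale are dropped: no agreed SVG counterpart yet.

    # tei:head -> svg:title, tei:figDesc -> svg:desc (text content only)
    for tei_name, svg_name in (("head", "title"), ("figDesc", "desc")):
        source = figure.find(f"{{{TEI_NS}}}{tei_name}")
        if source is not None:
            target = ET.SubElement(group, f"{{{SVG_NS}}}{svg_name}")
            target.text = "".join(source.itertext()).strip()
    return group


TEI_FIGURE = """<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="id1">
  <graphic width="4" height="2" url="fig1.png" mimeType="image/png"/>
  <head type="image title">Figure One: The View from the Bridge</head>
  <figDesc>A view of sailing boats and buoys.</figDesc>
</figure>"""

svg_group = figure_to_svg_group(ET.fromstring(TEI_FIGURE))
print(ET.tostring(svg_group, encoding="unicode"))
```

Note that head/@type is lost in this sketch, which matches the observation that svg:title has no type attribute.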
Notes
- svg:g has @id, not @xml:id.
- tei:graphic/@scale and svg:image/@preserveAspectRatio (http://www.w3.org/TR/SVG/coords.html#preserveAspectRatio) are, I believe, similar, but they are Greek to me. I need help here.
- svg:image lacks @mimeType, but is it necessary?
- svg:title does not seem to have a type attribute, but since this element would be used only in reference to image titles, I don't think that this is a problem.
- As noted in a previous section, <figure> may also contain <p> or <text>. There are no equivalent elements in SVG. Would it make sense to include those elements in an SVG group, in the TEI namespace?
<svg:g id="id1">
  <svg:image width="4" height="2" xlink:href="fig1.png"/>
  <svg:title>Figure One: The View from the Bridge</svg:title>
  <svg:desc>A Whistleresque view showing four or five sailing boats in the foreground, and a series of buoys strung out between them.</svg:desc>
  <tei:p>paragraph here</tei:p>
  <tei:text>text contained on the image</tei:text>
</svg:g>
Or should we encourage use of a standard TEI block element such as a <div> or <figure> to bracket together an svg element and any tei elements that need to be tied to it:
<figure>
  <svg:g id="id1">
    <svg:image width="4" height="2" xlink:href="fig1.png"/>
    <svg:title>Figure One: The View from the Bridge</svg:title>
    <svg:desc>A Whistleresque view showing four or five sailing boats in the foreground, and a series of buoys strung out between them.</svg:desc>
  </svg:g>
  <p>paragraph here</p>
  <text>text contained on the image</text>
</figure>
Mapping attributes
Hold coordinates in individual elements using @mets:coords (instead of creating @tei:coords)
As described in the METS documentation: COORDS: an optional string attribute listing a set of visual coordinates within an image (still image or video frame). The COORDS attribute should be used as in HTML 4.
And in HTML 4.01: This attribute specifies the position and shape on the screen. The number and order of values depends on the shape being defined. Possible combinations:
- rect: left-x, top-y, right-x, bottom-y.
- circle: center-x, center-y, radius. Note. When the radius value is a percentage value, user agents should calculate the final radius value based on the associated object's width and height. The radius should be the smaller value of the two.
- poly: x1, y1, x2, y2, ..., xN, yN. The first x and y coordinate pair and the last should be the same to close the polygon. When these coordinate values are not the same, user agents should infer an additional coordinate pair to close the polygon.
Coordinates are relative to the top, left corner of the object. All values are lengths. All values are separated by commas.
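To make the HTML 4.01 rules above concrete, here is a minimal Python sketch of a parser for such a coords string. The function name parse_coords and the returned dictionary layout are hypothetical, chosen only for illustration. It assumes plain numeric values, so percentage lengths (including the percentage radius case) are not handled.

```python
def parse_coords(shape, coords):
    """Parse an HTML 4.01-style coords string for 'rect', 'circle' or 'poly'.

    Assumption: every value is a plain number; percentages are not supported.
    """
    values = [float(v) for v in coords.split(",")]

    if shape == "rect":
        if len(values) != 4:
            raise ValueError("rect needs left-x, top-y, right-x, bottom-y")
        left, top, right, bottom = values
        return {"shape": "rect", "left": left, "top": top,
                "right": right, "bottom": bottom}

    if shape == "circle":
        if len(values) != 3:
            raise ValueError("circle needs center-x, center-y, radius")
        cx, cy, r = values
        return {"shape": "circle", "cx": cx, "cy": cy, "r": r}

    if shape == "poly":
        if len(values) < 6 or len(values) % 2 != 0:
            raise ValueError("poly needs at least three x,y pairs")
        points = list(zip(values[0::2], values[1::2]))
        # Infer the closing pair when first and last points differ.
        if points[0] != points[-1]:
            points.append(points[0])
        return {"shape": "poly", "points": points}

    raise ValueError(f"unknown shape: {shape!r}")


print(parse_coords("rect", "10,20,110,70"))
print(parse_coords("poly", "0,0,10,0,10,10"))
```

The shape has to be supplied separately because the coords string alone does not say how its values are to be read.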
METS defines @mets:coords on <area>. For TEI, it would be nice to have this attribute available to "regular" elements, not to a special element. The reasoning is that in many cases, especially when dealing with primary source texts, TEI elements refer not to a text in general but to the text as it appears in a specific physical document. It may not make sense to allow @mets:coords on <p>, but it may make perfect sense to allow it on those elements described in Chapter 18 Transcription of Primary Sources that relate to a specific physical occurrence:
<abbr> <sic> <add> <del> <hi> <restore> <gap> <damage> <unclear> <space> <fw>
SVG defines various attribute values for coordinates. The system is based on a shape (identified by the element), but the attributes vary, so one could use the same attributes in the same element in various combinations to achieve the same result:
rectangle:
- @svg:x = length
- @svg:y = length
- @svg:width = length
- @svg:height = length
- (rounded edges: @svg:rx = length, @svg:ry = length)
circle:
- @svg:cx = "<coordinate>": The x-axis coordinate of the center of the circle.
- @svg:cy = "<coordinate>": The y-axis coordinate of the center of the circle.
- @svg:r = "<length>": The radius of the circle.
ellipse:
- @svg:cx = "<coordinate>": The x-axis coordinate of the center of the ellipse.
- @svg:cy = "<coordinate>": The y-axis coordinate of the center of the ellipse.
- @svg:rx = "<length>": The x-axis radius of the ellipse.
- @svg:ry = "<length>": The y-axis radius of the ellipse.
polygon:
- @svg:points = "<list-of-points>": The points that make up the polygon. All coordinate values are in the user coordinate system.
@svg:points seems to me to be very similar to @mets:coords, except that it cannot be used to form a circle (it describes only boundaries with straight edges).
There may be instances where one would want to use @mets:coords (for simple circles and bounding boxes) and other times when it would make more sense to use svg:x/y/width/height etc. (for more complex shapes).
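The two notations can be converted into each other for the simple cases. The sketch below, in Python, turns a coords string in the HTML 4 style into the corresponding SVG shape element. It is an illustration only: the function name coords_to_svg is hypothetical, the shape keyword is passed in separately, and plain numeric values in the same coordinate system are assumed.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
ET.register_namespace("svg", SVG_NS)


def coords_to_svg(shape, coords):
    """Build an SVG shape element from an HTML 4-style coords string.

    Hypothetical helper; assumes plain numbers, no percentages.
    """
    values = [float(v) for v in coords.split(",")]

    if shape == "rect":
        left, top, right, bottom = values
        # coords gives two corners; svg:rect wants one corner plus a size.
        tag = "rect"
        attrs = {"x": left, "y": top,
                 "width": right - left, "height": bottom - top}
    elif shape == "circle":
        cx, cy, r = values
        tag = "circle"
        attrs = {"cx": cx, "cy": cy, "r": r}
    elif shape == "poly":
        if len(values) % 2 != 0:
            raise ValueError("poly needs complete x,y pairs")
        pairs = zip(values[0::2], values[1::2])
        tag = "polygon"
        attrs = {"points": " ".join(f"{x:g},{y:g}" for x, y in pairs)}
    else:
        raise ValueError(f"unknown shape: {shape!r}")

    # Format numbers compactly (10.0 -> "10"); strings pass through as-is.
    text_attrs = {k: (v if isinstance(v, str) else f"{v:g}")
                  for k, v in attrs.items()}
    return ET.Element(f"{{{SVG_NS}}}{tag}", text_attrs)


for shape, coords in (("rect", "10,20,110,70"),
                      ("circle", "50,50,25"),
                      ("poly", "0,0,10,0,10,10")):
    print(ET.tostring(coords_to_svg(shape, coords), encoding="unicode"))
```

The rect case shows the main difference between the two notations: coords lists two opposite corners, while svg:rect takes one corner plus a width and height.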
We need to continue looking at SVG, whether we want to be able to import it in a module or link to external files. Or both.
tei:binaryObject
A rough SVG equivalent of tei:binaryObject is an svg:image whose xlink:href uses the "data" URL scheme.
<svg width="4in" height="3in" version="1.1" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"> <desc>This graphic links to a picture of Larry Masinter</desc> <image x="200" y="200" width="48px" height="48px" xlink:href="data:image/gif;base64,R0lGODdhMAAwAPAAAAAAAP///ywAAAAAMAAw AAAC8IyPqcvt3wCcDkiLc7C0qwyGHhSWpjQu5yqmCYsapyuvUUlvONmOZtfzgFz ByTB10QgxOR0TqBQejhRNzOfkVJ+5YiUqrXF5Y5lKh/DeuNcP5yLWGsEbtLiOSp a/TPg7JpJHxyendzWTBfX0cxOnKPjgBzi4diinWGdkF8kjdfnycQZXZeYGejmJl ZeGl9i2icVqaNVailT6F5iJ90m6mvuTS4OK05M0vDk0Q4XUtwvKOzrcd3iq9uis F81M1OIcR7lEewwcLp7tuNNkM3uNna3F2JQFo97Vriy/Xl4/f1cf5VWzXyym7PH hhx4dbgYKAAA7"> <title>Larry Masinter</title> </image> </svg>
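As a rough illustration of that equivalence, the Python sketch below converts a tei:binaryObject into an svg:image with a "data" URL. The function name binary_object_to_svg_image is hypothetical, the TEI namespace URI is an assumption (the examples on this page declare none), and the content is assumed to be base64, which is all the sketch checks. The sample payload is just the first few characters of the GIF above, not a complete image.

```python
import base64
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("svg", SVG_NS)
ET.register_namespace("xlink", XLINK_NS)


def binary_object_to_svg_image(binary_object, width, height):
    """Turn a tei:binaryObject into an svg:image using a data: URL.

    Hypothetical helper; assumes the element content is base64 text.
    """
    mime_type = binary_object.get("mimeType", "application/octet-stream")
    # Remove the line breaks and spaces used to wrap the encoded data.
    payload = "".join((binary_object.text or "").split())
    # Raises binascii.Error if the payload is not valid base64.
    base64.b64decode(payload, validate=True)

    image = ET.Element(f"{{{SVG_NS}}}image",
                       {"width": width, "height": height})
    image.set(f"{{{XLINK_NS}}}href", f"data:{mime_type};base64,{payload}")
    return image


# Truncated sample data: only the start of a GIF, for illustration.
SAMPLE = """<binaryObject xmlns="http://www.tei-c.org/ns/1.0" mimeType="image/gif">
  R0lGODdh
  MAAw
</binaryObject>"""

svg_image = binary_object_to_svg_image(ET.fromstring(SAMPLE), "48px", "48px")
print(ET.tostring(svg_image, encoding="unicode"))
```

Here @mimeType does have a home on the SVG side, inside the data URL itself.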
Using SVG with TEI
In some instances, it makes sense to enable broad use of SVG throughout the TEI document. Although most of the examples here relate to embedding bitmap data, the real power of SVG is in describing vector information: it is well suited to capturing dividing lines on the page, shapes, blocks of text set off from the page, decorative flourishes, simple diagrams, graphs, logos and so on. This does not mean that, for example, illuminated manuscript pages could be described in SVG in place of hi-res page images. It would, however, be useful to specify the layout of a complex MS page (where there might be annotations on commentaries on commentaries) in terms of SVG shape structures, with the text element blocks embedded inside them, enabling a usefully approximate rendering of the page layout that incorporates the transcription.