Text Directionality Workgroup

This page summarizes the evolving work of the Text Directionality Workgroup, tasked by the TEI Council with developing a new section of the Guidelines containing recommendations for encoding a variety of textual features related to text directionality and orientation. The related SourceForge ticket is http://purl.org/tei/fr/3475007.

MDH gave a presentation on this topic (text_directionality.pdf) to the TEI Council during the April 2013 Council meeting in Providence.

Workgroup Members

 * Martin Holmes (TEI Council)
 * Deborah W. Anderson (Unicode Consortium)
 * Robert Whalen (Northern Michigan University)
 * Marcus Bingenheimer (Temple University)
 * Stella Dee (King's College, London)

Order of Tasks

 * Enumerate textual features to be covered
 * Collate existing standards and recommendations and relate them to features
 * Identify any gaps which might require new TEI elements or attributes
 * Outline the new section
 * Write the first draft for consideration by Council
 * Identify other places in the Guidelines where information or links need to be included

Mailing List
The group has a mailing list provided through Brown University at http://listserv.brown.edu/archives/cgi-bin/wa?A0=TEI-DIR-WG.

Notes from initial discussions

 * We agree (so far) that we would like to distinguish between two types of phenomena: "true" text directionality (as found in a language such as Japanese when written vertically ttb with line sequence rtl), and "rotational" features, in which text written in any direction is rotated or written along a path. Our proposal will have to cover both of these phenomena, and provide for cases in which they interact, but they will probably be handled by different mechanisms.


 * We agree that the ITS specification is rather a red herring. Its primary concern is translation rather than text representation, and its provisions for directionality are sparse.


 * We agree that the CSS Writing Modes draft provides the best descriptive introduction to directional phenomena. The general consensus is that we should base our analysis of the phenomena on CSS Writing Modes, and probably base our recommendation on its properties and values.

If we do base our recommendation on the properties and values specified in CSS Writing Modes, then we have three possible approaches in our recommendation with regard to text directionality:


 * 1) We could recommend the creation of multiple new attributes, one for each property in CSS Writing Modes. This would enable us to add more attributes if there are features we want to describe that are not actually handled by CSS Writing Modes.
 * 2) We could recommend the creation of a single attribute, e.g. @cssWritingMode, whose value could be any valid combination of the CSS Writing Modes properties and values (in other words, its content would be a CSS ruleset constrained only to the properties relevant to writing modes).
 * 3) We could recommend that people use the existing global @style attribute, which is already available for CSS code (although it is not tied to CSS). This would enable users to combine CSS Writing Mode features with other CSS code which applies to the same element.

No. 1 seems too complicated; we'd end up with lots of new attributes, whose values would inevitably need to be combined anyway during any rendering process.

No. 2 is attractive in the sense that it keeps text directionality features separate from other CSS-specified features, and would allow the user to combine writing mode information with a use of @style which didn't happen to use CSS.

No. 3 seems the most attractive in that it's very simple, and involves no change to the existing TEI infrastructure at all; we just need to explain and illustrate how to use the properties, and point users at the W3C specification. MB points out that we are thereby assuming that @style is using CSS3 (since Writing Modes is not available in early versions of CSS); it would therefore be helpful if the styleDefDecl element were able to specify not only @scheme="css" but also the version (perhaps @schemeVersion="3") for clarity's sake.
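As a rough illustration of how options 2 and 3 relate, the following Python sketch reads a CSS declaration list from a TEI @style attribute and keeps only the declarations that concern writing modes. The sample paragraph, the function names, and the choice of which properties count as "writing mode" properties are all assumptions made for this example, not part of any proposal; the declaration parser is deliberately naive and would not cope with semicolons inside quoted strings.

```python
import xml.etree.ElementTree as ET

# Properties defined in CSS Writing Modes. Which properties a TEI
# recommendation would actually permit is an assumption of this sketch.
WRITING_MODE_PROPERTIES = {
    "direction",
    "unicode-bidi",
    "writing-mode",
    "text-orientation",
}

# Hypothetical TEI fragment using the existing global @style attribute
# (option 3), mixing writing-mode declarations with unrelated CSS.
SAMPLE = (
    '<p xmlns="http://www.tei-c.org/ns/1.0" '
    'style="writing-mode: vertical-rl; text-orientation: upright; color: red">'
    "縦書き"
    "</p>"
)


def parse_declarations(style):
    """Naively split a CSS declaration list into a {property: value} dict."""
    declarations = {}
    for chunk in style.split(";"):
        if ":" not in chunk:
            continue  # skip empty or malformed chunks
        name, value = chunk.split(":", 1)
        declarations[name.strip().lower()] = value.strip()
    return declarations


def writing_mode_declarations(element):
    """Return only the writing-mode declarations found in @style."""
    all_declarations = parse_declarations(element.get("style", ""))
    return {
        name: value
        for name, value in all_declarations.items()
        if name in WRITING_MODE_PROPERTIES
    }


element = ET.fromstring(SAMPLE)
print(writing_mode_declarations(element))
```

The filtered result is roughly what a dedicated attribute under option 2 would contain, which suggests that a processor can recover the same information from @style under option 3 without any new TEI attributes.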

Examples, constructed and from primary sources
This section collects together some examples which our discussion can reference. We aim to collect useful examples of some straightforward cases, but also of some edge cases which our proposal must be able to handle. Some of these may be used as examples in a final draft of the new section of the Guidelines. These are listed in no particular order.

Text directionality

 * Wikipedia has some good examples of Boustrophedon (alternate lines running in different directions, with glyphs also flipped horizontally for rtl lines).
 * Ancient Berber is an example of a script written bottom-to-top, with lines right-to-left.
 * This Berber inscription also incorporates rotation, so we could demonstrate the combination.
 * Rongo Rongo (Easter Island, reverse boustrophedon)
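To make the boustrophedon case concrete, here is a minimal Python sketch that takes lines transcribed as they appear on the surface (each read left to right) and reverses alternate lines to recover reading order. The function name and the `first_line_rtl` parameter are hypothetical, and the sketch ignores glyph mirroring, reverse boustrophedon (where alternate lines are also rotated), and combining character sequences, which a simple string reversal would break.

```python
def boustrophedon_to_reading_order(visual_lines, first_line_rtl=False):
    """Reverse every line that runs right-to-left on the surface.

    visual_lines: lines transcribed left to right as they appear.
    first_line_rtl: True if the first line runs right-to-left.
    """
    reading_lines = []
    for index, line in enumerate(visual_lines):
        # Odd-indexed lines run opposite to the first line.
        runs_rtl = (index % 2 == 1) != first_line_rtl
        reading_lines.append(line[::-1] if runs_rtl else line)
    return reading_lines


# Constructed example: the second line runs right-to-left.
print(boustrophedon_to_reading_order(["ABCD", "HGFE", "IJKL"]))
```

Whether a transcription should store visual order or reading order is exactly the kind of question the new section would need to address; this sketch only shows that the two are mechanically convertible in the simplest case.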

Rotation

 * Rotation along X axis: [[File:Rotation_on_x_axis.png]] e.g. "tei-c.org" rotated 180 deg: "ʇǝı-ɔ˙oɹƃ"
 * Rotation along Y axis: [[File:Rotation_on_y_axis.png]]
 * Rotation along Z axis: [[File:Rotation_on_z_axis.png]]


 * Easter Wings is a good real-world example of rotation along the z axis.
 * Arabic text written along a circular path (a roundel).
 * The Phaistos Disc (spiral writing)
 * Ogham (clockwise from bottom left)

Useful documents

 * UTR 50, Unicode Properties for Horizontal and Vertical Text Layout
 * Forum for UTR 50
 * UAX 9, Unicode Bidirectional Algorithm
 * Proposed update to UAX 9 for Unicode 6.3
 * Unicode BIDI forum
 * UTR 20, Unicode in XML
 * What you need to know about the bidi algorithm and inline markup (W3C)
 * Unicode controls vs markup for bidi support (W3C)
 * CSS Writing Modes
 * CSS vs Markup in XHTML

Other notes
Deborah points to four new bidi isolate characters to be added to Unicode (probably 6.3), and quotes this description:

HTML/CSS recently introduced “bidirectional isolates” to improve handling of bidirectional text in HTML. However, this new technology does not provide a means to solve the bidi issues in non‐HTML documents or when copying and pasting HTML into plain text. This proposal requests four format characters that can be used to support formatting of bidirectional text in non‐HTML documents and plain text, in a way which can be interoperable with the mechanisms used by HTML/CSS for markup.

These are in addition to the five existing bidi embedding and override code points (LRE, RLE, LRO, RLO, and PDF). They are described in the proposed update to UAX #9.
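For reference, the nine control characters mentioned here can be inspected with Python's standard unicodedata module. This is a small sketch that assumes a Python build whose Unicode database already includes the isolate characters (assigned at U+2066 to U+2069); the `isolate` helper is a hypothetical name, shown only to illustrate how a run of text might be wrapped in plain text.

```python
import unicodedata

# The five embedding and override controls: LRE, RLE, PDF, LRO, RLO.
EMBEDDING_AND_OVERRIDE_CONTROLS = ["\u202A", "\u202B", "\u202C", "\u202D", "\u202E"]

# The four isolate controls: LRI, RLI, FSI, PDI.
ISOLATE_CONTROLS = ["\u2066", "\u2067", "\u2068", "\u2069"]


def describe(char):
    """Return the code point, character name, and bidi class of a character."""
    return "U+%04X %s (%s)" % (
        ord(char),
        unicodedata.name(char),
        unicodedata.bidirectional(char),
    )


def isolate(text):
    """Wrap text in FIRST STRONG ISOLATE ... POP DIRECTIONAL ISOLATE."""
    return "\u2068" + text + "\u2069"


for char in EMBEDDING_AND_OVERRIDE_CONTROLS + ISOLATE_CONTROLS:
    print(describe(char))
```

In TEI documents, markup would normally be preferred over such control characters (see the W3C documents listed above), so the helper is relevant mainly to plain-text output.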