Minutes of the Tools SIG meeting in Rome, 3 October 2013

Marjorie Burghart chaired the meeting. She asked for a minute-taker. Kevin Hawkins volunteered.

Marjorie asked everyone to introduce themselves.

Introductions
Emmanuelle Morlock: Doesn't create tools but "parameterizes" them (using different parameters to produce different results, as needed). Wants to know how to find the right tool.

Kevin Hawkins: Doesn't create TEI tools but has been trying to develop the tools section of the wiki, which he encouraged people to contribute to or even restructure to better meet their needs.

Matija Ogrin: Colleague developed XSLT to render TEI editions. Later moved to using Sebastian's stylesheets and improving them. Interested in the line between rich encoding and suitable, simplified rendering.

Michael Gavin: Doesn't develop tools but a center at his university is developing a TEI tool. He'll be involved in user testing and comparing against other tools. Interested in hearing what others are working on.

Marion Lamé: Interested in digital representation of inscriptions.

Mathias Göbel: Works with a project on manuscripts and is responsible for web publication of the digital edition, including an ultra-diplomatic rendering (which uses Sebastian's stylesheets in part). Interested in data visualization in general.

Wilhelm Ott: Is a developer whose work began in 1966. Wanted to allow users to ____. His system became known as TUSTEP in 1978, with a large user base in Germany. It's known for being powerful but impossible to use. Now working on an XML interface to the system to make it easier to use (see poster presented earlier today).

António Rito Silva: Is a faculty member and developer who learned about TEI only about a year ago. He and colleagues are implementing a tool to create a social edition of a book: they want to describe fragments using TEI and allow users to build their own editions.

Frank Wiegand: Is a software developer for DTA, DWDS, and various smaller projects. He's also the corpus manager for several large corpora. His team doesn't use tools from the TEI community but needs to see if any would be appropriate.

Leif-Jöran Olsson: At Swedish Language Bank, they ___. Currently working with Dramawebben (see poster).

Marjorie Burghart: She used to try to produce tools but is now interested in pushing forward interest in tools in the TEI community. She got thinking about tools after a presentation in College Station by Ben Brumfield, in which he surveyed the "Transcription Tool Directory" and found 27 tools and 23 editors but only 7 that "support TEI". Even we in the community are producing tools, but each project has to develop something for its special needs. Users are not protesting either. I think we need to pressure tool developers into supporting TEI in order to break into the academic market. The big advantage is that the TEI will let you encode virtually anything, but the big problem is that the TEI will let you encode virtually anything (!) and do it in virtually any way. The software developer needs a processing model, not the many ways that things can be done. The danger is that if we let the software developers decide the right way to encode something, they have too much room for arbitrary decisions. We need to explain the reasons for the complexity and offer to help. After all, the TEI is so overwhelming that the developers are likely very glad for help.

Another way to help people with the TEI is to help communities meet their needs. Perhaps the TEI Cheatsheets can help with this. After all, when there's more than one way of encoding something with no advantage to one or the other, use cheatsheets to promote consistency. The cheatsheets try to help both scholars and tool producers go from basic concepts like "Index nominum" to a sample encoding in TEI.

Our relationship with SyncRO Soft has been very beneficial. How come we don't have direct export from Microsoft Word to TEI?

Discussion
Emmanuelle Morlock: There is a recent book in France about publishing using XML. There was one line about TEI, saying it's used in the academic community and that it's much too complicated. But I do really like the cheatsheets approach. Maybe we could enhance them with a way to talk to the user? We need to have some good interfaces to draw in prospective users. Perhaps the cheatsheets should include two ways of rendering the same phenomenon (and perhaps two ways to produce that rendering)?

Kevin: It's easy to add TEI support to an XML editor because the editor can have users of other XML schemas besides TEI schemas, but it's hard to produce other tools that are generic enough for everyone's needs.

Leif-Jöran: We don't have enough data. I would prefer calling cheatsheets "entry points" or something: the community is too small to benefit from a term like that. But the user community does keep growing. Hard to generalize about tools: should clarify what kinds of use cases we want to meet and even what we mean by "tool". Break things down to well-defined parts.

Marjorie: I had in mind any independent application or module -- anything that's not conceptual encoding -- which helps you work with TEI.

António: For a new project, first decide whether data should be stored in XML (hierarchical) or a database (relational). For his project, while they have been developing their prototype, people are encoding, so there's iterative development. It's good that the encoder gets to see results right away, but it's bad that he doesn't know when the work will be finished.

Wilhelm: We should think about prefabricated solutions to certain problems. TUSTEP treats data as string sequences; tree structure is part of the data. It's up to the user to figure out what kind of data will be described. This made it easier to replace SGML support with XML support. Wants elementary tools (microtools, as others have said) that can be assembled as needed.

George Bina: tei_all blew up Jing. Fixed this by increasing maximum memory (the "stack") for Java, but this allows all sorts of operations to eat up memory. He also explained that RELAX NG has a DTD-compatibility specification which defines IDs, default values, and annotations. He implemented support for default values in Jing and therefore in oXygen but is wondering whether anyone uses it, since he's unaware of any tools that use default values.

Kevin: You might ask on TEI-L.

Michael Gavin: When I see a tool that inserts default values, I go remove them manually without fixing the problem.

Leif-Jöran: Default values are an integral part of the Guidelines, so I don't think we need to ask anyone. If a user of the TEI doesn't want the default to be assumed, they should specify an alternative value.
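
For readers unfamiliar with the mechanism under discussion, here is a minimal sketch of how DTD-compatibility default values work in principle: a RELAX NG schema annotates an attribute with a:defaultValue, and a processor fills that value in when the attribute is absent from the instance. This is not how Jing or oXygen implements it. The toy schema, the element and attribute names, and the helper functions collect_defaults and apply_defaults are all assumptions made for illustration, and the sketch only handles elements and attributes named with a name attribute.

```python
import xml.etree.ElementTree as ET

RNG = "http://relaxng.org/ns/structure/1.0"
ANN = "http://relaxng.org/ns/compatibility/annotations/1.0"

# Hypothetical toy schema (not the TEI's): <div> has an optional
# "part" attribute whose default value is "N".
SCHEMA = f"""
<grammar xmlns="{RNG}" xmlns:a="{ANN}">
  <start>
    <element name="div">
      <optional>
        <attribute name="part" a:defaultValue="N"/>
      </optional>
      <text/>
    </element>
  </start>
</grammar>
"""


def collect_defaults(schema_root):
    """Return {element name: {attribute name: default value}}."""
    defaults = {}

    def walk(node, owner):
        for child in node:
            if child.tag == f"{{{RNG}}}element":
                # Attributes found below now belong to this element.
                walk(child, child.get("name"))
                continue
            if child.tag == f"{{{RNG}}}attribute" and owner is not None:
                value = child.get(f"{{{ANN}}}defaultValue")
                if value is not None:
                    defaults.setdefault(owner, {})[child.get("name")] = value
            walk(child, owner)

    walk(schema_root, None)
    return defaults


def apply_defaults(doc_root, defaults):
    """Add each default attribute value that the instance leaves out."""
    for el in doc_root.iter():
        for name, value in defaults.get(el.tag, {}).items():
            if name not in el.attrib:
                el.set(name, value)
    return doc_root


if __name__ == "__main__":
    defaults = collect_defaults(ET.fromstring(SCHEMA))
    doc = apply_defaults(ET.fromstring("<div>Some text</div>"), defaults)
    print(ET.tostring(doc, encoding="unicode"))
```

An explicitly specified value is left untouched, which matches the point above that a user who does not want the default should specify an alternative.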

Marion asked whether anyone has used TILE since it appears dead to her. Leif-Jöran said that there are a lot of users and active development. Kevin offered to introduce her to Dot and John.

Michael: Wasn't there a poster today on a similar application? Marjorie said she thinks the presenter might have been from Germany.

Marion: Are there any tools that support non-rectangular regions? Kevin noted that the TEI recently added support for non-rectangular regions, but he didn't know of any software implementations.

Mathias: We've been talking mostly about tools useful during the encoding process. But what about tools for project management, project planning, and for publication? Scholars want to see immediate rendering of their encoding. He, for example, used jQuery to add a button that highlights the content of certain elements: while it seemed trivial to him, the users highly valued it.

Kevin: Commercial developers don't have a large enough user base. But for those in the academy, do we need incentives for tool developers to share what they've done?

Leif-Jöran: Need to put in a lot of work to clean up the code and make it usable by others. Kevin: Maybe just encourage people to share what they have. Michael: But then there are even more tools that are unworkable for you.

Michael: Scholars are encouraged to move on to new projects rather than continue to improve existing ones. Developing tools doesn't fit the typical incentives in the academy.

Marjorie: I'm seeing changes in the academy in terms of incentives for tool developers, but we are still keeping things within just the TEI circles. What about BaseX and eXist: can we get the developers to add some default TEI support?

Leif-Jöran: I am from eXist and am hearing these things.

Emmanuelle: How do you react when Kevin says there's a market problem? Leif-Jöran: I think there is a market.

Emmanuelle: I agree that we should try to convince developers that the TEI market, while small, will grow in the future.

Wilhelm: When we speak of sharing, are we talking about workflows for using tools or the tools themselves? Sharing solutions makes sense for similar, compatible projects. [. . .] Good to parameterize tools so users can adapt to their specific needs.

Frank: I'm a developer, not a marketing guy. Maybe we need marketing guys in our community to convince outside developers and prospective users to use or accommodate TEI. Also, he noted that the TEI community doesn't often share the TEI encoding, whereas researchers in the natural sciences customarily provide the raw data for their research.

Leif-Jöran: We should prioritize what we want and then define the top-priority things. That makes it much easier for someone to take on the challenge, even without funding. For example, what does "support TEI" mean?

Marjorie: Defining "support" is indeed hard. At what level do you need to support TEI versus generic XML?

Course of action?
Anyway, can we agree on any course of action? For instance, that we need better outreach beyond the TEI community?

Michael: What if jTEI reviewed tools? That might address the incentive gap for producers of tools to share what they make and maintain contact with prospective users.

Marjorie asked Leif-Jöran why he wanted a prioritization. Leif-Jöran: It will scare them away. Michael: So you want a document? Leif-Jöran: Yes.

Marion: If the TEI dropped the XML integration and was just a conceptual model, would that help with implementation? Kevin: Like using JSON? Marion: Or full implementation in standoff? António: My off-the-cuff view is that when you base everything on XML, you tie the representation to a format that a human can read easily. But when you want to separate representation of data and presentation, use of XML for both muddles this. So we use TEI for encoding by humans and then generate a model readable only by a computer to produce the rendering. Still, I think most tools use XML.
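
The workflow described in the last exchange (humans encode in TEI, and software derives a separate machine-oriented model that drives the rendering) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the design of António's project: the sample fragment, the choice of div as the unit, and the function build_model are all hypothetical.

```python
import json
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical human-encoded TEI fragment.
SAMPLE = f"""
<text xmlns="{TEI_NS}">
  <body>
    <div xml:id="frag1"><p>First fragment.</p></div>
    <div xml:id="frag2"><p>Second <hi>fragment</hi>.</p></div>
  </body>
</text>
"""


def build_model(tei_source):
    """Derive a flat, machine-oriented model (a list of dicts) from TEI."""
    root = ET.fromstring(tei_source)
    model = []
    for div in root.iter(f"{{{TEI_NS}}}div"):
        # Flatten the markup inside each div and normalize whitespace.
        text = " ".join("".join(div.itertext()).split())
        model.append({"id": div.get(XML_ID), "text": text})
    return model


if __name__ == "__main__":
    # A renderer would consume this model rather than the TEI itself.
    print(json.dumps(build_model(SAMPLE), indent=2))
```

The derived model deliberately discards inline markup such as hi; a real pipeline would decide which distinctions from the encoding the rendering needs to keep.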