DH2014Hackathon-Projects

DH2014 Hackathon Project Discussion Page

Let's Hack

A suggestion for how to proceed based on the range of proposals and the range of participants. The group is likely to be able to sustain 2 or at most 3 projects in the time frame allotted. Some suggestions; please modify, develop or otherwise shape to your liking!

Suggested setup using Git

There is a Git repository at https://github.com/TEIC/Hackathon which can be used by any group during the Hackathon. Sebastian Rahtz can add people to the TEI group on request so that they have write access. You can use any Git client, obviously. If you don't have a client set up already and are on a Mac, you'll find https://mac.github.com/ easy to set up and use. If you possibly can, get your account and setup ready before the meeting.

This repository is synchronized with a web server at http://tei.it.ox.ac.uk/Hackathon/ every minute. This web server is configured for CORS, which should allow cross-origin requests to work for any group working with JSON or other Ajax techniques.

To test this, clone the Git repository on your local laptop and open the file map/maptest.html in your browser. It should read JSON data from http://tei.it.ox.ac.uk/Hackathon/map/objects.json and display it on a map. If you add more JSON data files to the repository, and push them up to Github, they should appear on the web server quite quickly; similarly this can be used to "publish" HTML.

Sebastian will be available during the Hackathon to wrangle the web server as needed.

Google Doc for the Extraction & Query Section

https://docs.google.com/document/d/1W-f2jdQInxA3IKzSH_hn6sBD68lZcBTYZzWdBVjjSoo/edit?usp=sharing

Link to the RESTful interface of the German inscription database DIO:
http://www.inschriften.net/rest/di[CALATOGUENR.]/articles/di[CALATOGUENR.]-[ARTICLENR.]
Example: http://www.inschriften.net/rest/di060/articles/di060-0027
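
A minimal sketch of how such an article URL could be assembled in a script. The zero-padding widths (three digits for the catalogue number, four for the article number) are only inferred from the single example di060-0027 and may not hold for every catalogue; the function name is hypothetical.

```python
def dio_article_url(catalogue_nr: int, article_nr: int) -> str:
    """Build a DIO REST URL for one article.

    Assumption: catalogue numbers are padded to 3 digits and article
    numbers to 4 digits, as in the example di060-0027.
    """
    catalogue = f"di{catalogue_nr:03d}"
    return (
        "http://www.inschriften.net/rest/"
        f"{catalogue}/articles/{catalogue}-{article_nr:04d}"
    )


# Rebuilds the example URL given above.
print(dio_article_url(60, 27))
```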

ODD visualizer

Some notes at: http://tinyurl.com/dh2014-TEIODD

A group might try to put together a plan and start to code up an ODD visualizer. ODD is a literate programming language for XML schemas developed by the TEI, which is used to define the TEI schema. An ODD visualizer might be a component of an ODD customization tool such as the successor of Roma, and could provide information to a user about what modules they have included in their TEI schema, or what modifications they have made.

The group should start by discussing what to visualize, perhaps including some TEI users who have experience in customization.

Resources: About ODD; Getting Started with P5 ODDs; Guidelines Chapter 23: Using the TEI; and a look at Roma, the tool that is currently in use for generating TEI customizations.

--Elli Bleeker 16:05, 4 July 2014 (CEST) I know of some digital edition projects that are a priori TEI compliant but have added their own customizations if the TEI Guidelines did not offer a sufficient solution for the "challenge" posed by the text. I can imagine there are more projects that do so. Is that what is meant by "modification" of the TEI schema? It seems the ODD visualizer might be a good addition to the authorial workflow as described below, if it offers the editors insight into their specific text encoding methods.

Elena Spadini 12:13, 5 July 2014 (CEST) (I'm not sure this is the right place for adding names to a project as Elli wrote us, but) I would like to work on it!

Raffaele.viglianti 13:40, 6 July 2014 (CEST) I'd happily work on this

--Rahtz 17:50, 6 July 2014 (CEST) Elli - a modification is often simply specifying a subset of the TEI, i.e. removing elements you don't need. Visualizing just this relationship is itself quite useful; the proof of concept at http://tei.it.ox.ac.uk/Byzantium/ demonstrates a trivial example. In an ideal alternate world, the visualization would be dynamic, i.e. it would show a list of modules, and you could click on one to remove the elements in it, and that would trigger writing an ODD file.
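
To make the "click to remove elements, then write an ODD file" idea more concrete, here is a rough sketch of the writing step only. It assumes the user's choices arrive as a mapping from module name to the elements to drop, and it emits a schemaSpec with one moduleRef per module, using the except attribute for removed elements. The function name, the schema identifier and the chosen modules are illustrative, and a complete ODD would wrap this fragment in a full TEI document.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
# Serialize TEI elements in the default namespace (no prefix).
ET.register_namespace("", TEI_NS)


def build_schema_spec(ident, modules):
    """Return a <schemaSpec> element for a TEI subset.

    modules maps a TEI module name to a list of element names that
    should be removed from that module (an empty list keeps everything).
    """
    spec = ET.Element(f"{{{TEI_NS}}}schemaSpec", {"ident": ident, "start": "TEI"})
    for key, removed in modules.items():
        ref = ET.SubElement(spec, f"{{{TEI_NS}}}moduleRef", {"key": key})
        if removed:
            # moduleRef/@except lists the elements to leave out.
            ref.set("except", " ".join(removed))
    return spec


# Illustrative selection: keep four modules, drop two elements from core.
spec = build_schema_spec(
    "hackathon",
    {"tei": [], "header": [], "core": ["said", "sp"], "textstructure": []},
)
odd_fragment = ET.tostring(spec, encoding="unicode")
print(odd_fragment)
```

The same mapping could also drive the visualization itself, so that the picture and the generated ODD never disagree.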

--Elli Bleeker 22:28, 6 July 2014 (CEST) Thanks Sebastian for the clarification. I would like to join this group too.

Basic framework for rendering a document and creating simple visualizations

This project could be quite extensive, so the group should begin by selecting components that they can manage.

The proposed outcome is to be able to display (or at least mine) a document for salient features, and to produce output generic enough that it can be passed to either mapping or to other visualization software, or perhaps to the NY Times PourOver JS library. One possible way to do this is to decide on a JSON data structure, as that is a popular format for visualization libraries.

--Frederike Neuber 08:08, 3 July 2014 (CEST) Are you referring to a particular visualization software on which it might be useful to take a quick look before the hackathon?

--Elli Mylonas 11:27, 4 July 2014 (CEST) Frederike has a good comment - are there suggestions? Google Maps is one idea. Leaflet, and D3 (more complex) are others. Thoughts?

--Rahtz 17:02, 6 July 2014 (CEST) I would suggest we limit ourselves for this project to Leaflet (mapping) and D3 (graphs) with its companion C3. Both have loads of examples and addons. Leaflet rather than Google Maps because it doesn't need API keys etc, and can use other tiles easily. The key thing here is the proof of concept: get a visualization up and running quickly, then tweak it. If it's successful, recoding it later in another library is not so hard. Note, however, that we come up against cross-origin restrictions when we read JSON in an HTML file. See the Setup section above.

I like the idea that we take some texts and a) isolate a set of features that can be extracted for looking at, b) work out what they look like in JSON, and c) practice visualizing them in some way. I have added an example XSLT script at https://github.com/TEIC/Hackathon/blob/master/visualization/teitojson.xsl which shows how you'd grab (in this case persName) elements from a TEI XML file and write a JSON file. Demo result at http://tei.it.ox.ac.uk/Hackathon/visualization/test.json
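
For those more comfortable with Python than XSLT, the same extract-and-serialize idea might look roughly like the sketch below. This is not the teitojson.xsl script from the repository; the JSON layout (a "persNames" list with "name" and "ref" keys) and the sample document are assumptions made for illustration.

```python
import json
import xml.etree.ElementTree as ET

TEI = {"tei": "http://www.tei-c.org/ns/1.0"}


def pers_names_to_json(tei_xml: str) -> str:
    """Collect every persName in a TEI document and return it as JSON."""
    root = ET.fromstring(tei_xml)
    names = []
    for el in root.iterfind(".//tei:persName", TEI):
        # Join nested text and collapse whitespace left by pretty-printing.
        text = " ".join("".join(el.itertext()).split())
        names.append({"name": text, "ref": el.get("ref")})
    return json.dumps({"persNames": names}, ensure_ascii=False, indent=2)


# Invented sample input; a real run would read a TEI file from disk.
sample = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body>
<p><persName ref="#sb">Schmuel ben Jehuda</persName> is named in the inscription.</p>
</body></text></TEI>"""

print(pers_names_to_json(sample))
```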

Develop the MS description framework

This was one of the proposed projects that got a lot of comments. It could be developed along the lines of the comments, and generalized a bit.

Alexander Czmiel 14:20, 4 July 2014 (CEST) Maybe I don't get it, but it seems to me this project has the same aims as "Basic framework for rendering a document". Anyway, rendering TEI documents is an important issue which has not been addressed enough in the past, especially when looking at complex markup such as that recommended by the Genetic Markup SIG. There are already a few approaches which try to do exactly what we want to do here: build a generic framework to publish TEI documents online. There is eLaborate (http://elaborate.huygens.knaw.nl/) from Den Haag, xMod (http://www.cch.kcl.ac.uk/xmod/) from London (the latest installment of xMod is called "Kiln" and is more sophisticated and multiplatform; the code is here: https://github.com/kcl-ddh/kiln --Raffaele.viglianti 09:33, 7 July 2014 (CEST)), DENQ from Rome, and SADE (best/current version on GitHub by vronk: https://github.com/vronk/SADE), which we are trying to build in Berlin/Göttingen/Wien. To a certain extent OMEKA (http://www.omeka.net/) could be used as well, I think. There are many more that I don't know of. So please let us not start from scratch with a new project.

--Elli Bleeker 14:56, 4 July 2014 (CEST) I agree it would be pointless to start from scratch. But it might be worth considering to further develop one of the abovementioned approaches. I am not familiar with all of them, but most tend to fail once the textual genetics becomes more complex. The fact that many of the proposed projects in this Hackathon are aimed at the rendering of complex TEI mark up, suggests that this is one of the most important needs of textual scholars and that there is still no sufficient way to do so.

--Frederike Neuber 19:27, 4 July 2014 (CEST) It seems we all have different understandings of the project descriptions - I will tell you mine too:
In my opinion the two projects aim at different results, even if the workflow will be partly the same.
The "rendering complex markup.." aims to offer a creative visualization of the extracted data, so this project will be about including visualization software which supports data visualization beyond lists and timelines.
The "MS description.." aims (or at least it did in the beginning) to create a framework which is applicable to each TEI document containing a <msDesc> part and includes both displaying the <msDesc> content (in a linear way, no tricky visualization) and search and browse functions. The scope of this project is to provide in the end a set of files which anyone can use to render a <msDesc> part.
So far.. can somebody bring light into the darkness please? :)

Patrik Granholm 23:35, 4 July 2014 (CEST) My initial proposal was to create a simple search and browsing interface for MS descriptions which would be general enough to be used by other similar projects. But perhaps it is too specific. An interface for TEI-files in general would probably be more useful. The idea is that everything should be well documented so that people with limited programming skills would be able to modify the code to suit their needs. It would be more like an eXist template than a finished product.

Elena Spadini 12:08, 5 July 2014 (CEST) I don't think that it is too specific, and an interface for TEI files in general is probably too wide for one day. For sure there are a lot of projects flourishing in this way (not eLaborate, which is to a certain extent TEI compliant but not at all TEI based), as everyone using TEI deals with these problems and finds more or less home-made solutions. But if during the Hackathon something can be started and finished, it would be a really good result!

--Frederike Neuber 15:55, 5 July 2014 (CEST) Maybe we could decide whether we want to render the metadata or the text. I think some templates for rendering the <teiHeader> with focusing on search options (searching for certain fields, combining them, faceted search) would be nice.
Can someone provide some XML files to work with?

Patrik Granholm 08:55, 6 July 2014 (CEST) I have a few preliminary MS descriptions from our cataloguing project, and quite a large collection of MS descriptions from other projects which we could work with.

--Rahtz 19:02, 6 July 2014 (CEST) I agree that starting a new framework isn't a good idea, but I do think there is a future in taking just the components of the transcription module and doing rendering of those. It's a horribly open-ended exercise, though.

Discussing, working out and documenting best practices for an authoring workflow

Projects that last for many years, have many people working on them, and comprise large amounts of material need to ensure that work is being done efficiently, produce accurate output, and move files through transformations and validations. Good workflows can be enhanced by both tools and best practices. The outcome of this project could be a document with explanation and examples of authoring/encoding/proofreading/version control workflows with a focus on software tools.

--Elli Bleeker 15:22, 4 July 2014 (CEST) Would we aim at enhancing the workflow of existing projects or suggest best practices for future projects? It would be very interesting to see whether it is possible to distinguish a best practice for an authoring workflow. Normally projects that run for many years and started "back in the days" have developed their own ad hoc workflow. They happily admit that they would have done things differently with the knowledge of today. Of course the workflow depends upon the people working on it (trained editors, volunteers through crowd-sourcing...). Perhaps the outcome would be several different workflows, depending on the type of material, the objectives of the project and the people collaborating on it.

Participants

  • Raffaele Viglianti, University of Maryland
  • Patrik Granholm, Uppsala University Library: Greek Manuscripts in Sweden. A Digitization and Cataloguing Project
  • Frederike Neuber, DiXiT - research fellow at Graz University - Austrian Centre for Digital Humanities
  • Felix Lange, Academy of Sciences and Literature | Mainz, Project IBR (http://www.spatialhumanities.de)
  • Emmanuelle Morlock, CNRS, HISoMA Laboratory (Histoire et Sources des mondes antiques / History and Origins of the Antique World)
  • Elli Bleeker, PhD student in Digital Humanities at Antwerp University; research fellow at DiXiT
  • Magdalena Turska, University of Oxford
  • Nick Laiacona, Performant Software Solutions LLC www.performantsoftware.com
  • Elizabeth Maddock Dillon, Professor of English and Co-Director of NULab for Texts, Maps, and Networks, Northeastern University, Boston, MA USA
  • Thomas Kollatz, Steinheim-Institute for German-Jewish History: epidat
  • Elena Spadini, Huygens Ing


  • Sebastian Rahtz, University of Oxford
  • James Cummings, University of Oxford
  • Alex Czmiel, Berlin-Brandenburg Academy of Sciences and Humanities
  • Hugh Cayless, Duke University
  • Elli Mylonas, Brown University

Procedure

We will be using the wiki as a place for comments and discussion. The projects each participant proposed are all listed below. Ideally, the hackathon will be focussed around one or two projects that are useful to everyone. The participants are all experienced with TEI in various capacities, but they are not all skilled programmers. The hackathon would be a success if the outcome of the one-day event was a good start on one or two useful pieces of TEI-related software. In order to achieve this, we should collectively decide on projects that are interesting, useful, generalizable, and do-able! We should also try to capitalize on everyone's skill set, and try to plan the event in such a way that we can all contribute to something.

You are all welcome to introduce yourselves, by adding a description of a few sentences to the list of participants above. And please, edit, correct and comment on the projects. We are also inviting the TEI Council and Board and other interested parties to look in on the discussions.

Proposed Projects

Mormon City Planning:

When early Mormonism ventured from Kirtland, Ohio, into Missouri, their prophet Joseph Smith Jr provided a revelation for the plat (city layout) of the new Zion (http://urbanplanning.library.cornell.edu/DOCS/smith.htm; http://zomarah.files.wordpress.com/2010/10/firstplatofzion.png), which was subsequently applied in the settlement of Far West, Missouri, for which two plats exist, one on sheepskin (https://www.lds.org/bc/content/shared/content/images/gospel-library/manual/32502/15-02 b.gif) in private possession and one on paper at BYU University (there is no online copy of this, but I have obtained a digital copy from BYU with their permission). However, these plats were not only used to sketch out the city, but also to assign lots to settlers, and thus show secondary markup in pencil to allocate houses, redraw lot boundaries, etc. (http://zomarah.files.wordpress.com/2010/10/platoffarwestbig.png)

Comments:

ODD Customization visualizer

A web-based tool to visualize any ODD customization against TEI-all. This could be useful when working on a customization (with Roma, or manually) to quickly and visually check that the ODD is still TEI-conformant and see how it diverges from the standard. I started working on a basic D3 visualization a couple of years ago, but haven't really touched the code since: https://github.com/raffazizzi/ODDViz I only restructured the repository a bit before sending this proposal.

Comments A simple demo: https://dl.dropboxusercontent.com/u/2443674/ODDViz/index.html --Raffaele.viglianti 03:02, 3 July 2014 (CEST)

ACE-based Web editor

MITH, at the University of Maryland has been working on an ACE-based web editor able to validate tei-all files in the browser and provide ODD-based contextual help (e.g. suggesting valid options when entering a new element). The grant that funded this work is now over, but there's plenty more to do. This is the GitHub repository: https://github.com/umd-mith/angles

Comments:


MS Description Display Framework

To create a simple web interface which would provide basic functionalities like browsing, searching and displaying TEI files containing manuscript descriptions. The interface could be built using an Ubuntu server with nginx and eXist-db, following the setup guide and scripts provided by Grant Macken. I have already done some preliminary work on an eXist-db web interface and posted the code in our GitHub repository. This could be used as a starting point for further development. A possible goal for the hackathon could be to create an advanced search form in XQuery which would display snippets of the descriptions in the search results using the transform function with our XSL stylesheet, with links to the full description. All of the code, with detailed documentation, would be made public on GitHub for others to reuse and modify for their own projects. I think this would be beneficial to the TEI community at large, and especially to other cataloguing projects.

Comments:

  • I believe this would be a really useful project, especially if we build it not only with this particular purpose of MS descriptions in mind, but as something more general, so other users could potentially swap only source files and stylesheets and would have a basic working website. Magdalena Turska 11:00, 23 June 2014 (CEST)
  • me too – more general features almost every project will need after swapping source files and stylesheets: navigation to next page/object; or: navigation to next entry/object in chronological order; full text search everywhere or in particular divs (type="edition") etc. Thomas Kollatz 12:09, 1 July 2014 (CEST)
  • I am definitely open to creating a more general display framework in eXist. Perhaps we could make use of Joe Wicentowski’s Punch eXist tutorial from DHOxSS 2011. Patrik Granholm 11:16, 2 July 2014 (CEST)

Medieval Text Edition

The project isn't advanced enough to suggest a concrete task, but the topic of interest is focussed around a digital scholarly edition of a medieval text (with corresponding images) encoded in XML/TEI. The edition will be enriched with palaeographical and codicological information (in both <text> and <teiHeader>) and the results can be visualized in a way which hopefully goes beyond the sometimes not very enlightening listing of data.

Comments: Even now my project is not advanced enough to be used for the hackathon. I will join the discussions on the other projects. --Frederike Neuber 22:05, 24 June 2014 (CEST)

Semantic Connections for Epigraphic Documents

I am currently working with epidoc documents in the context of the Project IBR. These documents, epigraphical editions from the catalogue "German Inscriptions Online" (inschriften.net), are (1) to be transformed into RDF triples for semantic connection and for a fine-grained quantificational analysis in a Triple Store, (2) XSLT-transformed into HTML documents for further annotation in the semantic annotator "Pundit" (thepund.it). I could surely contribute a programming task based on these points, but would also be happy to participate in another task, preferably based on "adding a TEI mode to a web editor".

Comments:

--Frederike Neuber 23:02, 24 June 2014 (CEST) Who is responsible for this project and can provide further information? I would be very interested in joining this project even if I can't offer material and even if I do not understand one step: how can you transform an epidoc document into RDF triples? Do you mean you generate RDF from the annotated entities (extracting them?)? And how do you do it, are there tools or software to support it?

Thomas Kollatz 12:12, 1 July 2014 (CEST) The oxgarage magic box contains already a tool converting TEI2RDF, if so – we should try it out and evaluate it if not - let's develop it …

Felix Lange 14:56, 4 July 2014 (CEST) Thomas Kollatz is right, this tool is quite mature by now, making a new development from scratch pointless. But is there a programmatic (e.g. REST) interface to the converter? If not, let's do it! @Frederike, some further project information can be found at www.spatialhumanities.de/ibr, our project website.

Zotero

Some TEI encoding projects may need to use a "master bibliography" to group together all the bibliographic references that are used in a given text or collection of texts. Using the "masterfile" option in Oxygen gives the user a convenient way of pointing to a reference without having to encode a ref each time it is used in a text or in a specific bibliographic section.

<bibl type="fromStone">
   <ptr target="#Breal1878"/>
   <citedRange>15</citedRange>
</bibl>
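
Whatever the workflow ends up being, it is easy for such pointers to drift out of step with the master bibliography. A small check like the sketch below could be run before committing; it assumes the master file carries an xml:id on each entry and that pointers are local (#id) references. The function name and the sample data are invented for illustration.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def unresolved_pointers(text_xml: str, master_xml: str) -> list:
    """Return ptr targets in text_xml that have no xml:id in master_xml."""
    known = {
        el.get(XML_ID)
        for el in ET.fromstring(master_xml).iter()
        if el.get(XML_ID)
    }
    missing = []
    for el in ET.fromstring(text_xml).iter():
        # Compare on the local name so this works with or without a namespace.
        if el.tag.rsplit("}", 1)[-1] == "ptr":
            target = el.get("target", "")
            if target.startswith("#") and target[1:] not in known:
                missing.append(target)
    return missing


# Invented sample data: one pointer resolves, one does not.
master = '<listBibl><bibl xml:id="Breal1878">Bréal 1878</bibl></listBibl>'
text = (
    '<div><bibl type="fromStone"><ptr target="#Breal1878"/>'
    "<citedRange>15</citedRange></bibl>"
    '<bibl><ptr target="#Unknown1900"/></bibl></div>'
)
print(unresolved_pointers(text, master))
```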

The bibliography can be inserted in a <back/> element with XInclude. But the question is how to use Zotero to create this file, update it and synchronize it with the bibliographic masterfile. Though incomplete, the workflow I use works like this:

  • the bibliographic entries are entered with Zotero, in a group library
  • the Zotero database is then exported as XML using the "bibliontology_rdf" format (the TEI export format being too restrictive in its formatting choices: e.g. it doesn't include "short titles", which are a requisite in epigraphy)
  • the bibliontology RDF is converted to TEI through XSLT in Oxygen (modifying an existing XSLT: https://github.com/paregorios/Zotero-RDF-to-TEI-XML)
  • if the user wants to add a new reference to the master bibliography or correct an entry, he or she has to go to Zotero and follow the whole export and transform workflow. But the user might not have the rights to do so. And of course, there are some potential conflicting issues.

How can that workflow be improved? Use a version control system like Git or Subversion? Or add/modify an entry in XML/TEI and then update the Zotero group library via the API?

Comments: --Elli Bleeker 15:29, 1 July 2014 (CEST) It seems that working with Subversion would improve the workflow a lot and at least eliminate the risk of conflicting issues. Of course users should have the right to work in the (separate) XML file of the bibliographic masterfile and add/change references. Once completed, the XML file can be exported to Zotero. In short, this would mean a reversal of the current workflow.

Migrating value lists from CSS frameworks to ODD

The author mode of the Oxygen editor can be customized with custom CSS functions. Apart from the function providing a more user-friendly and tag-free visualization of the TEI content, one of the functions allows the user to edit attribute values or simple element values using combo boxes or check boxes. The "form controls" can display values collected from an XML schema. But these values can also live only in the Oxygen CSS. This may be used in a workflow to test some choices before integrating them in a more consistent and persistent way in the ODD (then exported to the schema).

This could be useful in case you have users who may be OK with changing the values in the CSS code but would be reluctant to get involved with the ODD editing / schema generation process. cf. http://www.oxygenxml.com/doc/ug-editor/concepts/combo-box-editor.html#combo-box-editor

Comments:

--Raffaele.viglianti 19:20, 16 June 2014 (CEST) This is interesting, would like to work on it. I would suggest to make it less tied to Oxygen and think in terms of CSS to ODD, for example to limit attribute values (e.g. hi[rend=italic])
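
As a starting point for the CSS-to-ODD direction, the sketch below harvests attribute selectors such as hi[rend=italic] from a stylesheet and turns them into closed value lists. It deliberately ignores the Oxygen-specific form-control functions, handles only simple element[attribute=value] selectors, and emits one elementSpec per element/attribute pair without a namespace; all names and the sample CSS are illustrative.

```python
import re
from collections import defaultdict
import xml.etree.ElementTree as ET

# Matches simple selectors such as hi[rend=italic] or hi[rend="bold"].
SELECTOR = re.compile(r"""(\w+)\[(\w+)\s*=\s*["']?([\w-]+)["']?\]""")


def css_to_element_specs(css: str) -> list:
    """Turn attribute selectors found in css into ODD elementSpec strings."""
    values = defaultdict(list)
    for element, attribute, value in SELECTOR.findall(css):
        if value not in values[(element, attribute)]:
            values[(element, attribute)].append(value)

    specs = []
    for (element, attribute), idents in values.items():
        spec = ET.Element("elementSpec", {"ident": element, "mode": "change"})
        att_list = ET.SubElement(spec, "attList")
        att_def = ET.SubElement(
            att_list, "attDef", {"ident": attribute, "mode": "change"}
        )
        # A closed valList restricts the attribute to the listed values.
        val_list = ET.SubElement(
            att_def, "valList", {"type": "closed", "mode": "replace"}
        )
        for ident in idents:
            ET.SubElement(val_list, "valItem", {"ident": ident})
        specs.append(ET.tostring(spec, encoding="unicode"))
    return specs


# Invented sample stylesheet.
sample_css = """
hi[rend=italic] { font-style: italic; }
hi[rend="bold"] { font-weight: bold; }
"""
for fragment in css_to_element_specs(sample_css):
    print(fragment)
```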

Elena Spadini 00:13, 24 June 2014 (CEST) I'm not very experienced with the author mode in oXygen and its customizations, but I would like to explore it by working on this project.

Visualization of intertextual TEI content

[UPDATED on June 24th]

The focus is on the possibilities offered by TEI XML for encoding intertextuality. The case study concerns the personal library of an author and a digital edition of his work. The objective is to encode the material in such a way that the intertextual relations between the literary work and its external sources are visualized.

These -sometimes subtle- intertextual relationships provide insight into the nature of writing, all the more since the genesis of the literary work itself is already encoded in the edition (i.e. adds, dels, etc.). The envisioned result enables a user to study the writing process in detail as well as on a larger scale.

Material:

  • detailed XML TEI transcription of the literary work concerned
  • high quality digital facsimiles of the author's personal extant library
  • rough transcriptions of the library books (based on OCR)


Comments:

  • Clarification: is the project about the visualization? the relationships? More detail would be great.

--Elli Bleeker 15:46, 1 July 2014 (CEST) The idea is to log the complete "path" of a citation: from a phrase underlined in a library book to the incorporation of that phrase in the author's work. Currently, the intertextual references are encoded with [ref] tag in the XML TEI transcription of the literary work. The [ref]s refer to another xml file containing the transcriptions of the personal library. This file consists of [div]s, that contain anything from a complete library book to a small section (paragraph, phrase) from a library book.

The visualization of the intertextual relations comes in a later stage when transforming the documents. I wonder whether it is possible to change or improve this encoding.


Rendering Complex Markup

Rendering complex markup in an innovative and playful way: reconstructing & visualizing author’s geographical position by the dates of sending of his letters (taking uncertainty into account)

Comments:

Magdalena Turska 12:36, 12 June 2014 (CEST)

Starting with a list of letters that have a place and date of sending, the idea would be to present a map overview of the author's journeys. See the simple Google Map at https://mapsengine.google.com/map/edit?mid=z3q4AefiR7Us.kf1fVBHlu8qE

First ideas re visualisation:

  • places color-coded with color getting darker to represent 'later' places
  • similarly shaded lines along the routes between places
  • animation with slider to show circle moving along the routes, size of the circle getting bigger with the uncertainty

Input data: each letter has an assigned place name and 1+ time intervals (notBefore to notAfter)
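
One possible way to turn such an interval into the "bigger circle for more uncertainty" idea is sketched below. The linear scaling and the default sizes are arbitrary assumptions that would need tuning against real data, and the function names are hypothetical.

```python
from datetime import date


def uncertainty_days(not_before: str, not_after: str) -> int:
    """Length of a notBefore/notAfter interval in days (ISO dates)."""
    return (date.fromisoformat(not_after) - date.fromisoformat(not_before)).days


def circle_radius(not_before, not_after, base=4.0, per_day=0.5, maximum=40.0):
    """Map dating uncertainty to a marker radius in pixels.

    Assumption: the radius grows linearly with the interval length and is
    capped so that very vague dates do not swamp the map.
    """
    return min(maximum, base + per_day * uncertainty_days(not_before, not_after))


# A letter dated to a ten-day window versus one with an exact date.
print(circle_radius("1530-05-01", "1530-05-11"))
print(circle_radius("1530-05-01", "1530-05-01"))
```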

--Frederike Neuber 22:25, 24 June 2014 (CEST) Would it be a problem to work on different material? @TEI-experts I would like to visualize the provenance of medieval manuscripts on a map, so even if I do not work with letters, the task is partly the same (extracting date and time from the <teiHeader> and visualizing it on a map). Then our scopes separate: Magdalena wants to reconstruct the journey, I want to visualize the chronology of different mss on one map.

Thomas Kollatz 15:56, 1 July 2014 (CEST) I agree with Frederike Neuber: the task is the same in all projects, where a spatio-temporal visualization makes sense: Extract date and time from header - and visualize the chronology of date/time and place on a map and/or timeline /TEI/teiHeader/fileDesc/sourceDesc/msDesc/history/origin

Adding a TEI mode to a web editor

Title says it all!

Comments:

--Raffaele.viglianti 19:20, 16 June 2014 (CEST) How does the Angles proposal above sound?

Juxta Script support

Adding support for Bengali text to Juxta Commons

Comments:

  • This is fairly narrow.

Thomas Kollatz 15:57, 1 July 2014 (CEST) If Bengali then Hebrew (Arabic … right-to-left), please

Visualization

The archive of early Caribbean texts and images is a fairly large project which will be heavily encoded (and a portion of texts will be encoded by July). I'd like to explore ways of using the TEI to visualize relations between and among texts and elements of texts.

Comments:

--Elli Bleeker 16:19, 1 July 2014 (CEST) What is the status of this project? I am also interested in finding ways to visualize relations between texts, perhaps there are similarities?

Jewish Sepulchral Headstones DB

Given that all headstones can be clearly located, and about 20.000 (of total 26.000) inscriptions are dated, and to a large extent can be distinguished by gender, and also by language usage (Hebrew, German, German in Hebrew letters, …), they lend themselves to mining the epidat corpus for specific data facets and trying to visualize the results "in an innovative and playful way".

  1. A challenging research question could be the search for and visualization of the differences in word usage and specific idioms
    1. between different locations,
    2. within different periods (based on one or several locations),
    3. between inscriptions for men and inscriptions for women,
    4. between hebrew and non-hebrew inscriptions …
  2. Even more challenging is the search for inscriptions with very similar text coverage. These are quite difficult to discover by a simple full text search. The inscriptions are usually rather short, and the filter to develop has to consider the differentiating elements – usually names and dates – in order to determine whether two texts are identical or not.
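
As a first, very rough approximation of such a filter, names and numbers can be masked before two inscriptions are compared. The sketch below uses Python's difflib on invented English stand-in texts; real inscriptions are mostly Hebrew, dates are not necessarily written as digits, and the names would more sensibly be taken from the persName markup than passed in by hand.

```python
import difflib
import re


def normalize(text: str, names=()) -> str:
    """Mask the differentiating parts (names, numbers) of an inscription."""
    for name in names:
        text = text.replace(name, "NAME")
    text = re.sub(r"\d+", "NUM", text)
    # Lower-case and collapse whitespace so layout differences do not count.
    return " ".join(text.lower().split())


def similarity(a: str, b: str, names=()) -> float:
    """Similarity ratio between 0 and 1 after masking names and numbers."""
    return difflib.SequenceMatcher(
        None, normalize(a, names), normalize(b, names)
    ).ratio()


# Invented stand-in texts, not real epidat records.
first = "Here rests Schmuel ben Jehuda, died 1621"
second = "Here rests Mosche ben David, died 1640"
known_names = ["Schmuel ben Jehuda", "Mosche ben David"]

print(similarity(first, second))               # raw comparison
print(similarity(first, second, known_names))  # names and years masked
```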

All necessary information to answer these questions is contained in the metadata <teiHeader> and data <text> of each single record.

example – spatial and temporal metadata (date, country/region code, geo-coordinates, Thesaurus of Getty Names or OSM ID)

    <history>
     <origin>
      <date notBefore='1621-08-17'>1621-08-17</date>
      <country type="ISO_3166" key="XA-DE-HH">
       Germany
       <region>Hamburg</region>
      </country>
      <settlement type='city' key='tgn:7012310'>
       Hamburg-Altona, Königstraße 
       <geogName>
        Jüdischer Friedhof
        <geo decls="#WGS">53.549373 9.950545</geo>
       </geogName>
      </settlement>
     </origin>
    </history>
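
A sketch of how a fragment like this could be turned into a GeoJSON point for a mapping library. It assumes the geo element holds latitude followed by longitude and, for brevity, that the fragment has no namespace (in a full TEI file the elements would be in the TEI namespace). The property names in the output are invented.

```python
import xml.etree.ElementTree as ET


def origin_to_feature(history_xml: str) -> dict:
    """Convert a <history>/<origin> fragment to a GeoJSON Feature."""
    root = ET.fromstring(history_xml)
    date_el = root.find(".//date")
    geo_el = root.find(".//geo")
    # Assumption: <geo> is "latitude longitude"; GeoJSON wants [lon, lat].
    lat, lon = (float(value) for value in geo_el.text.split())
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {
            "notBefore": date_el.get("notBefore"),
            "region": root.findtext(".//region"),
        },
    }


# Shortened copy of the record above.
fragment = """<history><origin>
  <date notBefore='1621-08-17'>1621-08-17</date>
  <country type="ISO_3166" key="XA-DE-HH">Germany<region>Hamburg</region></country>
  <settlement type='city' key='tgn:7012310'>Hamburg-Altona, Königstraße
    <geogName>Jüdischer Friedhof<geo decls="#WGS">53.549373 9.950545</geo></geogName>
  </settlement>
</origin></history>"""

print(origin_to_feature(fragment))
```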

example – gender-specific metadata (given according to ISO 5218:2004)

  <particDesc>
   <listPerson>
    <person xml:id="hha-3361-1" sex='1'>
     <persName>Schmuel ben Jehuda</persName>
     <event when='1621-08-17' type="dateofdeath">
      <desc/>
     </event>
    </person>
   </listPerson>
  </particDesc>

example – language Usage metadata:

  <langUsage>
   <language ident='he' usage='100'>Hebrew</language>
  </langUsage>


With respect to the TEI Hackathon I have (just) set up a website with very general information on how to harvest epidat records: http://www.steinheim-institut.de/cgi-bin/epidat?info=howtoharvest (to be continued …) Over and beyond that it would be no great difficulty to provide a zip file with the data available for the workshop.

Comments: Thomas Kollatz 16:05, 1 July 2014 (CEST) The TEI spam filter seems to dislike the person-attribute s_x (middle-letter e) … rather prudish, isn't it

Discussion

What commonalities do you see? What seems interesting? Anything to add?

--Frederike Neuber 23:21, 24 June 2014 (CEST) Two (even if very different) projects, "Jewish Sepulchral Headstones DB" and "Rendering complex mark up", are dealing with visualization of spatial and temporal metadata.

Three (again very different) projects, "Visualization of intertextual TEI content", "Semantic Connections for Epigraphic Documents" and "Jewish Sepulchral Headstones DB", are dealing with semantic connection and modelling of content (even if not all of them say it explicitly in the description).

--Elli Bleeker 16:25, 1 July 2014 (CEST) Add to that the visualization of textual relations in the early Caribbean texts archive.

-- Felix Lange 16:30 4 July 2014 (CEST) - That sounds pretty interesting, I'm on board. The text-analysis ("mining") part seems pretty tough, though. How exactly should the content of different inscriptions be compared? Are you thinking of general word-frequency statistics that can populate a word cloud? Or do you plan a text comparison that is "semantic" in a certain sense?
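
For the word-frequency option, a first pass could be as simple as the sketch below: count words per group (for example inscriptions for men versus inscriptions for women) and rank words by the difference in relative frequency. The whitespace tokenization is naive and the English sample phrases are invented; Hebrew text would need proper tokenization and probably lemmatization.

```python
from collections import Counter


def word_frequencies(texts) -> Counter:
    """Count lower-cased, whitespace-separated words across texts."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return counts


def distinctive_words(group_a, group_b, top=5) -> list:
    """Words most over-represented in group_a relative to group_b."""
    a, b = word_frequencies(group_a), word_frequencies(group_b)
    total_a = sum(a.values()) or 1
    total_b = sum(b.values()) or 1
    # Difference in relative frequency; Counter returns 0 for unseen words.
    diff = {w: a[w] / total_a - b[w] / total_b for w in set(a) | set(b)}
    return sorted(diff, key=lambda w: (-diff[w], w))[:top]


# Invented sample phrases standing in for two groups of inscriptions.
men = ["a righteous man", "a faithful man"]
women = ["a modest woman", "a righteous woman"]
print(distinctive_words(men, women, top=2))
```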

Core Builder

Another possible tool to work on: https://github.com/raffazizzi/coreBuilder It provides a simple web interface to create stand-off markup. TEI files can be opened in multiple ACE editors and the user can click on elements with xml:ids to create references. The current version creates <app> elements containing <rdg>s with pointers to the selected elements. It should be easy enough to make the elements user-configurable so that the Core Builder can be used to put together <linkGrp>s or <timeline>s. The tool is written in CoffeeScript with a Backbone framework. --Raffaele.viglianti 19:07, 16 June 2014 (CEST)
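
To illustrate the kind of stand-off entry described here, the sketch below builds an app element with one rdg per selected source, each pointing at an element by file name and xml:id. This is a Python illustration of the data shape, not code from coreBuilder, and the exact attributes the tool writes (wit on rdg, target on ptr) are assumptions.

```python
import xml.etree.ElementTree as ET


def build_app(selections) -> ET.Element:
    """Build an <app> entry from user selections.

    selections is a list of (witness_id, file_name, xml_id) tuples,
    one per element the user clicked on.
    """
    app = ET.Element("app")
    for witness, file_name, xml_id in selections:
        rdg = ET.SubElement(app, "rdg", {"wit": f"#{witness}"})
        # Stand-off pointer to the selected element in its source file.
        ET.SubElement(rdg, "ptr", {"target": f"{file_name}#{xml_id}"})
    return app


# Invented selection: the same line picked in two sources.
app = build_app([("A", "a.xml", "l1"), ("B", "b.xml", "l1")])
print(ET.tostring(app, encoding="unicode"))
```

Swapping the element names for linkGrp/link or timeline/when would follow the same pattern, which is what making the output user-configurable would amount to.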

--Elli Bleeker 16:23, 1 July 2014 (CEST) From the sound of it, I think I would like to work on this project. Some more detail would be great.

Here's a demo of the current version. It's got a few bugs but should show the basic functionality. Pick a couple of sources, then click on any element with xml:id to create a selection. Click add to save it to the "core". https://dl.dropboxusercontent.com/u/2443674/coreBuilder/index.html --Raffaele.viglianti 02:31, 3 July 2014 (CEST)
