"Nidaba is an open source distributed optical character recognition pipeline that makes it easy to preprocess, OCR, and postprocess scans of text documents in a multitude of ways."
"Nidaba does a bunch of things for you:
- Grayscale Conversion
- TEI output
- Format Conversion"
Please sign all comments.
Source code and licensing
Support for TEI
"Nidaba is capable of encoding the OCR results and their metadata into XML documents following the most recent P5 guidelines. The output is designed to facilitate further manual annotation."
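For anyone unfamiliar with TEI, here is a minimal sketch of what wrapping recognized text in a P5-style document could look like. It is not Nidaba's actual output or API: the function name `build_tei`, the placeholder header text, and the choice to mark each recognized line with `<lb/>` are assumptions made for illustration. Only the TEI namespace and the basic `teiHeader`/`text` skeleton come from the P5 guidelines.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the TEI P5 guidelines.
TEI_NS = "http://www.tei-c.org/ns/1.0"


def build_tei(title, lines):
    """Wrap OCR text lines in a minimal TEI P5 skeleton (hypothetical helper)."""
    # Serialize TEI elements with a default namespace instead of an ns0: prefix.
    ET.register_namespace("", TEI_NS)

    def q(tag):
        # Qualify a tag name with the TEI namespace.
        return f"{{{TEI_NS}}}{tag}"

    root = ET.Element(q("TEI"))

    # Minimal header: fileDesc with titleStmt, publicationStmt, sourceDesc.
    header = ET.SubElement(root, q("teiHeader"))
    file_desc = ET.SubElement(header, q("fileDesc"))
    title_stmt = ET.SubElement(file_desc, q("titleStmt"))
    ET.SubElement(title_stmt, q("title")).text = title
    pub_stmt = ET.SubElement(file_desc, q("publicationStmt"))
    ET.SubElement(pub_stmt, q("p")).text = "Unpublished OCR output"  # placeholder
    source_desc = ET.SubElement(file_desc, q("sourceDesc"))
    ET.SubElement(source_desc, q("p")).text = "Scanned page images"  # placeholder

    # Body: one paragraph, each recognized line preceded by a line-beginning marker.
    text = ET.SubElement(root, q("text"))
    body = ET.SubElement(text, q("body"))
    para = ET.SubElement(body, q("p"))
    for line in lines:
        lb = ET.SubElement(para, q("lb"))
        lb.tail = line

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_tei("Sample scan", ["first recognized line", "second recognized line"]))
```

A document like this can then be opened in any XML editor and annotated by hand, which is the kind of follow-up work the quoted description has in mind.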
Current version number and date of release