Nidaba

From TEIWiki

Synopsis

"Nidaba is an open source distributed optical character recognition pipeline that makes it easy to preprocess, OCR, and postprocess scans of text documents in a multitude of ways."[1]

Features

"Nidaba does a bunch of things for you:

  • Grayscale Conversion
  • Binarization
  • Deskewing
  • Dewarping
  • OCR
  • Spell-checking
  • TEI output
  • Format Conversion"[2]
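
To make the first two preprocessing steps in the list above concrete, here is a minimal sketch of grayscale conversion followed by global-threshold binarization. It is not Nidaba's implementation or API. The function names `to_grayscale` and `binarize`, the ITU-R BT.601 luma weights, the fixed threshold of 128, and the tiny nested-list "scan" are all assumptions chosen for illustration.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples to rows of 0-255 luma values.

    Uses the ITU-R BT.601 weights; this is an illustrative choice,
    not necessarily what Nidaba does.
    """
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_image
    ]


def binarize(gray_image, threshold=128):
    """Map pixels at or above the threshold to 1 (paper), others to 0 (ink)."""
    return [[1 if px >= threshold else 0 for px in row] for row in gray_image]


# Hypothetical 2x2 "scan": white paper, black ink, light gray, dark gray.
scan = [
    [(255, 255, 255), (0, 0, 0)],
    [(200, 200, 200), (50, 50, 50)],
]

gray = to_grayscale(scan)
bitonal = binarize(gray)
```

A fixed global threshold is the simplest possible binarization; real scans with uneven lighting usually need an adaptive method, which is one reason a pipeline offering several alternatives for each step is useful.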

User commentary

Please sign all comments.

System requirements

http://openphilology.github.io/nidaba/index.html#installation

Source code and licensing

http://openphilology.github.io/nidaba/index.html#licensing-and-authorship

Support for TEI

"Nidaba is capable of encoding the OCR results and their metadata into XML documents following the most recent P5 guidelines. The output is designed to facilitate further manual annotation."[3]
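
As a rough illustration of what encoding OCR results as TEI P5 can look like, the sketch below builds a minimal TEI document with Python's standard library. It does not reproduce Nidaba's actual output schema. The helper names `tei` and `build_tei`, the placeholder header text, and the choice to mark each recognized line with an `lb` element are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
# Serialize TEI elements in the default namespace (no prefix).
ET.register_namespace("", TEI_NS)


def tei(tag):
    """Return the namespace-qualified name for a TEI element."""
    return f"{{{TEI_NS}}}{tag}"


def build_tei(title, lines):
    """Build a minimal TEI document holding OCR lines (hypothetical layout)."""
    root = ET.Element(tei("TEI"))

    # teiHeader with the three mandatory parts of fileDesc.
    header = ET.SubElement(root, tei("teiHeader"))
    file_desc = ET.SubElement(header, tei("fileDesc"))
    title_stmt = ET.SubElement(file_desc, tei("titleStmt"))
    ET.SubElement(title_stmt, tei("title")).text = title
    pub_stmt = ET.SubElement(file_desc, tei("publicationStmt"))
    ET.SubElement(pub_stmt, tei("p")).text = "Unpublished OCR output"
    source_desc = ET.SubElement(file_desc, tei("sourceDesc"))
    ET.SubElement(source_desc, tei("p")).text = "Transcribed from a scanned page"

    # Body: one paragraph, each recognized line preceded by a line break marker.
    text = ET.SubElement(root, tei("text"))
    body = ET.SubElement(text, tei("body"))
    para = ET.SubElement(body, tei("p"))
    for line in lines:
        lb = ET.SubElement(para, tei("lb"))
        lb.tail = line
    return root


root = build_tei("Sample page", ["First recognized line", "Second recognized line"])
xml_text = ET.tostring(root, encoding="unicode")
```

Keeping the recognized text in ordinary TEI elements like this means the result can be opened and annotated further in any XML editor, which matches the stated goal of facilitating manual annotation.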

Language(s)

Python

Documentation

http://openphilology.github.io/nidaba/index.html

Tech support

User community

Sample implementations

Current version number and date of release

0.9.7

History of versions

How to download or buy

https://pypi.python.org/pypi/nidaba/0.9.7

Additional notes

References

  1. http://openphilology.github.io/nidaba/index.html
  2. http://openphilology.github.io/nidaba/index.html
  3. http://openphilology.github.io/nidaba/tei.html