open-source character recognition


News

December 24, 2006
libgocr is dead. It was replaced by a new project, Conjecture.
July 22, 2001
Released version 0.7.1. The MDK will be released soon.
June 24, 2001
The current CVS version works, for the first time, from end to end. A new packaged version will be released soon.

About

libgocr is an attempt to create a library that has all the functionality you may need to develop an OCR engine. Instead of wasting time writing I/O functions, linked lists, all the steps in the recognition process, and so on, just code your new revolutionary algorithm at once!
libgocr is completely modular, using a plugin system: you can have dozens of plugins processing your text, resulting in much more precise recognition. This system also encourages recognition of non-standard text: think of musical scores, equations, block diagrams, etc. It can all be done and, best of all, TOGETHER.
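To give a rough idea of what "a plugin system" can mean in C, here is a minimal sketch. It is NOT the libgocr API: every name in it (ocr_plugin, plugin_registry, registry_add, registry_recognize, and the two toy plugins) is made up for illustration, and the "regions" are plain strings with a prefix instead of image data. It assumes each plugin returns a confidence from 0 to 100 and that the most confident plugin wins.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical types for illustration only; not the libgocr API. */
typedef struct {
    const char *name;
    /* Writes a result to out and returns a confidence in 0..100
     * (0 means "this region is not mine"). */
    int (*recognize)(const char *region, char *out, size_t out_len);
} ocr_plugin;

#define MAX_PLUGINS 16

typedef struct {
    ocr_plugin plugins[MAX_PLUGINS];
    size_t count;
} plugin_registry;

void registry_init(plugin_registry *r)
{
    assert(r != NULL);
    r->count = 0;
}

/* Returns 0 on success, -1 if the registry is full or the plugin is unusable. */
int registry_add(plugin_registry *r, ocr_plugin p)
{
    if (r->count >= MAX_PLUGINS || p.recognize == NULL)
        return -1;
    r->plugins[r->count++] = p;
    return 0;
}

/* Asks every plugin about the region and keeps the most confident answer.
 * Returns the winning plugin's name, or NULL if no plugin claimed the region
 * (in which case out is left untouched). */
const char *registry_recognize(const plugin_registry *r, const char *region,
                               char *out, size_t out_len)
{
    const char *best_name = NULL;
    int best_conf = 0;
    char candidate[128];

    for (size_t i = 0; i < r->count; i++) {
        candidate[0] = '\0';
        int conf = r->plugins[i].recognize(region, candidate, sizeof candidate);
        if (conf > best_conf) {
            best_conf = conf;
            best_name = r->plugins[i].name;
            snprintf(out, out_len, "%s", candidate);
        }
    }
    return best_name;
}

/* Toy plugin: claims regions tagged "txt:" (stand-in for ordinary text). */
int text_plugin(const char *region, char *out, size_t out_len)
{
    if (strncmp(region, "txt:", 4) != 0)
        return 0;
    snprintf(out, out_len, "%s", region + 4);
    return 80;
}

/* Toy plugin: claims regions tagged "eq:" (stand-in for equations). */
int equation_plugin(const char *region, char *out, size_t out_len)
{
    if (strncmp(region, "eq:", 3) != 0)
        return 0;
    snprintf(out, out_len, "%s", region + 3);
    return 90;
}
```

To use it, a caller would fill a plugin_registry with registry_add and then hand each region of the page to registry_recognize. Adding support for a new kind of content means registering one more plugin; nothing else has to change, and text and equations can be handled on the same page.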

Features


Current state

libgocr is still a new project, in beta. It's written completely from scratch, taking advantage of what we learned coding the original gocr program. Here's our plan:
  1. Develop libgocr until it's stable. In parallel, continue the development of the original gocr, focusing only on the recognition engine.
  2. Once libgocr is stable and usable, gocr will be converted into a plugin, probably named "gocr main module" (gmm).
  3. From then on, libgocr and gmm will be developed in parallel. Since libgocr is just the API, we expect most of the work will be directed to gmm.
WE NEED DEVELOPERS, i.e., people who ACTIVELY code. And, of course, your comments and ideas are VERY appreciated. Tell us what you think.
About the Module Development Kit (MDK) mentioned in the libgocr documentation: it exists, but since the module system is not working well yet and the API is likely to change, the MDK has not been released. If you want it, ask me. UPDATE: See the Developers section below for a preview of the MDK.

Documentation

The API is well documented: the package includes a hand-written LaTeX file that is almost a tutorial. You can get the gzipped PostScript here. You can also browse online the documentation automatically generated by Doxygen. It comes in the package too, but you need Doxygen to build it.

Download

You can get it here. Remember, it's a development version, and we recommend that you do not "make install". Some test programs are available.

Developers

(2001/06/24): just uploaded a preview of the Module Development Kit, with a simple module that I have been using to test and debug libgocr. The MDK will be officially released soon.

Contact us: see the Support page.

jOCR has been hosted at SourceForge since June 2000 (announcement mailing list, etc.)