
open-source character recognition
News
  - December 24, 2006: libgocr is dead. It was replaced by a new project, Conjecture.
  - July 22, 2001: Released version 0.7.1. The MDK will be released soon.
  - June 24, 2001: Current CVS version is, for the first time, working from head to tail.
    A new packaged version will be released soon.
 
About
libgocr is an attempt to create a library that has all the functionality you
may need to develop an OCR engine. Instead of wasting time writing I/O
functions, linked lists, all the steps in the recognition process, and so on,
just code your new revolutionary algorithm at once!
libgocr is completely modular, using a plugin system: you can have dozens of
plugins processing your text, which can result in much more precise recognition.
This system also encourages recognition of non-standard content: think about
musical scores, equations, block diagrams, etc. It can all be done, and, what's
best, TOGETHER.
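To make the plugin idea concrete, here is a minimal sketch in C of a chain of processing stages applied to recognized text. Everything in it is hypothetical: the names `plugin`, `plugin_fn`, `run_chain`, `squeeze_spaces` and `pipes_to_l` are invented for illustration and are not the libgocr API. It also uses a static table of function pointers, whereas a real module system that avoids recompilation would load its modules at run time.

```c
#include <stddef.h>

/* Hypothetical plugin interface, for illustration only.
 * This is NOT the libgocr API. A stage edits the text in place
 * and returns 0 on success, non-zero on failure. */
typedef int (*plugin_fn)(char *text);

typedef struct {
    const char *name;  /* human-readable stage name */
    plugin_fn   run;   /* the stage itself */
} plugin;

/* Example stage: collapse runs of spaces into a single space. */
static int squeeze_spaces(char *text)
{
    char *out = text;
    int prev_space = 0;

    for (const char *in = text; *in != '\0'; ++in) {
        int is_space = (*in == ' ');
        if (!(is_space && prev_space))
            *out++ = *in;
        prev_space = is_space;
    }
    *out = '\0';
    return 0;
}

/* Example stage: treat a stray '|' as a misread lowercase 'l'. */
static int pipes_to_l(char *text)
{
    for (char *p = text; *p != '\0'; ++p)
        if (*p == '|')
            *p = 'l';
    return 0;
}

/* Run every stage in order on the same buffer.
 * Returns the number of stages run, or -1 if a stage fails. */
static int run_chain(const plugin *chain, size_t count, char *text)
{
    for (size_t i = 0; i < count; ++i)
        if (chain[i].run(text) != 0)
            return -1;
    return (int)count;
}
```

A caller would build an array such as `{{"squeeze", squeeze_spaces}, {"pipes", pipes_to_l}}` and pass it to `run_chain` together with a writable text buffer. Adding a stage for a different kind of content then means adding one more entry to the array.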
Features
  - File input: supports most common image types.
  
 - Unicode support.
  
 - Module system allows you to develop specific code (for example, only
    segmentation) and integrate it with existing code, without
    recompilation.
  
 - Useful bits of code: linked lists, hash tables.
  
 - Automatic parallelism (not implemented yet).
  
 - Debug routines.
  
 - Frontend communication system (not implemented yet).
  
 - More. :)
 
Current state
libgocr is still a new project, in beta. It's written completely from scratch, 
taking advantage of what we learned coding the original gocr program. Here's
our plan:
  - Develop libgocr until it's stable. In parallel, continue the development
  of the original gocr, focusing only on the recognition engine.
  
 - Once libgocr is stable and usable, gocr will be converted into a plugin,
  probably named "gocr main module" (gmm).
  
 - From then on, libgocr and gmm will be developed in parallel. Since libgocr
  is just the API, we expect most of the work will be directed to gmm.
 
WE NEED DEVELOPERS, i.e., people who ACTIVELY code. And, of course, your
comments and ideas are VERY appreciated. Tell us what you think.
About the Module Development Kit (MDK) mentioned in the libgocr
documentation: it exists, but since the module system is not working (well)
yet, and the API is likely to change, the MDK was not released. If you want
it, ask me. UPDATE: See the developers
section below for a preview of the MDK.
Documentation
The API is well documented: the package includes a manually written LaTeX file
that is almost a tutorial. You can get the
gzipped postscript here. You can also browse the documentation generated automatically with Doxygen online.
It comes in the package too, but you need Doxygen to build it.
Download
You can get it here. Remember, it's a
development version, and we recommend that you do not "make install". Some test
programs are available.
Developers
(2001/06/24): just uploaded a preview of the Module
Development Kit, with a simple module that I have been using to test and debug
libgocr. The MDK will be officially released soon.
  Contact us: see the Support page.
jOCR is at
 since June 2000 
(announcement mailing list, etc.)