Background

The Tesseract OCR engine was one of the top 3 engines in the 1995 UNLV Accuracy test. Between 1995 and 2006 it had little work done on it, but it is probably one of the most accurate open source OCR engines available. The source code will read a binary, grey or color image and output text. A TIFF reader is built in that will read uncompressed TIFF images, or libtiff can be added to read compressed images.
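As a rough illustration of the "image in, text out" workflow described above, the following sketch drives the `tesseract` command-line program from Python. It assumes the classic invocation form `tesseract imagename outputbase [-l lang]`, where the recognized text is written to `outputbase.txt`. The helper names `build_tesseract_command` and `ocr_image` are hypothetical and are not part of Tesseract itself.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang=None):
    """Build the argument list for the tesseract command-line tool.

    Assumes the classic CLI form: tesseract imagename outputbase [-l lang].
    This helper is hypothetical; it only assembles the arguments.
    """
    cmd = ["tesseract", str(image_path), str(output_base)]
    if lang:
        cmd += ["-l", lang]
    return cmd


def ocr_image(image_path, output_base, lang=None):
    """Run tesseract on one image and return the recognized text.

    Raises FileNotFoundError if no tesseract executable is on PATH.
    """
    if shutil.which("tesseract") is None:
        raise FileNotFoundError("tesseract executable not found on PATH")
    subprocess.run(
        build_tesseract_command(image_path, output_base, lang),
        check=True,  # raise CalledProcessError if tesseract fails
    )
    # tesseract appends ".txt" to the output base name.
    return Path(f"{output_base}.txt").read_text(encoding="utf-8")
```

With an uncompressed TIFF named `page.tif` in the current directory, `ocr_image("page.tif", "page")` would run the engine and read back `page.txt`. Compressed TIFF input would additionally require a build with libtiff, as noted above.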
Supported Platforms

The developers are regularly testing on the following platforms:
* Ubuntu 6.06 (x86/32, x86/64)
* Ubuntu 6.10 (x86/32, x86/64)
* Windows (x86/32) with Visual C++ Express 2008

Additionally, we believe that the code should run on these other platforms, but we don't have the resources to test on them regularly:
* recent Linux distributions (x86/32, x86/64)
* Mac OS X (x86, PPC)

People have reported success with Cygwin on Windows, but this is not a tested platform. If you're interested in supporting other platforms or languages, please get in touch with Ray Smith.

Roadmap

The version 2.04 release is now available for download and contains the following new features:
* Many reported issues fixed, especially portability issues: 1, 63, 67, 71, 76, 79, 81, 82, 84, 106, 108, 111, 112, 128, 129, 130, 133, 135, 142, 143, 145, 146, 147, 153, 154, 160, 165, 169, 170, 175, 177, 187, 192, 195, 199, 201, 205, 209.

The release candidate will be available from the downloads page soon, after further testing.
Even the Windows executables tarball is incomplete, as language files are required.

The upcoming 3.00 release will probably include:
* Page layout analysis.

Core Developers

The core developer on the project is Ray Smith (theraysmith).
OCRopus project, for which Tesseract is one of the pluggable OCR engines; OCRopus also provides layout analysis and statistical language modeling.

Migration

As you have probably noticed, the Tesseract project has migrated from SourceForge to Google hosting. We were actually happy with SourceForge hosting, but since we needed to move from CVS to Subversion anyway, it seemed to make sense to move to Google hosting at the same time. We had planned on announcing the migration first and spending some time on it, but it turned out to be so quick and easy that we were done the same day. If you have questions or concerns about this migration, please contact Ray Smith. The major difference is that there is no discussion forum.