Posts tagged ‘Tesseract’

Ubuntu document scanning and OCR

In a previous post, I described how to scan documents from within OpenOffice.org. The program covered here, gscan2pdf, is even more useful, because it can also run OCR (Optical Character Recognition) on the scanned document. That makes the result usable in many other programs: the characters in the scanned images can be searched as text. Please see this video (in Dutch). Installing the program is a snap in Ubuntu: open the Synaptic package manager and select the following packages to install: gscan2pdf, pdftk, pdf2djvu, and, for Dutch Ubuntu users, tesseract-ocr-nld. That’s all there is to it!
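If you prefer the terminal to Synaptic, the same packages can be installed with apt-get. The small Python sketch below only builds and prints the command lines, so nothing is installed until you run them yourself. The helper names `build_install_command` and `build_ocr_command` and the file name `scan.tif` are made up for illustration; the package names come from this post and may differ on other Ubuntu releases. The second helper assumes the usual `tesseract <image> <output base> -l <language>` form of the Tesseract command line, with `nld` as the Dutch language code.

```python
# Package names as listed in the post; availability may vary per Ubuntu release.
PACKAGES = ["gscan2pdf", "pdftk", "pdf2djvu", "tesseract-ocr-nld"]


def build_install_command(packages=PACKAGES):
    """Return an apt-get install command as a list of arguments."""
    return ["sudo", "apt-get", "install", *packages]


def build_ocr_command(image_path, output_base, language="nld"):
    """Return a tesseract command that writes the text to <output_base>.txt.

    The language code is an assumption: "nld" matches tesseract-ocr-nld.
    """
    return ["tesseract", image_path, output_base, "-l", language]


if __name__ == "__main__":
    # Print the commands so they can be reviewed and pasted into a terminal.
    print(" ".join(build_install_command()))
    # scan.tif is a hypothetical scanned page.
    print(" ".join(build_ocr_command("scan.tif", "scan")))
```

Running gscan2pdf is still the easier route for everyday scanning, since it calls the OCR engine for you; the second command is only handy for a quick check that Tesseract and the Dutch language data are installed.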


09/09/2009 at 6:56 pm

