Posts tagged ‘OCR’
Ubuntu document scanning and OCR
In a previous blog post, I described how to scan documents from within OpenOffice.org. The program covered here, gscan2pdf, is even more useful, because it can run OCR (Optical Character Recognition) on the scanned document. The characters in the scanned image become searchable text, which makes the document usable in many other programs. Please see this video (in Dutch). Installing the program is a snap in Ubuntu: open the Synaptic package manager and select the following packages to install: gscan2pdf, pdftk, pdf2djvu, and, for Dutch Ubuntu users, tesseract-ocr-nld. That’s all there is to it!
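If you prefer the command line to Synaptic, the same packages can be installed with apt-get. The small Python sketch below only builds and prints that command; it is an illustration, not part of gscan2pdf. The function name `build_install_command` and the `ocr_language` parameter are made up for this example, and it assumes the package names listed above are available in your Ubuntu release.

```python
# Packages named in the post; the tesseract language pack is added separately.
BASE_PACKAGES = ["gscan2pdf", "pdftk", "pdf2djvu"]


def build_install_command(ocr_language="nld"):
    """Return the apt-get command, as an argument list, for the packages above.

    ocr_language is the suffix of a tesseract-ocr-* package (hypothetical
    parameter for this sketch); pass None to skip the language pack.
    """
    packages = list(BASE_PACKAGES)
    if ocr_language:
        packages.append("tesseract-ocr-" + ocr_language)
    return ["sudo", "apt-get", "install", "--yes"] + packages


if __name__ == "__main__":
    # Print the command instead of running it, so nothing gets installed by accident.
    print(" ".join(build_install_command()))
```

Copy the printed line into a terminal to run it. Users who do not need Dutch recognition can pass a different language suffix, or `None` to leave the language pack out.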
