Contents
- 38.1. Tesseract
- 38.2. cuneiform - multi-language OCR system
https://help.ubuntu.com/community/OCR
38.1. Tesseract
Search for the Tesseract packages
$ apt-cache search Tesseract
ocrodjvu - tool to perform OCR on DjVu documents
slimrat - GUI application for automated downloading from file hosters
slimrat-nox - CLI application for automated downloading from file hosters
tesseract-ocr - Command line OCR tool
tesseract-ocr-deu - tesseract-ocr language files for German text
tesseract-ocr-deu-f - tesseract-ocr language files for the German Fraktur script
tesseract-ocr-dev - Development files for the tesseract command line OCR tool
tesseract-ocr-eng - tesseract-ocr language files for English text
tesseract-ocr-fra - tesseract-ocr language files for French text
tesseract-ocr-ita - tesseract-ocr language files for Italian text
tesseract-ocr-nld - tesseract-ocr language files for Dutch text
tesseract-ocr-por - tesseract-ocr language files for Brasilian Portuguese text
tesseract-ocr-spa - tesseract-ocr language files for Spanish text
tesseract-ocr-vie - tesseract-ocr language files for Vietnamese text
$ sudo apt-get install tesseract-ocr
$ convert test.jpg test.tif
$ tesseract test.tif test
$ cat test.txt
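The convert, tesseract, and cat steps can be wrapped in a small shell function. The following is only a sketch under stated assumptions: `out_base` and `ocr_image` are hypothetical helper names (not part of Tesseract), `convert` is assumed to come from ImageMagick, the input file name is assumed to have an extension, and the language pack passed to `-l` (for example tesseract-ocr-eng for `eng`) is assumed to be installed.

```shell
#!/bin/sh
# Sketch: OCR one image using the same convert -> tesseract -> cat steps.
# out_base and ocr_image are hypothetical helpers, not Tesseract commands.

# Strip the final extension from a path: scans/test.jpg -> scans/test
# (assumes the file name itself contains an extension).
out_base() {
    printf '%s\n' "${1%.*}"
}

# Usage: ocr_image IMAGE [LANG]
ocr_image() {
    src=$1
    lang=${2:-eng}            # assumption: the matching language pack is installed
    base=$(out_base "$src")

    # Fail early if either tool is missing.
    command -v convert >/dev/null 2>&1 || { echo "convert not found" >&2; return 1; }
    command -v tesseract >/dev/null 2>&1 || { echo "tesseract not found" >&2; return 1; }

    convert "$src" "$base.tif" || return 1            # image -> TIFF
    tesseract "$base.tif" "$base" -l "$lang" || return 1   # writes $base.txt
    cat "$base.txt"                                   # print the recognized text
}
```

After sourcing the file, `ocr_image test.jpg` would repeat the three commands above, and `ocr_image test.jpg deu` would select the German language files instead.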
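Source: Netkiller series handbook (Netkiller 系列 手札)
Author: 陈景峯
To reprint, please contact the author, and be sure to credit the original source, the author, and this notice.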