OCR: Difference between revisions
== Current state ==
* OCR extraction of Ingredients using Tesseract 2 (production) and 3 (.net)
* Uses the French dictionary for all languages
<pre>
-- /home/off-fr/cgi# grep get_ocr *
Ingredients.pm:use Image::OCR::Tesseract 'get_ocr';
Ingredients.pm: $text = decode utf8=>get_ocr($image,undef,'fra');
</pre>
* Has a small custom dictionary for French (/usr/share/tesseract-ocr/tessdata/fra.user-words)
** https://code.google.com/p/tesseract-ocr/wiki/FAQ#How_do_I_provide_my_own_dictionary
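A minimal sketch of maintaining the custom word list, assuming the usual user-words layout of plain text with one word per line. The helper name <code>add_user_word</code>, the <code>WORDS_FILE</code> variable and the sample words are hypothetical and only serve as illustration. The sketch writes to a scratch file unless <code>WORDS_FILE</code> is set; on the server described above it would point at the fra.user-words path. How Tesseract picks the file up is covered by the FAQ entry linked above.

```shell
#!/bin/sh
# Sketch only: append words to a Tesseract user-words list (one word per line),
# skipping words that are already in the file.
# WORDS_FILE is a hypothetical variable; it defaults to a scratch file so that
# the real dictionary is never touched by accident.
WORDS_FILE="${WORDS_FILE:-$(mktemp)}"

add_user_word() {
    # Make sure the file exists before grepping it.
    touch "$WORDS_FILE"
    # -x matches the whole line, -F treats the word as a fixed string, -q is quiet.
    if ! grep -qxF -- "$1" "$WORDS_FILE"; then
        printf '%s\n' "$1" >> "$WORDS_FILE"
    fi
}

# Illustrative ingredient words, not taken from the real dictionary.
add_user_word "maltodextrine"
add_user_word "lécithine"
add_user_word "maltodextrine"   # already present, so nothing is appended
```

Checking for an existing entry before appending keeps the list free of duplicates when the same word is submitted more than once.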
== Roadmap ==
[[OCR/Roadmap]]
Revision as of 12:03, 24 January 2016