OCR: Difference between revisions

Revision as of 12:03, 24 January 2016

Current state

Ingredients are extracted by OCR with Tesseract 2 (production) and Tesseract 3 (.net)
The French dictionary ('fra') is used for all languages, whatever the language of the product

-- /home/off-fr/cgi# grep get_ocr *
Ingredients.pm:use Image::OCR::Tesseract 'get_ocr';
Ingredients.pm: $text =  decode utf8=>get_ocr($image,undef,'fra');

Has a small custom dictionary for French (/usr/share/tesseract-ocr/tessdata/fra.user-words)
- https://code.google.com/p/tesseract-ocr/wiki/FAQ#How_do_I_provide_my_own_dictionary
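
As an illustration only: the custom dictionary mentioned above is, as far as the linked FAQ describes it, a plain text file with one word per line. The Python sketch below shows one way such a file could be generated from a word list. The function names and the sample words are hypothetical and are not part of the project's code; the target path is left to the caller.

```python
from pathlib import Path
from typing import Iterable


def format_user_words(words: Iterable[str]) -> str:
    """Return user-words file content: one unique, non-empty word per line.

    Assumption: the file is plain UTF-8 text with one word per line.
    The first-seen order of the words is preserved.
    """
    seen = set()
    lines = []
    for word in words:
        cleaned = word.strip()
        # Skip blank entries and words already emitted.
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            lines.append(cleaned)
    return "".join(line + "\n" for line in lines)


def write_user_words(words: Iterable[str], path: str) -> None:
    """Write the formatted word list to `path` (hypothetical helper)."""
    Path(path).write_text(format_user_words(words), encoding="utf-8")


if __name__ == "__main__":
    # Hypothetical ingredient terms, for demonstration only.
    sample = ["maltodextrine", "lécithine", "maltodextrine", "  ", "émulsifiant"]
    print(format_user_words(sample), end="")
```

Keeping the formatting step separate from the file write makes the word-list clean-up easy to check without touching the tessdata directory.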

Roadmap

OCR/Roadmap

@@ Line 1: / Line 1: @@
-== Status ==
+== Current state ==
+* Ingredients are extracted by OCR with Tesseract 2 (production) and Tesseract 3 (.net)
+* The French dictionary ('fra') is used for all languages, whatever the language of the product
+<pre>
+-- /home/off-fr/cgi# grep get_ocr *
+Ingredients.pm:use Image::OCR::Tesseract 'get_ocr';
+Ingredients.pm: $text =  decode utf8=>get_ocr($image,undef,'fra');
+</pre>
+* Has a small custom dictionary for French (/usr/share/tesseract-ocr/tessdata/fra.user-words)
+** https://code.google.com/p/tesseract-ocr/wiki/FAQ#How_do_I_provide_my_own_dictionary
 == Roadmap ==
 [[OCR/Roadmap]]