OCR/Roadmap: Difference between revisions

Latest revision as of 08:42, 28 August 2024

Currently, all products are edited manually. This project aims to detect product information automatically or semi-automatically, using OCR and computer vision.

Product Opener improvements

Process all uploaded images using Tesseract and/or the new cloud-based engine
Return JSON to the mobile and/or web client so it can offer suggestions to the user
Add support for searching OCR results
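As a rough illustration of the "return JSON" item, the sketch below packages raw OCR text into a JSON document that a client could present as suggestions. The function name and every field name (`code`, `image_id`, `ocr_engine`, `suggestions`, `text_lines`) are hypothetical and are not taken from Product Opener.

```python
import json


def build_suggestions(barcode, image_id, ocr_text, engine="tesseract"):
    """Wrap raw OCR output in a JSON document for a client to display.

    All field names here are illustrative assumptions, not an existing API.
    """
    # Keep only non-empty lines, stripped of surrounding whitespace.
    lines = [line.strip() for line in ocr_text.splitlines() if line.strip()]
    payload = {
        "code": barcode,
        "image_id": image_id,
        "ocr_engine": engine,
        "suggestions": {"text_lines": lines},
    }
    # ensure_ascii=False keeps accented ingredient names readable.
    return json.dumps(payload, ensure_ascii=False, sort_keys=True)
```

A client would parse the response and let the user accept or reject each suggested line rather than writing it to the product directly.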

✅ TODO

Process Open Beauty Facts, Open Pet Food Facts, Open Products Facts images
Process the Belgian Food Photographs

Short term goals

Use the right standard dictionary for each language
Integrate custom lists from Global Ingredients Taxonomy
Integrate the FDA UNII (Unique Ingredient Identifier) list of ingredients (will also work for Open Beauty Facts)
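One way the dictionary goals above could be used is to snap OCR tokens to the closest entry of a per-language word list. The sketch below uses Python's standard `difflib` for the fuzzy match; the `WORD_LISTS` contents, the function name and the `cutoff` value are placeholders chosen for illustration, not data from the Global Ingredients Taxonomy.

```python
import difflib

# Placeholder word lists; a real run would load the per-language
# dictionaries and custom ingredient lists mentioned above.
WORD_LISTS = {
    "en": ["sugar", "calcium", "water"],
    "fr": ["sucre", "eau", "sel"],
}


def correct_tokens(tokens, lang, word_lists=WORD_LISTS, cutoff=0.8):
    """Replace each OCR token with its closest dictionary entry, if any."""
    vocab = sorted({w.lower() for w in word_lists.get(lang, [])})
    corrected = []
    for token in tokens:
        low = token.lower()
        if low in vocab:
            corrected.append(low)
            continue
        # get_close_matches returns the best candidates above the cutoff.
        matches = difflib.get_close_matches(low, vocab, n=1, cutoff=cutoff)
        # Leave the token untouched when nothing is similar enough.
        corrected.append(matches[0] if matches else token)
    return corrected
```

A higher `cutoff` makes corrections more conservative, which matters because a wrong "correction" can silently turn one ingredient into another.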

Testing

Create a golden set of products that are complete

Product
- Category: "Ingredients complete" "Ingredient images selected"
- Get the ingredients image
- Get the canonical (typed by contributors) ingredient list
- Get the ingredients list generated with the current OCR system
- Generate the ingredient list locally from the image and the custom dictionary above
- Compare the result with the canonical (golden) list and report accuracy measures
Draft Script: https://lite6.framapad.org/p/OFF_OCR_Script
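The comparison step above does not say which accuracy measures to report. As a sketch, two common choices are the character error rate (edit distance divided by the length of the golden text) and precision/recall over individual ingredients. The function names and the tokenization rule (splitting on commas, semicolons and parentheses) are assumptions made for this example, not part of the draft script.

```python
import re


def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def character_error_rate(golden, ocr):
    """Edit distance normalized by the length of the golden text."""
    if not golden:
        return 0.0 if not ocr else 1.0
    return levenshtein(golden, ocr) / len(golden)


def split_ingredients(text):
    """Split an ingredient list into a set of normalized items."""
    return {t.strip().lower() for t in re.split(r"[,;()]", text) if t.strip()}


def ingredient_precision_recall(golden, ocr):
    """Precision and recall of OCR ingredients against the golden list."""
    g, o = split_ingredients(golden), split_ingredients(ocr)
    hits = len(g & o)
    precision = hits / len(o) if o else 0.0
    recall = hits / len(g) if g else 0.0
    return precision, recall
```

Averaging these values over the whole golden set gives a single number per OCR configuration, which makes it easier to tell whether a new dictionary helps or hurts.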

Easy wins

Process all images and make products searchable, even if not filled yet
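To illustrate how OCR output could make unfilled products searchable, here is a minimal in-memory inverted index that maps each word to the barcodes whose images contain it. This is only a sketch: the function names are hypothetical, and a real deployment would more likely rely on the search engine or database already used by the server.

```python
import re
from collections import defaultdict


def build_index(ocr_by_barcode):
    """Map every lowercase word to the set of barcodes whose OCR text has it."""
    index = defaultdict(set)
    for barcode, text in ocr_by_barcode.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(barcode)
    return index


def search(index, query):
    """Return barcodes whose OCR text contains every word of the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result
```

Because the index is built from OCR text alone, a product can be found by the words printed on its packaging even when none of its fields have been filled in yet.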

Long-term goals

Get dictionary translations from Wikidata
Investigate Ocropus for complex layout extractions
Investigate OpenCV for detecting patterns, logos…

Targets

Logos of brands (getting them from POD?)
Logos of Labels (standardized)
Text (distorted, e.g. on a curved bottle; printed diagonally; photographed in low or bright light)
Standardized layouts (US Nutrition labels)

Store in a separate image for further reference

Standardized text (quantities, EU Packaging codes)
Barcodes (extraction in uploaded images)
Image orientation: check whether the detected text is upright in order to infer whether the image itself is correctly oriented.
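The orientation idea above can be sketched as follows: run OCR on the image at 0, 90, 180 and 270 degrees, then keep the rotation whose text contains the highest share of known dictionary words. The sketch assumes the caller has already produced the OCR text for each rotation; the function names and the scoring rule are illustrative assumptions, not an existing implementation.

```python
import re


def score_text(text, vocabulary):
    """Fraction of alphabetic words in the text that appear in the vocabulary."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0
    return sum(word in vocabulary for word in words) / len(words)


def guess_rotation(ocr_by_rotation, vocabulary):
    """Pick the rotation (in degrees) whose OCR text looks most like real words.

    ocr_by_rotation maps a rotation angle to the OCR text obtained after
    rotating the image by that angle. Ties go to the smallest angle.
    """
    return max(sorted(ocr_by_rotation),
               key=lambda angle: score_text(ocr_by_rotation[angle], vocabulary))
```

If the winning rotation is not 0, the image is probably stored sideways or upside down and could be flagged for rotation before further processing.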

Deep Learning
- Product photo on packaging - guess category based on product picture
- Container: guess whether it's a bottle, cardboard…

Extracting areas is already great work: if we can extract logos or patterns, it will be faster for humans to double check and turn that into text.

@@ Line 1: / Line 1: @@
 Currently, all products are edited manually. This project is about automatic or semi-automatic detection of a number of things using OCR and Computer vision.
-Tools:
+== Product Opener improvements ==
-* Google Drive OCR or Google Goggles
+* Process all uploaded images using Tesseract and/or the New Cloud based engine
-* Ocropus
+* Return JSON to mobile client and/or web client for suggestions to the user
-* OpenCV
+* Add support to search into OCR results
-* Moodstocks
-Targets:
+== ✅ TODO ==
-* Logos (standardized)
+* Process Open Beauty Facts, Open Pet Food Facts, Open Products Facts images
-* Text
+* Process the Belgian Food Photographs
+== Short term goals ==
+* Use the right standard dict for each language
+* Integrate custom lists from Global Ingredients Taxonomy
+* USDA UNII list of ingredients (will also work for Open Beauty Facts)
+=== Testing ===
+==== Create a golden set of products that are complete ====
+* Product
+** Category: "Ingredients complete" "Ingredient images selected"
+** Get the ingredients image
+** Get the canonical (typed by contributors) ingredient list
+** Get the ingredients list generated with the current OCR system
+** Generate the ingredient list on your laptop based on the image, and the custom dictionary above
+** Compare the result with the canonical/golden test and report some accuracy measures
+* Draft Script: https://lite6.framapad.org/p/OFF_OCR_Script
+=== Easy wins ===
+* Process all images and make products searchable, even if not filled yet
+==  Long-term goals ==
+* Get dictionaries translations from Wikidata
+* Investigate Ocropus for complex layout extractions
+* Investigate Open CV for detection of patterns, logos…
+== Targets ==
+* Logos of brands (Getting them from POD ?)
+* Logos of Labels (standardized)
+* Text (distorted - bottle case, diagonally - with low light, bright light)
 * Standardized layouts (US Nutrition labels)
+== Store in separate image for further reference ==
 * Standardized text (quantities, EU Packaging codes)
 * Barcodes (extraction in uploaded images)
 * Image orientation: check that the text is properly oriented to guess if the image is properly oriented.
+* Deep Learning
+** Product photo on packaging - guess category based on product picture
+** Container: guess whether it's a bottle, cardboard…
 Extracting areas is already great work: if we can extract logos or patterns, it will be faster for humans to double check and turn that into text.
+[[Category:Roadmap]]
+[[Category:Project]]
+[[Category:ProductOpener]]
+[[Category:OCR]]
+[[Category:Artificial Intelligence]]