OCR/Roadmap
Latest revision as of 08:42, 28 August 2024
Currently, all products are edited manually. This project aims to detect product information (logos, text, barcodes and more) automatically or semi-automatically, using OCR and computer vision.
Product Opener improvements
- Process all uploaded images using Tesseract and/or the new cloud-based engine
- Return JSON to the mobile client and/or web client so they can offer suggestions to the user
- Add support for searching within OCR results
✅ TODO
- Process Open Beauty Facts, Open Pet Food Facts, Open Products Facts images
- Process the Belgian Food Photographs
Short term goals
- Use the right standard dictionary for each language
- Integrate custom lists from the Global Ingredients Taxonomy
- USDA UNII list of ingredients (will also work for Open Beauty Facts)
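
One way a custom ingredient list could help is by snapping noisy OCR tokens to the closest known entry. The following is a minimal sketch using only the Python standard library; the function name `correct_tokens`, the sample list `ingredients_en`, and the `cutoff` value are illustrative assumptions, not part of Product Opener.

```python
import difflib

def correct_tokens(tokens, dictionary, cutoff=0.75):
    """Replace each OCR token with its closest dictionary entry, if close enough.

    `dictionary` stands in for a per-language ingredient list (hypothetical data).
    Tokens without a sufficiently similar entry are returned unchanged.
    """
    # Map lowercase forms back to the canonical spelling in the dictionary.
    lowered = {entry.lower(): entry for entry in dictionary}
    candidates = list(lowered)
    corrected = []
    for token in tokens:
        matches = difflib.get_close_matches(token.lower(), candidates, n=1, cutoff=cutoff)
        corrected.append(lowered[matches[0]] if matches else token)
    return corrected

# Hypothetical sample data for illustration only.
ingredients_en = ["sugar", "wheat flour", "palm oil", "salt"]
print(correct_tokens(["sugor", "wheat fl0ur", "xyz"], ingredients_en))
```

A lower `cutoff` corrects more tokens but risks replacing a correct but unknown ingredient with a wrong one, so the threshold would need tuning against real data.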
Testing
Create a golden set of products that are complete
- Product
  - Category: "Ingredients complete" "Ingredient images selected"
  - Get the ingredients image
  - Get the canonical ingredient list (typed by contributors)
  - Get the ingredient list generated by the current OCR system
  - Generate the ingredient list on your laptop from the image and the custom dictionary above
  - Compare the result with the canonical (golden) list and report some accuracy measures
- Draft Script: https://lite6.framapad.org/p/OFF_OCR_Script
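
The comparison step above could be sketched as follows. This is not the draft script linked above; it is a hedged illustration of two possible accuracy measures (character-level similarity, and ingredient-level precision and recall), assuming ingredient lists are comma-separated. All function names are hypothetical.

```python
import difflib

def normalize(text):
    """Split a comma-separated ingredient list into lowercase, trimmed items."""
    return [part.strip().lower() for part in text.split(",") if part.strip()]

def char_similarity(ocr_text, golden_text):
    """Character-level similarity in [0, 1] between OCR output and the golden list."""
    return difflib.SequenceMatcher(None, ocr_text.lower(), golden_text.lower()).ratio()

def ingredient_scores(ocr_text, golden_text):
    """Return (precision, recall) over exact ingredient matches."""
    ocr = set(normalize(ocr_text))
    golden = set(normalize(golden_text))
    hits = len(ocr & golden)
    precision = hits / len(ocr) if ocr else 0.0
    recall = hits / len(golden) if golden else 0.0
    return precision, recall

# Hypothetical example: one ingredient misread, one missed entirely.
golden = "Sugar, wheat flour, palm oil, salt"
ocr = "sugar, wheat fl0ur, palm oil"
precision, recall = ingredient_scores(ocr, golden)
print(f"precision={precision:.2f} recall={recall:.2f} similarity={char_similarity(ocr, golden):.2f}")
```

Exact matching is deliberately strict here: a single misread character counts as a miss, which makes improvements from dictionary correction visible in the scores.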
Easy wins
- Process all images and make products searchable, even if not filled yet
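
To illustrate how raw OCR text could make unfilled products searchable, here is a minimal in-memory inverted index. It is only a sketch: a real deployment would presumably rely on the existing search backend, and the barcodes, texts, and function names below are hypothetical.

```python
import re
from collections import defaultdict

def build_index(ocr_by_barcode):
    """Map each lowercase word found in OCR text to the set of barcodes containing it."""
    index = defaultdict(set)
    for barcode, text in ocr_by_barcode.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(barcode)
    return index

def search(index, query):
    """Return barcodes whose OCR text contains every word of the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# Hypothetical OCR output keyed by placeholder barcodes.
ocr_by_barcode = {
    "0000000000001": "Ingredients: sugar, palm oil, hazelnuts",
    "0000000000002": "Ingredients: wheat flour, sugar, salt",
}
index = build_index(ocr_by_barcode)
print(sorted(search(index, "sugar")))
print(sorted(search(index, "palm oil")))
```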
Long-term goals
- Get dictionary translations from Wikidata
- Investigate OCRopus for extracting complex layouts
- Investigate OpenCV for detecting patterns, logos…
Targets
- Logos of brands (getting them from POD?)
- Logos of labels (standardized)
- Text (distorted, e.g. on a bottle; printed diagonally; in low or bright light)
- Standardized layouts (US Nutrition labels)
Store in separate image for further reference
- Standardized text (quantities, EU Packaging codes)
- Barcodes (extraction in uploaded images)
- Image orientation: use the orientation of the detected text to infer whether the image itself is correctly oriented.
- Deep Learning
  - Product photo on packaging: guess the category based on the product picture
  - Container: guess whether it's a bottle, cardboard…
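
The image orientation idea in the list above can be sketched as follows: run OCR on the image at several rotations, then keep the rotation whose text contains the highest share of dictionary words. The OCR step itself is left out; the sketch assumes its output is already available as a mapping from rotation angle to text, and all names and sample strings are hypothetical.

```python
import re

def dictionary_score(text, dictionary):
    """Fraction of alphabetic words in `text` that appear in `dictionary`."""
    # [^\W\d_]+ matches runs of letters only (no digits or underscores).
    words = re.findall(r"[^\W\d_]+", text.lower())
    if not words:
        return 0.0
    return sum(word in dictionary for word in words) / len(words)

def guess_rotation(ocr_text_by_rotation, dictionary):
    """Return the rotation angle whose OCR text best matches the dictionary."""
    return max(
        ocr_text_by_rotation,
        key=lambda angle: dictionary_score(ocr_text_by_rotation[angle], dictionary),
    )

# Hypothetical OCR output for the same image rotated by 0, 90, 180 and 270 degrees.
dictionary = {"sugar", "salt", "flour"}
ocr_text_by_rotation = {
    0: "5u9ar 1les",
    90: "ll ii",
    180: "sugar salt",
    270: "",
}
print(guess_rotation(ocr_text_by_rotation, dictionary))
```

When several rotations score equally, `max` returns the first one in the mapping, so a real implementation would likely want a minimum score before trusting the guess.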
Extracting areas is already valuable on its own: if we can isolate logos or patterns, humans can double-check them and turn them into text faster.