Artificial Intelligence/Robotoff/Roadmap
Latest revision as of 14:27, 23 February 2023
2021 Roadmap draft
"More themes for detections, more efficient detections, with more techniques, more broadly automatically applied, and more broadly distributed and validated by contributors"
Pre-requisites
Developer experience
- Improve the installation documentation, including how to simulate a Product Opener instance sending data to and receiving data from Robotoff
- Make it easy to assess the impact of a regex on existing OCR results (this need could instead be covered by the self-service regex system)
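As an illustration of what assessing the impact of a regex could look like, here is a minimal Python sketch. It assumes the OCR results are already available as a list of plain strings; the function name `regex_impact` and the sample texts are hypothetical and not part of Robotoff.

```python
import re
from collections import Counter


def regex_impact(pattern, ocr_texts, max_samples=3):
    """Count how many OCR texts a candidate regex matches and sample the matches.

    `ocr_texts` is assumed to be a list of plain-text OCR results.
    """
    compiled = re.compile(pattern, re.IGNORECASE)
    matched_docs = 0
    samples = Counter()
    for text in ocr_texts:
        hits = [m.group(0) for m in compiled.finditer(text)]
        if hits:
            matched_docs += 1
            # Normalise case so that identical matches are counted together.
            samples.update(hit.lower() for hit in hits)
    return {
        "documents": len(ocr_texts),
        "matched": matched_docs,
        "top_matches": samples.most_common(max_samples),
    }


# Hypothetical OCR snippets, for illustration only.
ocr_texts = [
    "Sans gluten - AFDIAG",
    "Ingredients: wheat flour, sugar",
    "GLUTEN FREE certified product",
]
report = regex_impact(r"gluten[\s-]?free|sans gluten", ocr_texts)
print(report["matched"], "of", report["documents"], "texts matched")
```

Reporting both the number of matching texts and the most frequent matched strings helps spot a regex that is too broad before it is turned into a detection.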
Solve current performance issues
- Resolve performance issues on Robotoff and Hunger Games to unlock usage
- New servers
- Ensure we can read/write on 2 different servers
- Performance issues (hypothesis: PeeWee random might be slowing down Hunger Games queries)
- Rebuild the monitoring system on Grafana for Robotoff (performance issues and downtime)
- Ensure we have current taxonomies loaded
- Create a test that robotoff-app is able to write (https://github.com/openfoodfacts/robotoff/issues/214)
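The hypothesis above about random selection can be explored with a small experiment. The sketch below uses Python's built-in sqlite3 module rather than Robotoff's actual database or PeeWee models, and the `insight` table and its columns are hypothetical. It contrasts ordering the whole table randomly with picking a random pivot on the primary key, which only needs an index lookup.

```python
import random
import sqlite3

# In-memory stand-in for an insights table (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE insight (id INTEGER PRIMARY KEY, barcode TEXT)")
conn.executemany(
    "INSERT INTO insight (barcode) VALUES (?)",
    [(f"{n:013d}",) for n in range(1000)],
)


def random_row_full_sort(conn):
    """Random pick via ORDER BY RANDOM(): every row gets a random sort key."""
    return conn.execute(
        "SELECT id, barcode FROM insight ORDER BY RANDOM() LIMIT 1"
    ).fetchone()


def random_row_by_id(conn, rng=random):
    """Random pick via a random pivot on the primary key (index lookup)."""
    low, high = conn.execute("SELECT MIN(id), MAX(id) FROM insight").fetchone()
    if low is None:  # empty table
        return None
    pivot = rng.randint(low, high)
    return conn.execute(
        "SELECT id, barcode FROM insight WHERE id >= ? ORDER BY id LIMIT 1",
        (pivot,),
    ).fetchone()


print(random_row_full_sort(conn))
print(random_row_by_id(conn))
```

The pivot approach is not perfectly uniform when ids have gaps, since rows that follow a gap are picked more often, so it is a trade-off to measure rather than a drop-in replacement.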
More detections
There are many things to detect on packaging. While we can add many individual detections (which is fine; this project has some good, impactful examples), we should aim to build general-purpose systems that scale across languages and kinds of detection (e.g. a system able to detect any kind of recurring text pattern).
Put an initial version of nutrition table extraction in production
- Put nutrition table extraction in production, with website and app integration
Integrate spellcheck into Product Opener
- Integrate the spellcheck into Product Opener (already available in the Robotoff API)
Categories and ingredients
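- Categories
  - Generate new categories by automatically clustering similar products (https://github.com/openfoodfacts/openfoodfacts-ai/issues/6)
- Ingredients
  - Check if the ingredient list matches the Google OCR result (https://github.com/openfoodfacts/robotoff/issues/155)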
Other
- Integrate Spanish packager code detection (partial PR available: https://github.com/openfoodfacts/robotoff/pull/234)
- "Is it a cosmetic product?" (yes/no; move to Open Beauty Facts (OBF) if yes), building on the detection already in place, which relies on cosmetic-specific keywords
- Detect gluten-free certification: detect AFDIAG products (https://github.com/openfoodfacts/robotoff/issues/249)
Support quality
(Probably doable in Q2, after the Wild School prototype completes)
- Leverage the historization databases to:
  - Score edits (bad edits compared to the category, or to some labelled dataset of bad edits)
  - Score contributors (bad edits, likely a producer)
Support the Eco-Score
Extract the various variables of the Eco-Score:
- packaging type and shape
- origins of ingredients (inside or outside ingredient lists)
- specific labels (as image or text)
  - Improve support for them, find techniques to improve detection, create an Eco-Score page in Hunger Games
Distribution
Increase insight validation
Increase insight validation, either by:
- increasing distribution (through apps, 3rd party apps, games…)
- increasing the ability to validate them automatically
  - More aggressive application of many insights
    - EMLyon project
      - Apply top-level categories for the Nutri-Score
      - Secondary goal: OCR-based category prediction
Hunger Games
- Bring the logos and labels games to the next level
- Bind the nutrition game to the nutrition prediction
- Add a nudge to log in to Open Food Facts
- Add a mission page (could be a link to a wiki page)
Speculative projects
- Auto-extract recurring patterns of text, and create rules to populate fields from them (or create insights) (Le Wagon project)
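One simple way to approach recurring-pattern extraction is to count word n-grams that appear in several OCR texts. The sketch below is only an assumption about how such a prototype could start, not the Le Wagon project's method; `recurring_ngrams` and the sample texts are hypothetical.

```python
from collections import Counter


def recurring_ngrams(ocr_texts, n=3, min_docs=2):
    """Return word n-grams that occur in at least `min_docs` different texts."""
    doc_counts = Counter()
    for text in ocr_texts:
        tokens = text.lower().split()
        # Use a set so each n-gram is counted once per text.
        ngrams = {
            " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
        }
        doc_counts.update(ngrams)
    return [
        (gram, count)
        for gram, count in doc_counts.most_common()
        if count >= min_docs
    ]


# Hypothetical OCR snippets, for illustration only.
ocr_texts = [
    "Best before end: 2021",
    "best before end: see lid",
    "Store in a cool dry place",
]
patterns = recurring_ngrams(ocr_texts)
print(patterns)
```

Frequent n-grams found this way are only candidates: a contributor would still need to review them before turning one into a rule that populates a field.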
Next-gen Robotoff and high impact tooling
- Secondary insights
- Composite insights - "IF & IF THEN" insights (https://github.com/openfoodfacts/robotoff/issues/307)
- Add a self-service Elasticsearch interface that lets power users create new regex rules: search all OCR results for a keyword or pattern to estimate the potential match volume and refine the regex, then use a rule-building interface that saves the detection in a new database table that Robotoff will then use
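To make the "IF & IF THEN" idea concrete, here is a minimal Python sketch of a composite rule that produces a new insight only when all of its conditions are already present. The `Insight` and `CompositeRule` classes and the tag values are hypothetical; they do not reflect Robotoff's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insight:
    type: str
    value: str


@dataclass(frozen=True)
class CompositeRule:
    conditions: tuple  # Insight objects that must all be present
    result: Insight    # insight derived when every condition holds


def apply_rules(insights, rules):
    """Return the insights derived by rules whose conditions are all met."""
    present = set(insights)
    return [
        rule.result
        for rule in rules
        if all(condition in present for condition in rule.conditions)
    ]


# Hypothetical rule: IF label is organic AND category is milks
# THEN suggest the organic milks category.
rule = CompositeRule(
    conditions=(Insight("label", "en:organic"), Insight("category", "en:milks")),
    result=Insight("category", "en:organic-milks"),
)
product_insights = [Insight("label", "en:organic"), Insight("category", "en:milks")]
derived = apply_rules(product_insights, [rule])
print(derived)
```

Keeping rules as plain data like this would also fit the self-service idea, since rules stored in a database table could be loaded into the same structure.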