Artificial Intelligence/Robotoff/Roadmap
Revision as of 19:27, 1 February 2021
2021 Roadmap draft
“More themes for detections, more efficient detections, with more techniques, more broadly automatically applied, and more broadly distributed and validated by contributors”
Developer experience
- Better document installation, including how to simulate Product Opener sending data to and receiving data from Robotoff
- Make it easy to assess the impact of a regex on existing OCRs (or this need could be covered by the self-service regex system)
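As a rough illustration of what "assessing the impact of a regex" could look like, here is a minimal Python sketch. It assumes the OCR texts are already available as plain strings keyed by an image identifier; the function name `regex_impact`, the sample texts, and the pattern are hypothetical and are not part of Robotoff.

```python
import re
from typing import Dict


def regex_impact(pattern: str, ocr_texts: Dict[str, str], max_samples: int = 3) -> dict:
    """Count how many OCR texts a regex matches and keep a few sample matches."""
    compiled = re.compile(pattern, re.IGNORECASE)
    matched_ids = []
    samples = []
    for image_id, text in ocr_texts.items():
        match = compiled.search(text)
        if match is None:
            continue
        matched_ids.append(image_id)
        if len(samples) < max_samples:
            samples.append(match.group(0))
    total = len(ocr_texts)
    return {
        "matched": len(matched_ids),
        "total": total,
        "ratio": len(matched_ids) / total if total else 0.0,
        "matched_ids": matched_ids,
        "samples": samples,
    }


# Hypothetical OCR extracts, for illustration only.
OCR_TEXTS = {
    "img-1": "Farine de blé, sucre. SANS GLUTEN",
    "img-2": "Ingredients: water, sugar",
    "img-3": "Gluten-free oats, salt",
}

report = regex_impact(r"sans gluten|gluten[- ]free", OCR_TEXTS)
print(report["matched"], "of", report["total"], "OCR texts match")
```

Reporting both the match ratio and a few matched strings lets a contributor judge coverage and spot false positives before a rule is deployed.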
Solve current performance issues
- Resolve performance issues on Robotoff and Hunger Games to unlock usage
- New servers
- Ensure we can read/write on 2 different servers
- Performance issues (hypothesis: random ordering in Peewee queries might be slowing down Hunger Games queries)
- Rebuild the monitoring system on Grafana for Robotoff (performance issues and downtime)
- Ensure we have current taxonomies loaded
- Create a test that checks that robotoff-app is able to write
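The random-ordering hypothesis above is unverified. The sketch below only illustrates, in general terms, why random ordering can be costly and what one alternative looks like. It uses the standard-library `sqlite3` module and a hypothetical `insight` table rather than Robotoff's actual database, models, or Peewee queries (in Peewee, random ordering is typically written as `order_by(fn.Random())`).

```python
import random
import sqlite3

# In-memory database with a hypothetical "insight" table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE insight (id INTEGER PRIMARY KEY, barcode TEXT)")
conn.executemany(
    "INSERT INTO insight (barcode) VALUES (?)",
    [(f"{n:013d}",) for n in range(1000)],
)


def pick_by_random_sort(conn, count):
    # The database assigns a random key to every candidate row and sorts
    # them all before applying the limit.
    return conn.execute(
        "SELECT id FROM insight ORDER BY RANDOM() LIMIT ?", (count,)
    ).fetchall()


def pick_by_random_ids(conn, count):
    # Draws random ids in Python, then fetches only those rows by primary key.
    # Assumes ids are dense; with gaps, fewer than `count` rows may come back.
    min_id, max_id = conn.execute("SELECT MIN(id), MAX(id) FROM insight").fetchone()
    wanted = random.sample(range(min_id, max_id + 1), count)
    placeholders = ",".join("?" * len(wanted))
    return conn.execute(
        f"SELECT id FROM insight WHERE id IN ({placeholders})", wanted
    ).fetchall()


print(pick_by_random_sort(conn, 5))
print(pick_by_random_ids(conn, 5))
```

Whether this pattern is the actual bottleneck would need to be confirmed by profiling the real queries.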
Put an initial version of nutrition table extraction in production
- Put the nutrition table in production, with a website and app integration
Integrate spellcheck into Product Opener
- Integrate the spellcheck into Product Opener (already available in the Robotoff API)
Add more detections
There are a lot of things to detect on packaging. While we can add many individual detections (and that is perfectly fine: there are some good and impactful examples in this project), we should aim to build general-purpose systems that scale across languages and kinds of detection (e.g. a system able to detect any kind of recurring text pattern).
- Integrate Spanish packager code detection (Partial PR available)
- "Is it a cosmetic product?" (yes/no; move to OBF if yes), building on the detection already in place, which relies on cosmetic-specific keywords
- Detect gluten-free certification
- Detect AFDIAG products
Speculative projects
- Being able to auto-extract recurring patterns of text, and create rules to populate fields from them (or create insights) (Le Wagon project)
Increase insight validation
Increase insight validation, either by
- increasing distribution (through apps, 3rd party apps, games…)
- increasing ability to validate them automatically
  - More aggressive application of many insights
    - EMLyon project
      - apply top level categories for the Nutri-Score
      - Secondary goal: OCR-based category prediction
Categories and ingredients
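- Categories
  - Generate new categories by automatically clustering similar products (https://github.com/openfoodfacts/openfoodfacts-ai/issues/6)
- Ingredients
  - Check if the ingredient list matches the Google OCR result (https://github.com/openfoodfacts/robotoff/issues/155)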
Support quality
(Probably doable in Q2, after the Wild School prototype completes)
- Leverage the historization databases to:
- Score edits (bad edits compared to the category, or to some labelled dataset of bad edits)
- Score contributors (bad edits, likely a producer)
Support the Eco-Score
Extract the various variables of the Eco-Score:
- packaging type and shape
- origins of ingredients (inside or outside ingredient lists)
- specific labels (as image or text)
- Improve support for them, find techniques to improve detection, create an Eco-Score page in Hunger Games
Hunger Games
- Bring the logos and labels games to the next level
- Bind the nutrition game to the nutrition prediction
- Add a nudge to log in to Open Food Facts
- Add a mission page (could be a link to a wiki page)
Next-gen Robotoff and high impact tooling
- Secondary insights
- Composite insights - "IF & IF THEN" insights
- Add a self-service Elasticsearch interface to let power users create new regex rules: search all the OCRs for a keyword or pattern to estimate the potential volume of matches and refine the regex, then use a rule-building interface that saves the detection in a new database table that Robotoff will then use
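
One possible reading of the composite "IF & IF THEN" insights listed above is a rule that fires only when all of its conditions are already known for a product. The Python sketch below shows that reading under stated assumptions: the `CompositeRule` class, the `apply_rules` function, and the insight values are all hypothetical and do not reflect Robotoff's data model.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class CompositeRule:
    """IF every condition is present THEN propose the conclusion."""
    conditions: FrozenSet[str]
    conclusion: str


def apply_rules(insights: Set[str], rules: List[CompositeRule]) -> Set[str]:
    """Return conclusions whose conditions are all present and that are not already known."""
    derived = set()
    for rule in rules:
        if rule.conditions <= insights and rule.conclusion not in insights:
            derived.add(rule.conclusion)
    return derived


# Hypothetical rule and insight values, for illustration only.
RULES = [
    CompositeRule(
        conditions=frozenset({"label:en:organic", "category:en:milks"}),
        conclusion="category:en:organic-milks",
    ),
]

known = {"label:en:organic", "category:en:milks"}
print(apply_rules(known, RULES))
```

Keeping rules as plain data (a set of conditions plus a conclusion) would also make them straightforward to store in a table, in the spirit of the rule-building interface described above.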