Automatic rule editing engine
Latest revision as of 11:27, 20 August 2024
Project 9: Build an automatic rule editing engine
Mockups
https://docs.google.com/presentation/d/1ZKov9S0XHY3J6Jn-Uk36-WcLI-FLXDqkMaUkS1uaRj4/edit#slide=id.ge6ba0caa2d_0_0
Description
Every day, Open Food Facts receives a lot of data from different sources, but the main contribution comes from users using a smartphone and contributing photos, data, answers to simple questions, etc. While this is a really efficient way to get data, quality is an important challenge. We have contributors capable of observing data, finding anomalies and applying corrections.
For example, one might observe that if a yogurt has sugar in it, it must belong to the “sweetened yogurt” category (and not only to the yogurt category).
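As a minimal sketch of what such a rule could look like, the yogurt example can be expressed as a condition plus an action over a product record. The `Rule` class, the helper functions, and the tag values (`en:yogurts`, `en:sugar`, `en:sweetened-yogurts`) are illustrative assumptions; the actual rule format is what this project is meant to define.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: if `condition` holds, `action` returns a corrected product."""
    name: str
    condition: Callable[[dict], bool]
    action: Callable[[dict], dict]


def has_tag(product: dict, field: str, tag: str) -> bool:
    """True if `tag` is present in the list stored under `field`."""
    return tag in product.get(field, [])


def add_tag(product: dict, field: str, tag: str) -> dict:
    """Return a copy of `product` with `tag` added to `field` (no duplicates)."""
    updated = dict(product)
    tags = list(product.get(field, []))
    if tag not in tags:
        tags.append(tag)
    updated[field] = tags
    return updated


# Assumed field names and tags, for illustration only.
sweetened_yogurt = Rule(
    name="sweetened-yogurt",
    condition=lambda p: has_tag(p, "categories_tags", "en:yogurts")
    and has_tag(p, "ingredients_tags", "en:sugar"),
    action=lambda p: add_tag(p, "categories_tags", "en:sweetened-yogurts"),
)


def apply_rule(rule: Rule, product: dict) -> dict:
    """Apply the rule if its condition matches; otherwise return the product as-is."""
    return rule.action(product) if rule.condition(product) else product
```

Because the action returns a copy and skips tags that are already present, the rule can be re-applied to every incoming edit without side effects.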
Many corrections can be applied as mass edits, and there is a dedicated tool for this. But a correction might be immediately overwritten by a new contribution, which is discouraging for advanced contributors. This one-shot modification process does not permit capitalizing on fixes.
While thinking about it with some advanced contributors, we imagined a solution based on rules that are applied automatically to the data. While the implementation of rule resolution and application in the server code is quite easy, the hard part is to have a tool for contributors to find and test rules before capitalizing on them.
Expected outcomes:
You will have to build a tool to give advanced contributors the ability to write rules that are applied automatically to data.
- The tool must help users write rules (either directly or maybe with an interface) and validate their syntactic correctness.
- It must provide a way to test the impact of a rule. The ability to cross-reference the matches (and non-matches) with a search, or to easily test different versions of the rule, might help contributors dig into the results and refine the rule.
- The tool must manage rule ownership and versioning.
- The tool can be online or standalone. It may use the API to measure the impact of a rule (a specific API can be developed by the team, if needed), or maybe query the MongoDB database directly, or if needed use a specific database (but synchronization would have to be considered).
- Optionally, gamification might be considered. We have a gamification service to collect achievements, and rule impact could be registered.
- Optionally, we could consider whether the rule building service might also become, in the future, the service that applies rules to incoming edits.
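The first two outcomes above (syntax validation and impact testing) could be prototyped roughly as follows. This is only a sketch under stated assumptions: the rule is a hypothetical dictionary with `field`, `pattern` and `add_category` keys, and the products are an in-memory sample rather than results from the API or the database.

```python
import re

# Hypothetical rule schema, for illustration only.
REQUIRED_KEYS = {"field", "pattern", "add_category"}


def validate_rule(rule: dict) -> list:
    """Return a list of syntax errors; an empty list means the rule is well-formed."""
    errors = []
    missing = REQUIRED_KEYS - set(rule)
    if missing:
        errors.append(f"missing keys: {sorted(missing)}")
    pattern = rule.get("pattern")
    if isinstance(pattern, str):
        try:
            re.compile(pattern)
        except re.error as exc:
            errors.append(f"invalid regular expression: {exc}")
    elif "pattern" in rule:
        errors.append("pattern must be a string")
    return errors


def measure_impact(rule: dict, products: list) -> dict:
    """Split product codes into those the rule would change, those already
    correct, and those the rule does not match (the "non-results")."""
    regex = re.compile(rule["pattern"], re.IGNORECASE)
    report = {"would_change": [], "already_ok": [], "no_match": []}
    for product in products:
        text = str(product.get(rule["field"], ""))
        if not regex.search(text):
            report["no_match"].append(product["code"])
        elif rule["add_category"] in product.get("categories_tags", []):
            report["already_ok"].append(product["code"])
        else:
            report["would_change"].append(product["code"])
    return report
```

Keeping the non-matching products in the report lets a contributor inspect what a rule misses as well as what it changes, and compare two versions of a rule on the same sample.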
Skills required/preferred: Some prior experience with databases is necessary; MongoDB knowledge would be a plus. An understanding of tools such as regular expressions is also required.
- Slack channels: #product-opener
- Potential mentors: Alexandre Fauquette / Pierre Slamich
- Project duration: 350h
- Difficulty rating: Hard
Additional resources
https://github.com/openfoodfacts/robotoff/issues/226