Automatic rule editing engine: Difference between revisions

Latest revision as of 11:27, 20 August 2024

Project 9: Build an automatic rule editing engine

Mockups

https://docs.google.com/presentation/d/1ZKov9S0XHY3J6Jn-Uk36-WcLI-FLXDqkMaUkS1uaRj4/edit#slide=id.ge6ba0caa2d_0_0

Description

Every day, Open Food Facts receives a lot of data from different sources, but the main contribution comes from users using a smartphone and contributing photos, data, answers to simple questions, etc. While this is a really efficient way to get data, quality is an important challenge. We have contributors capable of observing data, finding anomalies and applying corrections.

For example, one might observe that if a yogurt has sugar in it, it must belong to the “sweetened yogurt” category (and not only to the yogurt category).
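To make the yogurt example concrete, here is a minimal sketch of how such a rule could be represented and applied to a single product. The `Rule` class is hypothetical and not part of any existing Open Food Facts code. The field names `categories_tags` and `ingredients_tags` and the tag values are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: if every condition holds, add a tag to a field."""

    name: str
    # Each condition is a (field, tag) pair: the tag must be present in that field.
    conditions: tuple
    target_field: str
    tag_to_add: str

    def matches(self, product: dict) -> bool:
        return all(tag in product.get(fld, []) for fld, tag in self.conditions)

    def apply(self, product: dict) -> dict:
        """Return a corrected copy of the product; the input is left untouched."""
        if not self.matches(product):
            return product
        updated = dict(product)
        tags = list(updated.get(self.target_field, []))
        if self.tag_to_add not in tags:
            tags.append(self.tag_to_add)
        updated[self.target_field] = tags
        return updated


# Assumed tag values, for illustration only.
SWEETENED_YOGURT = Rule(
    name="sweetened-yogurt",
    conditions=(
        ("categories_tags", "en:yogurts"),
        ("ingredients_tags", "en:sugar"),
    ),
    target_field="categories_tags",
    tag_to_add="en:sweetened-yogurts",
)

product = {
    "categories_tags": ["en:yogurts"],
    "ingredients_tags": ["en:milk", "en:sugar"],
}
fixed = SWEETENED_YOGURT.apply(product)
```

Because the rule is plain data rather than a one-off edit, it can be re-applied whenever a new contribution overwrites the category.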

Many corrections lead to mass edits, and there is a dedicated tool for those. But a correction can be overwritten immediately by a new contribution, which is discouraging for advanced contributors. This one-shot modification process does not let the project capitalize on fixes.

While thinking about it with some advanced contributors, we imagined a rule-based solution. While implementing rule resolution and application in the server code is fairly easy, the hard part is providing a tool that lets contributors find and test rules before capitalizing on them.

Expected outcomes:

You will have to build a tool to give advanced contributors the ability to write rules that are applied automatically to data.

The tool must help users write rules (either directly or through an interface) and validate their syntactic correctness.
It must provide a way to test the impact of a rule. The ability to cross-check the products a rule matches (and those it does not match) against a search, or to easily compare different versions of the rule, might help contributors dig into the results and refine the rule.
The tool must manage rule ownership and versioning.
The tool can be online or standalone. It may use the API to measure the impact of a rule (a specific API can be developed by the team, if needed), or query the MongoDB database directly, or if needed use a dedicated database (but synchronization would then have to be considered).
Optionally, gamification might be considered. We have a gamification service to collect achievements, and rule impact could be registered there.
Optionally, we could consider whether the rule-building service might also become, in the future, the service that applies rules to incoming edits.
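As a rough illustration of the first two outcomes (syntax validation and impact testing), the sketch below parses a rule written as JSON and performs a dry run over an in-memory sample of products. The rule format (`if_field`, `if_tag`, `then_field`, `then_add`), the function names, and the sample data are all hypothetical. A real tool would read products from the API or the database rather than from a list.

```python
import json

# Hypothetical rule format: one condition and one tag to add.
REQUIRED_KEYS = {"if_field", "if_tag", "then_field", "then_add"}


def parse_rule(text: str) -> dict:
    """Parse a JSON rule and check its shape; raise ValueError if it is invalid."""
    try:
        rule = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"rule is not valid JSON: {exc}") from exc
    if not isinstance(rule, dict):
        raise ValueError("rule must be a JSON object")
    missing = REQUIRED_KEYS - rule.keys()
    if missing:
        raise ValueError(f"rule is missing keys: {sorted(missing)}")
    return rule


def impact(rule: dict, products: list) -> dict:
    """Dry run: count the products the rule matches and those it would change."""
    matched = [p for p in products if rule["if_tag"] in p.get(rule["if_field"], [])]
    changed = [
        p for p in matched if rule["then_add"] not in p.get(rule["then_field"], [])
    ]
    return {
        "total": len(products),
        "matched": len(matched),
        "would_change": len(changed),
        "changed_codes": [p.get("code") for p in changed],
    }


# Made-up sample data, for illustration only.
sample = [
    {"code": "1", "ingredients_tags": ["en:sugar"], "categories_tags": ["en:yogurts"]},
    {
        "code": "2",
        "ingredients_tags": ["en:sugar"],
        "categories_tags": ["en:yogurts", "en:sweetened-yogurts"],
    },
    {"code": "3", "ingredients_tags": ["en:milk"], "categories_tags": ["en:yogurts"]},
]

rule = parse_rule(
    '{"if_field": "ingredients_tags", "if_tag": "en:sugar",'
    ' "then_field": "categories_tags", "then_add": "en:sweetened-yogurts"}'
)
report = impact(rule, sample)
```

Running `impact` with two versions of a rule over the same sample and comparing the `changed_codes` lists is one simple way to see what a refinement gains or loses.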

Skills required/preferred: Some prior experience with databases is necessary; MongoDB knowledge would be a plus. An understanding of things like regular expressions is also required.

Slack channels: #product-opener
Potential mentors: Alexandre Fauquette / Pierre Slamich
Project duration: 350h
Difficulty rating: Hard

Additional resources

https://github.com/openfoodfacts/robotoff/issues/226

@@ Line 1: / Line 1: @@
+[[Category:Project]]
+[[Category:Developer]]
+=== Project 9: Build an automatic rule editing engine ===
+==== Mockups ====
+* https://docs.google.com/presentation/d/1ZKov9S0XHY3J6Jn-Uk36-WcLI-FLXDqkMaUkS1uaRj4/edit#slide=id.ge6ba0caa2d_0_0
-=== Project 9: Build an automatic rule editing engine ===
+==== Description ====
-Description:
 Every day, Open Food Facts receives a lot of data from different sources, but the main contribution comes from users using a smartphone and contributing photos, data, answers to simple questions, etc. While this is a really efficient way to get data, quality is an important challenge. We have contributors capable of observing data, finding anomalies and applying corrections.
@@ Line 30: / Line 34: @@
-### Additional resources
+=== Additional resources ===
 https://github.com/openfoodfacts/robotoff/issues/226