Data quality: Difference between revisions
(Add section: Data quality issues which can't be fixed)
Revision as of 09:31, 6 December 2022
Some important things to know:
- Quality has no meaning in itself: it depends on how the data is used.
- No database can claim to be free of defects.
- With more than 2 600 000 products, quality issues are inevitable: our goal is to reduce their impact.
Data quality: how to help?
Nutrition values issues
Open Food Facts identifies some issues related to nutrition values. Some of them are very easy to solve:
- Energy value in kcal greater than in kJ
- Salt is higher than 100 g per 100 g
- Carbohydrate is higher than 100 g per 100 g
- Fat is higher than 100 g per 100 g
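The four checks above are simple enough to express as code. The following is a minimal sketch, not the checks Open Food Facts actually runs: the dictionary keys (`energy_kcal`, `energy_kj`, `salt`, `carbohydrates`, `fat`) and the function name are hypothetical and chosen only for illustration.

```python
# Hypothetical field names; the real Open Food Facts data model may differ.
def nutrition_issues(per_100g):
    """Return the obvious nutrition-value issues found for one product.

    `per_100g` maps a nutrient name to its value per 100 g of product.
    """
    issues = []

    kcal = per_100g.get("energy_kcal")
    kj = per_100g.get("energy_kj")
    # 1 kcal is about 4.184 kJ, so the kcal figure should not exceed the kJ figure.
    if kcal is not None and kj is not None and kcal > kj:
        issues.append("energy value in kcal greater than in kJ")

    # A nutrient cannot weigh more than the 100 g it is measured in.
    for nutrient in ("salt", "carbohydrates", "fat"):
        value = per_100g.get(nutrient)
        if value is not None and value > 100:
            issues.append(f"{nutrient} is higher than 100 g per 100 g")

    return issues
```

A product whose kcal and kJ values were swapped at entry time would be reported by the first check, which is usually fixed by swapping the two values back.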
Nutri-Score quality
Some products now have the Nutri-Score printed on the front of the pack. Some of these printed grades differ from the Nutri-Score we calculate. These cases need to be checked:
- Nutri-Score printed A but calculated E
- Nutri-Score printed A but calculated D
- Nutri-Score printed A but calculated C
- Nutri-Score printed E but calculated A
- Nutri-Score printed E but calculated B
- Nutri-Score printed E but calculated C
- Nutri-Score printed D but calculated A
- Nutri-Score printed D but calculated B
- Nutri-Score printed D but calculated C
- Nutri-Score printed B but calculated E
- Nutri-Score printed B but calculated D
- Nutri-Score printed B but calculated C
There are many reasons why it can differ:
- the label in Open Food Facts does not represent the label printed on the package (easy to solve)
- the label is correct, but our calculation doesn't provide the same result:
  - check the category,
  - then check the nutrition facts: the issue is sometimes the lack of "fiber" information or the lack of the "Fruits, vegetables, nuts and rapeseed, walnut and olive oils" percentage,
  - it can be a software issue (quite rare but possible).
Issue | Rationale | How to fix |
---|---|---|
The Nutri-Score displayed by the producer differs from the Nutri-Score Open Food Facts computes | The label in Open Food Facts does not represent the label printed on the package | Change the label |
Same issue | The category is wrong | Change the category and check whether the Nutri-Score changes |
Same issue | The nutrition facts are wrong | |
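The comparison between the printed grade and the calculated grade can be sketched as follows. This is an illustrative helper under stated assumptions, not the project's actual detection code: the function name and the idea of passing both grades as single letters are hypothetical.

```python
# The five Nutri-Score grades, kept in a set so that only single letters match.
GRADES = {"A", "B", "C", "D", "E"}


def nutriscore_mismatch(printed, calculated):
    """Describe the mismatch between two grades, or return None if they agree."""
    printed = printed.strip().upper()
    calculated = calculated.strip().upper()
    if printed not in GRADES or calculated not in GRADES:
        raise ValueError("grades must be one of A, B, C, D, E")
    if printed == calculated:
        return None
    return f"Nutri-Score printed {printed} but calculated {calculated}"
```

The returned text follows the wording of the issue list above, so a reviewer can then work through the rationales in the table: label, category, nutrition facts.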
Cleaning up the consequences of an old Android bug
The word "Loading…" replaced the correct product name. 99% of phones have been updated with the fix, but we still have some unfixed products.
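Finding the affected products comes down to spotting names that consist only of the placeholder. A minimal sketch, assuming product names are available as plain strings; the function name is hypothetical, and matching the three-dot spelling in addition to the single ellipsis character is an assumption made here to be safe.

```python
# "Loading…" is the placeholder described above; "Loading..." (three dots)
# is an assumed variant, included in case the ellipsis was stored differently.
PLACEHOLDERS = {"Loading…", "Loading..."}


def has_loading_placeholder(product_name):
    """Return True when the product name is just the 'Loading…' placeholder."""
    name = (product_name or "").strip()
    return name in PLACEHOLDERS
```

Only exact matches are reported, so a legitimate name that merely starts with "Loading" is left alone.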
Non-Food Products
Some people add products which are not food: beauty products, books, pet food, etc. These products have to be moved to the Open Food Facts side projects. Our AI (artificial intelligence) already identifies many cases. These cases are published in the #bot-image-alerts channel on our Slack workspace.
How to move these products?
- identify a product in the #bot-image-alerts channel
- click on the link after "edit:"
- if you have the rights to do so, you will see "If the barcode is not correct, please correct it here"
- enter "obf" to move beauty products to Open Beauty Facts
- enter "opff" to move products to Open Pet Food Facts
- enter "opf" to move products to Open Product Facts
- save (if the "A product already exists with the new code" message appears, move it manually, and delete it)
- in the #bot-image-alerts channel, annotate the product with a "checked" icon to tell others that the product has been moved
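The three codes used in the steps above can be summarised as a small lookup table. This is only a memory aid written as code: the codes and project names come from the list above, while the `destination` helper is hypothetical and does not call any Open Food Facts API.

```python
# Codes entered in the barcode field, and the side project each one targets.
MOVE_CODES = {
    "obf": "Open Beauty Facts",
    "opff": "Open Pet Food Facts",
    "opf": "Open Product Facts",
}


def destination(code):
    """Return the side project for a move code, or None if the code is unknown."""
    return MOVE_CODES.get(code.strip().lower())
```

Note that "opf" and "opff" differ by a single letter, so it is worth double-checking the code before saving.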
Data quality issues which can't be fixed
Some data quality issues can't be fixed, for various reasons. See the dedicated page: Data quality issues which can't be fixed.
Data quality measurement
We have started an initiative to continuously measure and publish data quality stats. We have created a page dedicated to these stats.
Helping technically with data quality
- If you have technical skills, you can also do your part for data quality. Head over to our tracking issue on GitHub.