m (Stephane moved page Student projects/2018/ENSAE data science to Student projects/2018/ENSAE Classification and Error Detection) |
(4 intermediate revisions by 2 users not shown) | |||
Line 10: | Line 10: | ||
Participants: | Participants: | ||
* | * Laetitia | ||
* Zoé | * Zoé | ||
Line 16: | Line 16: | ||
Most of the product photos and data on https://world.openfoodfacts.org are crowdsourced by users who scan product barcodes, use the Open Food Facts mobile app to send photos of the product, ingredients and nutrition facts, and use the Open Food Facts web site to input the corresponding data. This project aims to find errors in data entered by users, and to speed up data entry by automatically classifying products or making suggestions to contributors. | Most of the product photos and data on https://world.openfoodfacts.org are crowdsourced by users who scan product barcodes, use the Open Food Facts mobile app to send photos of the product, ingredients and nutrition facts, and use the Open Food Facts web site to input the corresponding data. This project aims to find errors in data entered by users, and to speed up data entry by automatically classifying products or making suggestions to contributors. | ||
=== Classification === | |||
Automatically classify products into product categories, brands and labels. | |||
=== Error detection === | === Error detection === | ||
Line 21: | Line 27: | ||
Find errors in values entered by users for the nutrition facts (energy, fat, carbohydrates, proteins, salt etc.) | Find errors in values entered by users for the nutrition facts (energy, fat, carbohydrates, proteins, salt etc.) | ||
== Data == | == Data == | ||
* https://world.openfoodfacts.org/data | |||
== Approaches, algorithms etc. == | == Approaches, algorithms etc. == | ||
(I just listed some initial ideas -- Stephane) | |||
=== Classification === | |||
* use words extracted from OCR on product photos to classify products into categories, brands and labels | |||
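One possible starting point for the OCR idea above is a bag-of-words classifier. The sketch below is only an illustration, not an agreed design: it assumes the OCR text is already available as a plain string per product, and it uses a small multinomial Naive Bayes model with add-one smoothing written with the standard library. The names `tokenize` and `NaiveBayesClassifier` are hypothetical and do not come from any Open Food Facts library.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the OCR text and split it into runs of letters."""
    return re.findall(r"[^\W\d_]+", text.lower())


class NaiveBayesClassifier:
    """Hypothetical multinomial Naive Bayes over OCR word counts."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.label_counts = Counter()            # label -> number of products
        self.vocabulary = set()

    def fit(self, texts, labels):
        """Count words per label from (OCR text, label) training pairs."""
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.label_counts[label] += 1
            self.vocabulary.update(tokens)

    def predict(self, text):
        """Return the label with the highest log-probability for the text."""
        tokens = tokenize(text)
        total_products = sum(self.label_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, product_count in self.label_counts.items():
            # Log prior of the label.
            score = math.log(product_count / total_products)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                # Add-one smoothing so unseen words do not zero out a label.
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

The same class could be trained separately for categories, brands and labels. Real OCR output is noisy, so spelling normalisation and a larger model would probably be needed in practice.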
=== Error detection === | |||
* cluster products by category to find outliers | |||
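As a rough illustration of the outlier idea above, the sketch below groups products by an existing category value (rather than running a clustering algorithm) and flags values that sit far from the category median. It swaps in a modified z-score based on the median absolute deviation, which is one robust choice among many. The function name `find_outliers`, the dictionary keys `code`, `category` and `sugars_100g`, and the 3.5 threshold are assumptions made for the example.

```python
from collections import defaultdict
from statistics import median


def find_outliers(products, field, threshold=3.5):
    """Return products whose `field` value is far from their category median."""
    by_category = defaultdict(list)
    for product in products:
        # Skip products that have no value for this nutrient.
        if product.get(field) is not None:
            by_category[product["category"]].append(product)

    outliers = []
    for items in by_category.values():
        values = [p[field] for p in items]
        med = median(values)
        # Median absolute deviation: a spread estimate that is not
        # dominated by the very outliers we are looking for.
        mad = median([abs(v - med) for v in values])
        if mad == 0:
            continue  # No spread in this category, nothing to compare against.
        for p in items:
            # 0.6745 rescales the MAD so the score is comparable to a
            # z-score when the data is roughly normal.
            score = 0.6745 * (p[field] - med) / mad
            if abs(score) > threshold:
                outliers.append(p)
    return outliers
```

A typical data entry error such as a misplaced decimal point (106 g of sugar per 100 g instead of 10.6 g) stands out clearly against the other products of the same category with this kind of score.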
== Measuring results == | |||
== Source code == | |||
Open Food Facts uses Github for source control: https://github.com/openfoodfacts | |||
We have some Python clients and libraries: | |||
* https://github.com/openfoodfacts/openfoodfacts-python | |||
* https://github.com/openfoodfacts/OpenFoodFacts-APIRestPython | |||
== Misc. == | |||
* https://en.wikipedia.org/wiki/Weka_(machine_learning) | |||
[[Category:Data quality]] | |||
[[Category:Machine learning]] | |||
[[Category:Robotoff]] |