
Student projects/GSOC/Proposals: Difference between revisions

This page lists the key areas where we need the most help. You are of course welcome to propose other project ideas, and we are looking forward to discussing these ideas and yours.


* To discuss ideas, please join us on our Slack, #summerofcode channel: https://openfoodfacts.slack.com/messages/summerofcode
* To get an instant invite to our Slack: https://slack-ssl-openfoodfacts.herokuapp.com/


= Google Summer of Code 2018 Project ideas =
 
== Improve New Native Android and iOS apps to drive mass adoption and mass contribution ==


Why it's important: most of the data in the Open Food Facts database come from crowdsourcing through mobile apps: users scan barcodes of products and send us photos and data for missing products. We need Android and iOS apps that bring a lot of value to users so that we gain mass adoption, and that have powerful features to contribute photos and data as easily and quickly as possible.
Key features needed:


=== Augmented reality and continuous scan ===


* Users need to be able to use the viewfinder of their camera to continuously scan for barcodes of products
* Stretch goal: recognize products without scanning barcode, using technologies like Pastec


=== Offline mode ===


* A small version of the database needs to be included in the app (at install, and then synched regularly)
** Photos should be sent when network becomes available
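
The "send photos when the network becomes available" behaviour could be sketched as a small pending-upload queue. This is an illustration in Python rather than app code: <code>PendingUploadQueue</code>, <code>is_online</code> and <code>upload</code> are hypothetical names, and a real app would also need to persist the queue across restarts.

```python
from collections import deque
from typing import Callable, Deque, List


class PendingUploadQueue:
    """Holds photo paths until the network is available (hypothetical helper)."""

    def __init__(self, is_online: Callable[[], bool], upload: Callable[[str], bool]):
        self._is_online = is_online  # assumed connectivity check
        self._upload = upload        # assumed uploader; returns True on success
        self._pending: Deque[str] = deque()

    def add(self, photo_path: str) -> None:
        """Queue a photo taken while offline (or online; flush decides when to send)."""
        self._pending.append(photo_path)

    def flush(self) -> List[str]:
        """Try to send queued photos in order; stop at the first failure."""
        sent: List[str] = []
        while self._pending and self._is_online():
            photo = self._pending[0]
            if not self._upload(photo):
                break  # keep the photo queued and retry on the next flush
            self._pending.popleft()
            sent.append(photo)
        return sent

    def __len__(self) -> int:
        return len(self._pending)
```

Calling <code>flush()</code> whenever connectivity changes would keep contributions from being lost while the user is offline.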


=== Drip editing ===


* Every little helps. Drip editing means asking Open Food Facts users short questions about the product they are looking at, each of which should take a split second to answer. Put together, the answers help complete products more quickly, update existing products, and ensure quality. This project is about introducing drip editing in either the Android or the iOS version, in collaboration with the backend team.
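
One possible way to choose which question to ask is to look for the first field that is still empty on the product. The sketch below assumes hypothetical field names and question wording; the actual selection logic would be defined together with the backend team.

```python
from typing import Dict, Optional

# Hypothetical field names and question wording, for illustration only.
QUESTIONS = {
    "brands": "What brand is this product?",
    "categories": "Which category does this product belong to?",
    "labels": "Is this product organic?",
}


def next_question(product: Dict[str, str]) -> Optional[str]:
    """Return one short question for the first missing field, or None if complete."""
    for field_name, question in QUESTIONS.items():
        if not product.get(field_name):
            return question
    return None
```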


=== Personalisation and recommendations ===


* Users should be able to provide data about themselves (age, sex, weight, etc.), their dietary restrictions (e.g. allergens, vegan, religious), and their preferences (organic, no GMOs, no palm oil, etc.)
* Display product recommendations / alternatives that better match the user preferences
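
As a rough illustration of matching products to a user profile, restrictions can be treated as hard filters and preferences as a ranking score. The <code>Product</code> and <code>UserProfile</code> classes and their fields below are assumptions made for this sketch, not the app's data model.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Product:
    name: str
    allergens: Set[str] = field(default_factory=set)
    labels: Set[str] = field(default_factory=set)


@dataclass
class UserProfile:
    avoided_allergens: Set[str] = field(default_factory=set)  # hard restrictions
    preferred_labels: Set[str] = field(default_factory=set)   # soft preferences


def recommend(products: List[Product], user: UserProfile) -> List[Product]:
    """Drop products that violate a restriction, then rank by matched preferences."""
    allowed = [p for p in products if not (p.allergens & user.avoided_allergens)]
    # sorted() is stable, so products with equal scores keep their original order.
    return sorted(
        allowed,
        key=lambda p: len(p.labels & user.preferred_labels),
        reverse=True,
    )
```

Keeping restrictions separate from preferences means an allergen match can never be outweighed by a high preference score.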


== Computer vision ==


Why it's important: all product data comes from photos of the product and labels. Today most of this data is entered manually. In order to be able to scale, we need to extract more data from photos automatically.
Background: We currently only do basic OCR for ingredients. There is a lot of room for improvement.  


=== Improve OCR for ingredients ===


* Create golden test sets to measure accuracy of the current OCR and improvements
* Automatic cropping of ingredients lists
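
A golden test set needs a metric to compare OCR output against the hand-checked text. One common choice, used here as an assumption rather than the project's agreed metric, is the character error rate: the edit distance between OCR output and golden text, divided by the length of the golden text.

```python
from typing import List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a rolling row of the DP table."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def character_error_rate(pairs: List[Tuple[str, str]]) -> float:
    """pairs holds (ocr_output, golden_text) tuples; lower is better."""
    total_edits = sum(edit_distance(ocr, gold) for ocr, gold in pairs)
    total_chars = sum(len(gold) for _, gold in pairs)
    return total_edits / total_chars if total_chars else 0.0
```

Running the same golden set before and after a change gives a single number for judging whether the change improved the OCR.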


=== OCR for Nutrition Facts tables ===


* Automatic recognition and cropping of nutrition facts table
* OCR for the nutrition facts table


=== Brands and labels detection ===


* Automatically recognize brands and labels




== Data science ==


Why it's important: our product database is growing rapidly (10k new products every month in early 2018), so we need automated ways to extract and validate data.
Background: to date, we have done very little in this area


=== Automatically classify products ===


* Detect field values from other field values or bag of words from the OCR
* When less certain, we can ask users to confirm suggestions
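
The idea of applying confident suggestions automatically and asking users about the rest could look roughly like this. The keyword lists, the <code>suggest_category</code> function, and the 0.8 threshold are all hypothetical; a real classifier would be trained on labelled products rather than hand-written keywords.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical keyword lists, for illustration only.
CATEGORY_KEYWORDS: Dict[str, List[str]] = {
    "beverages": ["water", "juice", "soda"],
    "dairies": ["milk", "yogurt", "cheese"],
}


def suggest_category(ocr_text: str, auto_threshold: float = 0.8) -> Tuple[str, float, str]:
    """Return (category, confidence, action); action is 'apply', 'ask_user' or 'skip'."""
    words = Counter(ocr_text.lower().split())  # bag of words from the OCR text
    scores = {
        category: sum(words[keyword] for keyword in keywords)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }
    total = sum(scores.values())
    if total == 0:
        return ("", 0.0, "skip")  # no evidence at all: suggest nothing
    best = max(scores, key=scores.get)
    confidence = scores[best] / total
    action = "apply" if confidence >= auto_threshold else "ask_user"
    return (best, confidence, action)
```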


=== Automatically detect errors ===


* Bad nutrition facts
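
Some bad nutrition facts can be caught with simple consistency rules, since certain values are impossible per 100 g. The sketch below uses hypothetical key names and only a few such rules; it is not an exhaustive or official list of checks.

```python
from typing import Dict, List


def nutrition_warnings(per_100g: Dict[str, float]) -> List[str]:
    """Flag values that cannot be right per 100 g. Key names are hypothetical."""
    warnings: List[str] = []

    def exceeds(part: str, whole: str) -> bool:
        # Only compare when both values were actually provided.
        return part in per_100g and whole in per_100g and per_100g[part] > per_100g[whole]

    if any(value < 0 for value in per_100g.values()):
        warnings.append("negative value")
    macros = sum(per_100g.get(key, 0.0) for key in ("fat", "carbohydrates", "proteins"))
    if macros > 100:
        warnings.append("macronutrients exceed 100 g per 100 g")
    if exceeds("sugars", "carbohydrates"):
        warnings.append("sugars exceed carbohydrates")
    if exceeds("saturated_fat", "fat"):
        warnings.append("saturated fat exceeds fat")
    return warnings
```

Products that trigger a warning could be queued for review instead of being corrected automatically.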




== Other projects ==


=== Taxonomy Editor ===


* We define and use multilingual taxonomies for categories, labels, ingredients and other fields.