GSOC/2024 ideas list

From Open Food Facts wiki

Here are ideas for GSoC. These are just ideas, and the list is not exhaustive.

IMPORTANT:

Server-side

Make the API re-user centric

Description

The Open Food Facts API is used by a wide variety of applications (more than 200 of them) that help people make better food choices.

It has grown organically over time, on a volunteer basis, and is sometimes messy and complicated to understand. This is a barrier to re-use and makes everyone lose time.

Expected outcomes

Propose a new API, limited to the most important items, that is OpenAPI-compatible, well designed and easy to understand.

The project could be implemented in one of two ways: either add a module to Product Opener that transforms data to fit the new API (in Perl), or create a proxy in front of the current API.
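As a rough illustration of the proxy option, the sketch below reshapes a legacy-style product response into a smaller, predictable structure. The function name `to_simplified_product`, the output keys and the sample data are hypothetical; the input field names (`status`, `code`, `product_name`, `brands`, `nutriscore_grade`) are assumed to match the current API and would need to be checked against its documentation.

```python
from typing import Any, Optional


def to_simplified_product(legacy: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Map a legacy-style response to a small, predictable shape.

    Returns None when the legacy payload reports that the product was not found.
    """
    # Assumption: the legacy payload signals success with status == 1.
    if legacy.get("status") != 1:
        return None
    product = legacy.get("product") or {}
    # Assumption: brands arrive as a single comma-separated string.
    raw_brands = product.get("brands") or ""
    brands = [b.strip() for b in raw_brands.split(",") if b.strip()]
    return {
        "barcode": legacy.get("code"),
        "name": product.get("product_name") or None,
        "brands": brands,
        "nutriscore": product.get("nutriscore_grade"),
    }


# Made-up sample payload, for illustration only.
legacy_response = {
    "code": "0000000000000",
    "status": 1,
    "product": {
        "product_name": "Example spread",
        "brands": "Brand A, Brand B",
        "nutriscore_grade": "c",
    },
}
simplified = to_simplified_product(legacy_response)
```

Keeping the transformation in one pure function like this would make it easy to test independently of whichever option (Perl module or proxy) ends up serving it.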

The project will be deployed as early as possible, then iteratively extended to provide a more complete API.

Along the way, the current API documentation should be improved, and a full OpenAPI-compliant specification for the new API must be written.

Project information

  • repository: https://github.com/openfoodfacts/openfoodfacts-server/
  • Slack channels: #productopener
  • Potential mentors: Stéphane, Alex
  • Project duration: 350 hours
  • Skills required: Perl (at least a minimal understanding), Python or JavaScript (for the proxy option)
  • Difficulty rating: Medium

Mobile-side

Scan products without barcodes

Description

The Open Food Facts mobile application can only scan products using barcodes. We would like to be able to identify products that do not have barcodes (e.g. fruit and vegetables).

Expected outcomes

This project will initially involve testing/benchmarking several solutions, before determining the best one.

  • Research models (YOLO, MLKit, MediaPipe, etc.)
  • Test and benchmark
  • Implement the selected model in the application
  • stretch goal: Complete recognition with suggestions to narrow the choice

Project information

Personal dashboards from scans

Description

We want users to be able to scan several products (e.g. their shopping list) and then generate reports (health, environment, etc.).

Expected outcomes

This project is the first step toward future projects (which you may start during the internship). The aim is to build solid but flexible foundations.

  • Scan multiple products and link them to a list automatically
  • Compute stats from one or more lists (locally)
  • Display statistics to the user with personal dashboards (locally)
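To make the "compute stats locally" step more concrete, here is a minimal sketch that aggregates Nutri-Score grades over a scanned list. It is written in Python for readability only (the app itself is built with Flutter); the function names and the `nutriscore_grade` key on each product are assumptions, and the sample list is made up.

```python
from collections import Counter
from typing import Iterable


def grade_distribution(products: Iterable[dict]) -> dict[str, int]:
    """Count products per grade; products without a grade count as 'unknown'."""
    counts = Counter((p.get("nutriscore_grade") or "unknown") for p in products)
    return dict(sorted(counts.items()))


def share_with_grade(products: list[dict], grades: set[str]) -> float:
    """Fraction of products whose grade is in `grades` (0.0 for an empty list)."""
    if not products:
        return 0.0
    hits = sum(1 for p in products if p.get("nutriscore_grade") in grades)
    return hits / len(products)


# Made-up shopping list, for illustration only.
shopping_list = [
    {"code": "001", "nutriscore_grade": "a"},
    {"code": "002", "nutriscore_grade": "c"},
    {"code": "003", "nutriscore_grade": "a"},
    {"code": "004"},  # no grade available for this product
]
distribution = grade_distribution(shopping_list)
good_share = share_with_grade(shopping_list, {"a", "b"})
```

The same pattern (a pure function from a list of products to a small summary) could feed any dashboard widget, whatever indicator is being summarised.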

Project information

Photo auto-crop and selection

Description

Product photos are one of the most important assets of the database. They are also the first contribution a user may make. They provide evidence for facts, and we make extensive use of OCR and machine learning to extract as much information as possible from them. Currently, when you take a photo, you first have to select its role and then carefully crop it. This slows down contribution quite a lot and may discourage users in the long run. We want to try to have cropping, and eventually selection, done automatically on device.

Expected outcomes

  • Use off-the-shelf image detection libraries and implement auto-crop for photos
  • Implement a batch photo acquisition mode
  • from there we can iterate on other features:
    • find a working heuristic to detect front images (or eventually train a model to detect them)
    • try to detect photo languages automatically
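One small piece of the auto-crop step can be sketched independently of the detection library chosen: turning a detected bounding box into a crop rectangle with a safety margin, clamped to the image bounds. The function name, the `(left, top, right, bottom)` pixel convention and the default padding are assumptions made for this illustration, written in Python for readability rather than in the app's own language.

```python
def padded_crop_box(
    box: tuple[int, int, int, int],
    image_size: tuple[int, int],
    padding: float = 0.05,
) -> tuple[int, int, int, int]:
    """Expand a detected box by `padding` (fraction of its size) on each side.

    `box` is (left, top, right, bottom) in pixels and `image_size` is
    (width, height). The result is clamped so it never leaves the image.
    """
    left, top, right, bottom = box
    width, height = image_size
    pad_x = round((right - left) * padding)
    pad_y = round((bottom - top) * padding)
    return (
        max(0, left - pad_x),
        max(0, top - pad_y),
        min(width, right + pad_x),
        min(height, bottom + pad_y),
    )


# Hypothetical detection result on a 1000x1080 photo.
crop = padded_crop_box((100, 50, 900, 1050), (1000, 1080), padding=0.1)
```

A margin of this kind is a hedge against detections that clip the edge of a label; the right value would have to come out of the benchmarking.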

Note that this project is as much about mobile development as it is about machine learning engineering.

Project information

  • repository: https://github.com/openfoodfacts/smooth-app
  • Slack channels: #mobile-app
  • Potential mentors: Raphaël, Pierre
  • Project duration: 350 hours
  • Skills required: Flutter (Dart), some machine learning
  • Difficulty rating: Medium

Tools

Help boost taxonomy contributions

Description

Taxonomies are at the heart of Open Food Facts in many respects. They help identify components (ingredients, labels, brands, …) and link them to useful properties, which underpin the Nutri-Score, the Eco-Score, allergen identification and other features. Working on them is a lesser-known but very important contribution area for the project.

Until now, contributors who wanted to improve a taxonomy had to edit a cumbersome flat file and open a pull request. That's not easy.

Taxonomy Editor comes to the rescue. While still in alpha stage, it should soon be deployed to production. Now it's time to add a lot of features to really help taxonomies grow rapidly in many languages.

Expected outcomes

The project will develop features that will help taxonomy contributors to adapt and edit the taxonomy.

  • a lot of checks: missing translations, duplicated synonyms, entries with a lot of children
  • enriching the search engine with useful filters
  • helpers to enrich taxonomy properties: links to wikidata, ciqual codes, etc.
  • dashboards at taxonomy level
  • exploration of the graph
  • suggestions or consistency checks by LLMs
  • tracking modifications of nodes to enable comparison with raw taxonomy
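The first item in the list (checks) lends itself to a short sketch. The code below runs three such checks over a hypothetical in-memory representation of a taxonomy: a dict mapping each entry id to its parents and to its names per language. This structure, the function names and the sample entries are assumptions for illustration only; the real project stores taxonomies in a graph database and would express these checks as queries.

```python
from collections import defaultdict


def missing_translations(taxonomy: dict, lang: str) -> list[str]:
    """Entry ids that have no name at all in the given language."""
    return sorted(eid for eid, e in taxonomy.items() if not e["names"].get(lang))


def duplicated_synonyms(taxonomy: dict) -> dict[tuple[str, str], list[str]]:
    """Map (language, name) to the entry ids sharing it, when more than one does."""
    owners = defaultdict(set)
    for eid, entry in taxonomy.items():
        for lang, names in entry["names"].items():
            for name in names:
                owners[(lang, name.strip().lower())].add(eid)
    return {key: sorted(ids) for key, ids in owners.items() if len(ids) > 1}


def entries_with_many_children(taxonomy: dict, threshold: int) -> list[str]:
    """Entry ids that are listed as a parent by at least `threshold` entries."""
    counts = defaultdict(int)
    for entry in taxonomy.values():
        for parent in entry["parents"]:
            counts[parent] += 1
    return sorted(p for p, n in counts.items() if n >= threshold)


# Made-up mini taxonomy, for illustration only.
taxonomy = {
    "en:fruits": {"parents": [], "names": {"en": ["fruits"], "fr": ["fruits"]}},
    "en:apples": {
        "parents": ["en:fruits"],
        "names": {"en": ["apples", "apple"], "fr": ["pommes"]},
    },
    "en:pears": {"parents": ["en:fruits"], "names": {"en": ["pears", "apple"]}},
}
no_french = missing_translations(taxonomy, "fr")
duplicates = duplicated_synonyms(taxonomy)
crowded = entries_with_many_children(taxonomy, threshold=2)
```

Here `en:pears` lacks a French name, "apple" is claimed by two entries, and `en:fruits` reaches the (deliberately tiny) children threshold.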

It will leverage the graph database as well as external APIs. You will develop iteratively (continuous deployment is already in place), getting immediate feedback from the community.

Project information

Your idea

If you are a candidate and have a specific project idea, that's really welcome.

But to maximize your chances, please:

  • Contribute to the project nonetheless during the community bonding period
  • Check with us that your idea is a good fit and aligns with our priorities
