GSOC/2024 ideas list

From Open Food Facts wiki
Revision as of 16:17, 6 February 2024 by Alex-off (talk | contribs) (added nutripatrol project)

Here are ideas for GSoC. These are just ideas, and the list is not exhaustive.

IMPORTANT:

Server-side

Make the API re-user centric

Description

The Open Food Facts API is used by a wide variety of applications (more than 200 of them) that help people make better food choices.

It has grown organically over time, on a volunteer basis, and is sometimes messy and complicated to understand. This is a barrier to re-use and makes everyone lose time.

Expected outcomes

Propose a new API, limited to the most important items, that is OpenAPI-compatible, well designed and easy to understand.

The project could be implemented in one of two ways: either add a module to Product Opener (in Perl) that transforms data to fit the new API, or create a proxy in front of the current API.

The project will be deployed as soon as possible and will iteratively provide a more complete API.

Along the way, the current API documentation should be improved, and a full OpenAPI-compliant specification for the new API must be written.
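
As a rough illustration of the proxy option, the core of such a layer could be a function that maps a verbose product record to a small, stable shape. The sketch below is only an assumption about what that might look like: the input keys (`code`, `product_name`, `brands`, `nutriscore_grade`) are assumed to be present in the current API response, and the function name and output shape are hypothetical, not a proposed specification.

```python
from typing import Any


def simplify_product(raw: dict[str, Any]) -> dict[str, Any]:
    """Map a verbose product record to a small, predictable shape.

    Both the input keys and the output shape are illustrative assumptions.
    """
    # Assumption: brands arrive as a single comma-separated string.
    brands = raw.get("brands") or ""
    return {
        "barcode": raw.get("code"),
        # Empty strings are normalised to None so clients get one "missing" value.
        "name": raw.get("product_name") or None,
        "brands": [b.strip() for b in brands.split(",") if b.strip()],
        "nutriscore": raw.get("nutriscore_grade"),
    }
```

Keeping the mapping in one pure function would make it easy to test and to document field by field in the specification, whichever of the two implementation options is chosen.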

Project information

  • repository: https://github.com/openfoodfacts/openfoodfacts-server/
  • Slack channels: #productopener
  • Potential mentors: Stéphane, Alex
  • Project duration: 350 hours
  • Skills required: Perl (at least a minimal understanding), Python or JavaScript (for the proxy option)
  • Difficulty rating: Medium

Mobile-side

Scan products without barcodes

Description

The Open Food Facts mobile application can only scan products using barcodes. We would like to be able to identify products that do not have barcodes (e.g. fruit and vegetables).

Expected outcomes

This project will initially involve testing/benchmarking several solutions, before determining the best one.

  • Research models (YOLO, MLKit, MediaPipe, etc.)
  • Test and benchmark
  • Implement the selected model in the application
  • stretch goal: Complete recognition with suggestions to narrow the choice

Project information

Personal dashboards from scans

Description

We want users to be able to scan several products (e.g. their shopping list) and then generate reports (health, environment, etc.).

Expected outcomes

This project is the first step towards future projects (which you may start during the internship). The aim is to build solid but flexible foundations.

  • Scan multiple products and link them to a list automatically
  • Compute stats from one or more lists (locally)
  • Display statistics to the user with personal dashboards (locally)
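
To make the "compute stats" step concrete, here is a minimal, language-agnostic sketch written in Python (the app itself is written in Flutter/Dart, so this is only an illustration of the logic). It assumes each scanned product is a dictionary with an optional `nutriscore_grade` key; the function name and that key are assumptions for the example.

```python
from collections import Counter
from typing import Iterable


def grade_distribution(
    products: Iterable[dict], key: str = "nutriscore_grade"
) -> dict[str, float]:
    """Return the share of each grade among the products that have one."""
    # Products without a grade are ignored rather than counted as a category.
    grades = [p[key] for p in products if p.get(key)]
    if not grades:
        return {}
    counts = Counter(grades)
    return {grade: count / len(grades) for grade, count in sorted(counts.items())}
```

Because the computation only needs the locally stored list, it can run entirely on the device, in line with the "locally" requirement above.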

Project information

Photo auto-crop and selection

Description

Product photos are one of the most important assets of the database. Taking a photo is also the first contribution a user may make. Photos give evidence for facts, and we make extensive use of OCR and machine learning to extract as much information as possible. Currently, when you take a photo, you first have to select its role and carefully crop it. This slows down contribution quite a lot and may discourage users in the long run. We want to try to have cropping, and eventually selection, done automatically, on device.

Expected outcomes

  • Use off-the-shelf image detection libraries to implement auto-crop for photos
  • Implement a batch photo acquisition mode
  • from there we can iterate on other features:
    • find a working heuristic to detect the front image (or eventually train a model to detect it)
    • try to detect photo languages automatically

Note that this project is as much about mobile development as about machine learning engineering.

Project information

  • repository: https://github.com/openfoodfacts/smooth-app
  • Slack channels: #mobile-app
  • Potential mentors: Raphaël, Pierre
  • Project duration: 350 hours
  • Skills required: Flutter (Dart), some machine learning
  • Difficulty rating: Medium

Tools

Help boost taxonomy contributions

Description

Taxonomies are at the heart of Open Food Facts in many respects. They help identify components (ingredients, labels, brands, …) and link them to useful properties, which underpin the Nutri-Score, the Eco-Score, allergen identification and other properties. This is a lesser-known but very important contribution area for the project.

Until now, contributors who wanted to contribute to the taxonomy had to edit a cumbersome flat file and open a pull request. That's not easy.

Taxonomy Editor comes to the rescue. While still in alpha stage, it should rapidly be deployed to production. Now it's time to add a lot of features to really help the taxonomies grow rapidly in many languages.

Expected outcomes

The project will develop features that will help taxonomy contributors to adapt and edit the taxonomy.

  • a lot of checks: missing translations, duplicated synonyms, entries with a lot of children
  • enriching the search engine with useful filters
  • helpers to enrich taxonomy properties: links to wikidata, ciqual codes, etc.
  • dashboards at taxonomy level
  • exploration of the graph
  • suggestions or consistency checks by LLMs
  • tracking modifications of nodes to enable comparison with raw taxonomy

It will leverage the graph database as well as external APIs. You will develop iteratively (continuous deployment is already there) getting immediate feedback from the community.
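
Two of the checks listed above (missing translations and duplicated synonyms) can be sketched in a few lines. The example below is a simplified illustration, not the Taxonomy Editor's data model: it assumes entries are held in memory as a mapping from entry id to a mapping from language code to a list of synonyms, whereas the real tool stores them in a graph database.

```python
from collections import defaultdict

# Hypothetical in-memory shape: {entry_id: {language_code: [synonyms]}}
Entries = dict[str, dict[str, list[str]]]


def missing_translations(entries: Entries, lang: str) -> list[str]:
    """Return the ids of entries that have no name in `lang`."""
    return sorted(eid for eid, names in entries.items() if not names.get(lang))


def duplicated_synonyms(entries: Entries, lang: str) -> dict[str, list[str]]:
    """Map each synonym used by more than one entry to the ids that use it."""
    owners: dict[str, set[str]] = defaultdict(set)
    for eid, names in entries.items():
        for synonym in names.get(lang, []):
            # Compare case-insensitively and ignore surrounding whitespace.
            owners[synonym.strip().lower()].add(eid)
    return {syn: sorted(ids) for syn, ids in owners.items() if len(ids) > 1}
```

For example, with an `en:sugar` entry named "sugar" and an `en:cane-sugar` entry that lists both "cane sugar" and "Sugar", the second function would flag "sugar" as shared by both entries.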

Project information

  • repository: https://github.com/openfoodfacts/taxonomy-editor/
  • Slack channels: #taxonomy-editor #taxonomy
  • Potential mentors: Alexandre F., Alex
  • Project duration: 150 hours or 350 hours
  • Skills required: ReactJS / Python / Neo4j
  • Difficulty rating: Medium

Empower users through easy dataviz

Description

We started a new project to boost search on Open Food Facts. This will help bring really useful information to the general public, but also to power users like researchers, journalists, hacktivists or food experts. Right now Open Food Facts can draw simple graphs that are already really useful for getting insights (for example, how ultra-processing relates to the Nutri-Score). We would like to use our new search engine to provide even more capabilities for data visualization. This tool will also be re-usable outside our project, as search-a-licious is a reusable, ready-to-use engine.

Expected outcomes

Have an API (on the search-a-licious side) and a JavaScript library (possibly web components) that make it easy to build graphics for exploring data. We should at least replicate current capabilities (through a far easier interface) to compare values on a set of products. We will add as many possibilities as we can afford. We will try our best not to re-invent the wheel and to adapt existing libraries to our needs (vega-lite might be a good candidate). The regular search-a-licious team will actively help with the API part.
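
As a hedged sketch of what the server side could hand to a Vega-Lite based front end, the function below builds a minimal bar-chart specification that counts products per value of a field. The function name and the idea of returning a ready-made specification from the API are assumptions for illustration only; the actual design is left to the project.

```python
def count_bar_spec(rows: list[dict], field: str) -> dict:
    """Build a minimal Vega-Lite spec counting rows per value of `field`."""
    return {
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        # Inline data keeps the example self-contained; a real API might
        # reference a URL instead.
        "data": {"values": rows},
        "mark": "bar",
        "encoding": {
            "x": {"field": field, "type": "nominal"},
            "y": {"aggregate": "count", "type": "quantitative"},
        },
    }
```

The returned dictionary can be serialized with `json.dumps` and rendered by any Vega-Lite compatible client, which is one way to avoid writing chart code from scratch.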

Project information

  • Slack channels: #search
  • Potential mentors: Alexandre F., Alex, Raphaël
  • Project duration: 150 hours or 350 hours
  • Skills required: Javascript (webcomponents) / Python
  • Difficulty rating: Medium

Help our moderators thanks to quick fix interfaces

Description

Like any crowdsourced project, Open Food Facts is at risk of vandalism or malicious attacks on its contents. We already have a team of moderators and they are doing a great job, but we would like to ease their task, both to coordinate their efforts and to make them more effective. We have just created a tool to help with this.

We would like to enhance this tool with specific interfaces that help fix recurring problems in a single interface and with as few actions as possible.

Expected outcomes

Create specific interfaces in NutriPatrol to fix recurring problems. Examples are: removing images, unselecting an image, moving a set of photos to the right language, smartly reverting a change, retrieving changes from a specific user for review, etc.

A discussion with the moderators will help find the most important interfaces to develop.

The project will use the Open Food Facts API to retrieve data on products and edit them. Interfaces will work browser side, ideally reusing and improving our JavaScript SDK.

Project information

  • Slack channels: #moderation-tool
  • Potential mentors: Alexandre F., Raphaël, Valentin
  • Project duration: 150 hours or 350 hours
  • Skills required: ReactJS / Python
  • Difficulty rating: Medium

Your idea

If you are a candidate and have a specific project idea, it is really welcome.

But to maximize your chances, please:

  • Contribute to the project nonetheless during the bonding period
  • Check with us that your idea is a good fit and aligns with our priorities

Project template (TO REMOVE)

<DESCRIPTIVE TITLE>

Description

Explain what, why.

Expected outcomes

Deliverables and KPI / benefits

Project information

  • repository:
  • Slack channels:
  • Potential mentors:
  • Project duration:
  • Skills required:
  • Difficulty rating: