Autonomous taxonomy service

From Open Food Facts wiki

Project 8: Autonomous taxonomy service

Description:

Taxonomies are at the heart of Open Food Facts in many respects. They help identify components (ingredients, labels, brands, …) and link them to useful properties, which underpin the Nutri-Score, the Eco-Score, allergen identification and other features.

Each taxonomy is a DAG (directed acyclic graph): an entry can have one or more parents.
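As an illustration of this structure, here is a minimal Python sketch of a taxonomy stored as a mapping from each entry to its direct parents. The entry identifiers and the function names `ancestors` and `is_acyclic` are made up for the example; they are not taken from the existing code base.

```python
from collections import deque

# Hypothetical toy taxonomy: each entry maps to the list of its direct parents.
PARENTS = {
    "en:plant-based-foods": [],
    "en:beverages": [],
    "en:plant-based-beverages": ["en:plant-based-foods", "en:beverages"],
    "en:soy-milks": ["en:plant-based-beverages"],
}


def ancestors(entry, parents=PARENTS):
    """Return every ancestor of `entry`, nearest first, without duplicates."""
    seen = []
    queue = deque(parents.get(entry, []))
    while queue:
        node = queue.popleft()
        if node not in seen:
            seen.append(node)
            queue.extend(parents.get(node, []))
    return seen


def is_acyclic(parents=PARENTS):
    """Check the DAG property: no entry may be its own ancestor."""
    return all(entry not in ancestors(entry, parents) for entry in parents)
```

Here `ancestors("en:soy-milks")` walks up through both parents of `en:plant-based-beverages`, which is what distinguishes a DAG from a simple tree.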

Currently each taxonomy is loaded in memory by the Perl application, which takes quite a lot of memory. Most of the time the structure is processed through a dedicated module, Tags.pm, which offers a basic API, but there are still many places where the structure is used directly.

Expected outcomes:

  • We would like to extract this part into an independent service. The mission is to build that service for taxonomies. It will offer a rich API that covers all current uses in a practical way.
  • The service will most likely rely on a database to ensure good performance and avoid reinventing the wheel. You will have to study which type of database fits best for the task.
  • You will have to read the current Perl code to understand the use cases for taxonomies (your mentor will help you with this task). You will then create a taxonomy service with a clean API, and implement usage of that API at least for the simplest use cases (the old code can be replaced progressively).
  • Having this independent service will help improve scalability, performance and maintainability, and may offer new opportunities for services around taxonomies.
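To make the idea of a "clean API" more concrete, here is a hedged Python sketch of what a small part of such a service could expose. Everything in it is an assumption made for illustration: the class name `TaxonomyService`, its methods, the sample entries, the `allergens` property, and the rule that an entry falls back to its nearest ancestor when it does not define a property itself. The real service would sit on top of whichever database is chosen, not a dictionary.

```python
class TaxonomyService:
    """Hypothetical in-memory stand-in for the future taxonomy service."""

    def __init__(self, parents, properties):
        self._parents = parents        # entry id -> list of direct parent ids
        self._properties = properties  # entry id -> {property name: value}

    def parents(self, entry):
        """Direct parents of an entry."""
        return list(self._parents.get(entry, []))

    def children(self, entry):
        """Direct children of an entry, sorted for stable output."""
        return sorted(e for e, ps in self._parents.items() if entry in ps)

    def get_property(self, entry, name):
        """Look the property up on the entry, then breadth-first on its ancestors.

        Falling back to ancestors is an assumption of this sketch.
        """
        queue, seen = [entry], set()
        while queue:
            node = queue.pop(0)
            if node in seen:
                continue
            seen.add(node)
            value = self._properties.get(node, {}).get(name)
            if value is not None:
                return value
            queue.extend(self._parents.get(node, []))
        return None


# Made-up sample data, for illustration only.
service = TaxonomyService(
    parents={
        "en:dairy": [],
        "en:milk": ["en:dairy"],
        "en:skimmed-milk": ["en:milk"],
    },
    properties={"en:milk": {"allergens": "en:milk"}},
)
```

With this data, `service.get_property("en:skimmed-milk", "allergens")` finds the value defined on `en:milk`, while the same lookup on `en:dairy` returns `None`. Callers that go through methods like these no longer need to touch the underlying structure directly.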

Technologies: You can freely choose which technology to use, in agreement with your mentor, but the ability of the contributor community to maintain it in the long run is an important criterion. Some understanding of Perl is necessary.

Skills: This project is not easy to bootstrap. The candidate should be able to cope with complexity, quickly understand the current code, and extract a possible model from it (the mentor will help).

  • Slack channels: #taxonomies
  • Potential mentors: Stéphane Gigandet
  • Project duration: 350 h
  • Difficulty rating: Hard