Nutrition facts table data extraction

From Open Food Facts wiki
Revision as of 11:41, 10 July 2020 by Raphael0202

Introduction

The Nutrition facts table data extraction project aims to automatically extract nutrition facts values from photos of nutrition facts tables.

Extracting nutrition facts from images would make it possible to fill in values when nutrition facts are missing and to check values entered by users.

Why it's very important

  • We need nutrition facts to compute nutritional quality with the Nutri-Score
  • It takes 2 minutes per product to type in nutrition values, a process that is also tedious and error-prone
    • 1M products × 2 minutes ≈ 33,000 hours, i.e. roughly 10 years of full-time work without weekends or vacations
  • Being able to add complete data for a product quickly is key to reaching a critical mass of products in new countries
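
The time estimate above can be checked with a short calculation. This is only a sketch: the 8-hour working day is an assumption, since the page does not say how long a working day is.

```python
products = 1_000_000
minutes_per_product = 2
hours_per_day = 8  # assumption: not stated on this page

total_hours = products * minutes_per_product / 60
working_days = total_hours / hours_per_day
years = working_days / 365  # every day worked: no weekends, no vacations

print(f"{total_hours:,.0f} hours, {working_days:,.0f} working days, {years:.1f} years")
```

Under that assumption the result is a little over 11 years, the same order of magnitude as the 10 years quoted above.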

Data, evaluation and test sets

Data sets

How the model fits in the OFF infrastructure

 

SVG source on Google Drawing

Approaches

Two approaches have been tested so far:

  • Regexes applied to the OCR output. A new /predict/nutrient endpoint has been added to Robotoff. This approach works best when nutritional information is not displayed in a table (text only).
  • A clustering approach that tries to estimate the number of columns and rows of the nutrition table using the K-means or DBSCAN algorithms. Not integrated into Robotoff.
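
To illustrate the regex approach, here is a minimal sketch of extracting nutrient values from plain OCR text. The nutrient names, label spellings and patterns are hypothetical and chosen for illustration only; they are not the patterns used by Robotoff.

```python
import re

# Hypothetical label spellings per nutrient (illustrative, not Robotoff's patterns).
NUTRIENT_LABELS = {
    "energy": r"energy|énergie",
    "fat": r"fat|matières grasses",
    "carbohydrate": r"carbohydrates?|glucides",
    "protein": r"proteins?|protéines",
    "salt": r"salt|sel",
}
UNITS = r"kj|kcal|mg|g"


def build_pattern(label_regex):
    # Label, optional colon, a number with "." or "," as decimal mark, then a unit.
    return re.compile(
        rf"\b(?:{label_regex})\b\s*:?\s*(\d+(?:[.,]\d+)?)\s*({UNITS})\b",
        re.IGNORECASE,
    )


def extract_nutrients(ocr_text):
    """Return {nutrient: (value, unit)} for the first match of each nutrient."""
    results = {}
    for name, label_regex in NUTRIENT_LABELS.items():
        match = build_pattern(label_regex).search(ocr_text)
        if match:
            value = float(match.group(1).replace(",", "."))
            results[name] = (value, match.group(2).lower())
    return results


text = "Energy: 1500 kJ, Fat 12,5 g, Proteins: 7.1 g, Salt 0.3g"
print(extract_nutrients(text))
```

This kind of pattern only works when the label and its value follow each other in the OCR text, which is consistent with the observation above that regexes work best outside of tables.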

Previous attempts

  • Scoring of nutrition table detection methods (PDF, docx)
  • In-depth analysis of nutrition table detection (PDF)

Other approaches

The previous attempts are not state of the art for the table extraction task. Other approaches we may consider:

  • Detection of the table structure: identifying rows, columns and cells of the table. Most of the time, rows correspond to nutrient labels (protein, carbohydrate, energy, ...) and columns to the reference quantity (per 100 g, per portion, % of daily intake). By using pattern matching, we can label the columns and rows and extract the values of the nutrition table per 100 g or per portion.
  • Using an end-to-end model to directly extract nutritional values. This may be done by scoring candidates for each field (protein_100g, protein_serving, carbohydrate_100g, ...) and selecting the highest-scoring token as the field value. This is similar to the Microsoft Form Recognizer API or the Google Document AI API.
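
The table structure approach can be sketched as follows, assuming the OCR returns each word with the centre of its bounding box. Rows are grouped with a simple vertical-gap rule (a deliberate simplification, not K-means or DBSCAN), and the `Word` class, the patterns and the tolerance value are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x: float  # horizontal centre of the OCR bounding box
    y: float  # vertical centre of the OCR bounding box


# Hypothetical patterns used to label columns and rows.
COLUMN_PATTERNS = {
    "100g": re.compile(r"100\s*g", re.IGNORECASE),
    "serving": re.compile(r"serving|portion", re.IGNORECASE),
}
ROW_PATTERNS = {
    "energy": re.compile(r"energy", re.IGNORECASE),
    "protein": re.compile(r"protein", re.IGNORECASE),
    "carbohydrate": re.compile(r"carbohydrate", re.IGNORECASE),
}
NUMBER = re.compile(r"^\d+(?:[.,]\d+)?$")


def group_rows(words, y_tolerance=10.0):
    """Group words whose vertical centre is close to the previous word's."""
    rows = []
    for word in sorted(words, key=lambda w: w.y):
        if rows and abs(word.y - rows[-1][-1].y) <= y_tolerance:
            rows[-1].append(word)
        else:
            rows.append([word])
    return [sorted(row, key=lambda w: w.x) for row in rows]


def extract_table(words):
    """Return values keyed as "<nutrient>_<column>", e.g. "protein_100g"."""
    rows = group_rows(words)
    # Find the column headers and remember their horizontal positions.
    columns = {}
    for row in rows:
        for word in row:
            for name, pattern in COLUMN_PATTERNS.items():
                if name not in columns and pattern.search(word.text):
                    columns[name] = word.x
    values = {}
    if not columns:
        return values
    for row in rows:
        label = next(
            (name for word in row for name, p in ROW_PATTERNS.items()
             if p.search(word.text)),
            None,
        )
        if label is None:
            continue  # header row or unrecognised row
        for word in row:
            if NUMBER.match(word.text):
                # Assign the number to the horizontally nearest column header.
                column = min(columns, key=lambda c: abs(columns[c] - word.x))
                values[f"{label}_{column}"] = float(word.text.replace(",", "."))
    return values
```

For example, a header row containing "100g" and "serving" followed by a row "Protein 7,1 2.1" at matching horizontal positions would yield `protein_100g` and `protein_serving` entries. Real photos add skew, multi-word cells and units, which this sketch ignores.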

Each approach requires different kinds of annotation, so choosing the most effective approach at the beginning of the project is crucial.
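
The selection step of the end-to-end approach (score every candidate for each field, keep the best one) can be sketched as below. The `Candidate` features and the `toy_score` function are hypothetical stand-ins: in a real system the score would come from a trained model, not from a hand-written rule.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str       # OCR token, e.g. "7.1"
    row_label: str  # nutrient label found on the same line (hypothetical feature)
    column: str     # nearest column header, "100g" or "serving" (hypothetical feature)


def toy_score(field, candidate):
    """Stand-in for a learned scoring model: one point per matching feature."""
    nutrient, column = field.rsplit("_", 1)
    return int(candidate.row_label == nutrient) + int(candidate.column == column)


def select_values(fields, candidates, score=toy_score):
    """For each field, keep the text of the highest-scoring candidate."""
    return {
        field: max(candidates, key=lambda c: score(field, c)).text
        for field in fields
    }


candidates = [
    Candidate("7.1", "protein", "100g"),
    Candidate("2.1", "protein", "serving"),
    Candidate("55", "carbohydrate", "100g"),
]
fields = ["protein_100g", "protein_serving", "carbohydrate_100g"]
print(select_values(fields, candidates))
```

With this formulation, annotation consists of marking which token holds the value of each field, rather than drawing rows and columns.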

Planning

  • Test an end-to-end model through an API as a starting point: Form Recognizer or the Google Document AI API. If the results are promising, we develop an in-house end-to-end model; otherwise, a layout model.
  • Thorough literature review
  • Architecture choice
  • Annotation campaign
  • Experimentation: model training and scoring
  • Integration into Robotoff/Hunger Games

Resources

  • Trove - a marketplace platform where AI developers who need photos can create projects requesting specific types of photos and crowdsource them from photo takers.
  • Lobe - train a custom machine learning model using a simple visual interface with no code
  • AI Builder - import your photos right away into a Power App (or a Power Automate flow) so you can use the output in any real world application.