Ingredients: Difference between revisions
Latest revision as of 07:45, 23 October 2023
Ingredients are among the most important data to collect, as they serve many purposes.
Uses
- Nova calculation
- Nutri-Score calculation: ingredients are used to calculate the proportion of fruit and nuts
- evaluation of some food preferences, such as vegetarian or vegan diets
- automatic identification of allergens (both substances and traces, see below)
- etc.
Help and discussion
Discussions related to ingredients take place in the #ingredients channel on our Slack space.
Filling ingredients data
This is done through an application. The steps are as follows:
- text acquisition
  - either the user types the ingredients
  - or
    - the user uploads an image, or selects an existing one
    - we send it to an OCR service (image to text)
    - we fill in the text, but ask the user to verify it and correct it if needed
- text parsing
  - when submitted, the text is parsed by the Product Opener code
  - this code relies heavily on the ingredients taxonomy to identify as many ingredients as possible
  - it also relies on country-specific conventions for writing ingredients (you may contribute code for your country, or report specific problems)
  - it also tries to distinguish between ingredients, origins, processing, quantities, and sub-ingredients (ingredients composing other ingredients)
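To give a feel for what the text parsing step has to untangle, here is a deliberately naive sketch in Python. It is not the Product Opener implementation (which also uses the taxonomy and country-specific rules); the function names and the dictionary keys `text`, `percent` and `ingredients` are assumptions chosen for illustration. It only separates ingredients on commas, extracts a percentage when one is written, and recurses into a parenthesised group of sub-ingredients.

```python
import re

# Matches a percentage such as "20%" or "12.5 %".
# Comma decimals are not handled in this sketch, since commas separate ingredients.
PERCENT_RE = re.compile(r"\s*(\d+(?:\.\d+)?)\s*%")


def split_top_level(text):
    """Split on commas that are not nested inside parentheses."""
    parts, current, depth = [], [], 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return [p for p in parts if p]


def parse_ingredients(text):
    """Return a nested list of dicts (hypothetical shape, for illustration only)."""
    parsed = []
    for part in split_top_level(text):
        head, inner = part, None
        open_idx = part.find("(")
        # Treat a trailing parenthesised group as the list of sub-ingredients.
        if open_idx != -1 and part.endswith(")"):
            head, inner = part[:open_idx], part[open_idx + 1:-1]
        entry = {}
        match = PERCENT_RE.search(head)
        if match:
            entry["percent"] = float(match.group(1))
            head = head[:match.start()] + head[match.end():]
        entry["text"] = head.strip()
        if inner:
            entry["ingredients"] = parse_ingredients(inner)
        parsed.append(entry)
    return parsed


result = parse_ingredients("Wheat flour 60%, chocolate 20% (cocoa mass, sugar), salt")
```

Real ingredient lists break these simple rules quickly (origins, processing terms, percentages inside parentheses, other separators), which is why the actual parser depends on the taxonomy and on per-country conventions.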
Database
In the database, the raw ingredients text is stored per language in ingredients_text_[language code].
ingredients contains all the information inferred by parsing, and ingredients_tags contains the ingredients taxonomy entries (and their parents) corresponding to the identified ingredients.
Other fields, such as allergens_from_ingredients or ingredients_analysis, are derived from this analysis.
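As a rough illustration of how these fields can be read together, the Python sketch below works on a hand-written product dictionary. The field names come from the description above; the sample values, the tag spellings and the two helper functions are invented for this example and are not taken from a real product or from an official client library.

```python
# Hypothetical product record; the values are made up for illustration.
product = {
    "ingredients_text_en": "Wheat flour, sugar",
    "ingredients_text_fr": "Farine de blé, sucre",
    # Assumed tag format: taxonomy entries plus their parents.
    "ingredients_tags": ["en:wheat-flour", "en:flour", "en:wheat", "en:sugar"],
}


def ingredients_text(product, lang):
    """Return the raw ingredients text for a language code, or None if absent."""
    return product.get(f"ingredients_text_{lang}")


def has_ingredient(product, tag):
    """Check a taxonomy tag against ingredients_tags.

    Because parents are included in ingredients_tags, a query for a parent
    entry also matches products that list a more specific child ingredient.
    """
    return tag in product.get("ingredients_tags", [])


french_text = ingredients_text(product, "fr")
contains_flour = has_ingredient(product, "en:flour")
```

Querying on ingredients_tags rather than on the raw text avoids having to handle every language and spelling variant separately.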
Tools
Taxonomy
To allow analysis and translation, the Open Food Facts community has built an ingredients taxonomy. You can contribute to it.
Ingredients analyzing
Analyzing ingredients is not an easy task. A whole page is dedicated to Ingredients Extraction and Analysis.
On every product page, you can also see how the software identifies ingredients in the text, to verify its understanding.
As a result, we can compute metrics related to Ingredients Analysis Quality.