Global taxonomies
Revision as of 14:31, 7 June 2019
Introduction
Open Food Facts uses global taxonomies for fields such as categories, brands, labels and countries. This page explains how taxonomies work in Open Food Facts and how they can be updated and enhanced.
Features
- A global hierarchy / taxonomy for each type of data field (categories, brands, labels, countries etc.)
- Translations for every language of each field value
- Multiple synonyms for each field value in each language
- Stopwords for each language/field type
Generalities
Languages
Each language has a 2-letter prefix, e.g. "en" for English and "fr" for French.
Whenever possible, the canonical language for each field value should be English, e.g. en:soups is the canonical value for the Soups category. A value can also be defined in another language, which then becomes its canonical language: fr:soupes-a-l-oignon could be the canonical value for "Onion Soups" if we don't have an English translation yet. Before December 2013, the taxonomies were defined for each language, with most definitions in French, but English translations will be added progressively.
New values (e.g. categories that do not exist yet) should have an English canonical value.
Each field value can be translated to any language.
When a field value needs to be translated to a target language, if the translation does not exist yet, English is shown (or the canonical language if the English translation does not exist either).
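The fallback order described above (target language, then English, then the canonical language) can be sketched as follows. This is only an illustration: the function name `display_name` and the dictionary layout are assumptions made for the example, not the actual Open Food Facts implementation.

```python
def display_name(translations, canonical_id, target_lang):
    """Pick the name to show for a value in target_lang.

    translations: dict mapping a language code to a name (assumed layout).
    canonical_id: e.g. "fr:soupes-a-l-oignon"; its prefix is the canonical language.
    """
    canonical_lang = canonical_id.split(":", 1)[0]
    # Try the target language first, then English, then the canonical language.
    for lang in (target_lang, "en", canonical_lang):
        if lang in translations:
            return translations[lang]
    # No name stored at all: fall back to the raw identifier.
    return canonical_id


soups = {"en": "Soups", "fr": "Soupes"}
onion_soups = {"fr": "Soupes à l'oignon"}  # no English translation yet

print(display_name(soups, "en:soups", "fr"))                    # Soupes
print(display_name(soups, "en:soups", "de"))                    # Soups
print(display_name(onion_soups, "fr:soupes-a-l-oignon", "de"))  # Soupes à l'oignon
```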
Synonyms
In each language, each value can have a number of synonyms.
Simple synonyms (e.g. simple singular forms) are generated automatically when possible.
Synonyms are recursive: if en:yoghurt is a synonym of en:yogurt, then en:banana_yoghurt will automatically be added as a synonym of en:banana_yogurt.
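A minimal sketch of the recursive behaviour, assuming word-level synonyms and names written with spaces between words. The function `expand_synonyms` is hypothetical and only illustrates the idea; the real matching code may work differently.

```python
def expand_synonyms(name, word_synonyms):
    """Return every variant of name obtained by swapping words for their synonyms.

    word_synonyms maps a word to a list of alternative spellings (assumed layout).
    Only whole words are replaced, so multi-word synonyms are out of scope here.
    """
    variants = {name}
    for word, alternatives in word_synonyms.items():
        # Iterate over a snapshot, since new variants are added inside the loop.
        for variant in list(variants):
            words = variant.split(" ")
            if word in words:
                for alt in alternatives:
                    variants.add(" ".join(alt if w == word else w for w in words))
    return sorted(variants)


print(expand_synonyms("banana yogurt", {"yogurt": ["yoghurt"]}))
# ['banana yoghurt', 'banana yogurt']
```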
Stopwords
Stopwords are words that are ignored when matching, which further extends synonyms: e.g. if "à" and "la" are stopwords for French, then "Yaourts fraise" will automatically be mapped to "Yaourts à la fraise".
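The stopword matching described above could look roughly like this. The helper names and the in-memory index are assumptions for the sake of the example; the approach shown simply compares names after removing stopwords from both sides.

```python
def strip_stopwords(name, stopwords):
    """Lowercase a name and drop its stopwords, keeping the word order."""
    return " ".join(w for w in name.lower().split() if w not in stopwords)


FR_STOPWORDS = {"à", "la"}  # illustrative list, taken from the example above

# Index the known names by their stopword-free form.
known_names = ["Yaourts à la fraise"]
index = {strip_stopwords(n, FR_STOPWORDS): n for n in known_names}


def match(name):
    """Return the known name that equals `name` once stopwords are ignored."""
    return index.get(strip_stopwords(name, FR_STOPWORDS))


print(match("Yaourts fraise"))  # Yaourts à la fraise
```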
Taxonomy architecture
The taxonomy is not a strict hierarchy: values can have multiple parents, but cycles are not allowed (in other words, it is a directed acyclic graph).
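As an illustration of the "multiple parents, no cycles" rule, the sketch below checks a parent map with a depth-first search. The function `find_cycle` and the sample identifiers are hypothetical and are not taken from an actual taxonomy.

```python
def find_cycle(parents):
    """Return a list of values forming a cycle, or None if there is none.

    parents maps each value to the list of its direct parents (assumed layout).
    """
    in_progress, done = set(), set()

    def visit(node, path):
        in_progress.add(node)
        path.append(node)
        for parent in parents.get(node, []):
            if parent in in_progress:
                # We came back to a value that is still being explored: a cycle.
                return path[path.index(parent):] + [parent]
            if parent not in done:
                cycle = visit(parent, path)
                if cycle:
                    return cycle
        path.pop()
        in_progress.discard(node)
        done.add(node)
        return None

    for node in parents:
        if node not in done:
            cycle = visit(node, [])
            if cycle:
                return cycle
    return None


# A value with two parents is fine.
valid = {
    "en:onion-soups": ["en:soups", "en:onion-products"],
    "en:soups": ["en:meals"],
}
# Two values that are each other's parent are not.
invalid = {"en:a": ["en:b"], "en:b": ["en:a"]}

print(find_cycle(valid))    # None
print(find_cycle(invalid))  # ['en:a', 'en:b', 'en:a']
```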
Format
    # stopwords

    stopwords:en: some,stopwords
    stopwords:fr: word,that,are,removed,when,matching

    # synonyms that are not field values but that are contained in field values

    synonyms:en: global,international

    en: value, a synonym value, another synonym value
    fr: valeur, une valeur synonyme, une autre valeur synonyme

    <en: value
    en: a child value, a synonym for a child value
    fr: une valeur enfant, un synonyme d'une valeur enfant

    <en: value
    en: another child value

    <en: a child value
    <en: another child value
    en: a grand-child value

    # properties

    en: value
    fr: valeur
    description:en: a property of value
    description:fr: french version of the property
    country_code:en: a property that is the same for all languages -> use English suffix en:
    wikidata:en:Q89
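To make the format above more concrete, here is a small parsing sketch. It assumes that entries are separated by blank lines, that the first language line of an entry gives its canonical value, and that identifiers are left un-normalized. The function `parse_taxonomy` and the sample text are illustrative only and do not reproduce the parser used by Open Food Facts.

```python
import re

PARENT = re.compile(r"^<([a-z]{2}):\s*(.+)$")            # <en: parent value
PROPERTY = re.compile(r"^(\w+):([a-z]{2}):\s*(.+)$")     # description:en: text
NAMES = re.compile(r"^([a-z]{2}):\s*(.+)$")              # en: value, synonym


def split_list(value):
    return [item.strip() for item in value.split(",")]


def parse_taxonomy(text):
    """Parse taxonomy text into (stopwords, synonyms, entries)."""
    stopwords, synonyms, entries = {}, {}, {}
    for block in re.split(r"\n\s*\n", text.strip()):
        entry, parents = None, []
        for line in block.splitlines():
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comments
            if m := PARENT.match(line):
                parents.append(f"{m[1]}:{m[2].strip()}")
            elif m := PROPERTY.match(line):
                key, lang, value = m[1], m[2], m[3].strip()
                if key == "stopwords":
                    stopwords[lang] = split_list(value)
                elif key == "synonyms":
                    synonyms[lang] = split_list(value)
                elif entry is not None:
                    entry["properties"][f"{key}:{lang}"] = value
            elif m := NAMES.match(line):
                names = split_list(m[2])
                if entry is None:
                    # First language line of the block: canonical value.
                    entry = {"parents": parents, "names": {}, "properties": {}}
                    entries[f"{m[1]}:{names[0]}"] = entry
                entry["names"][m[1]] = names
    return stopwords, synonyms, entries


SAMPLE = """
# stopwords
stopwords:fr: à,la

en: yogurts, yoghurts
fr: yaourts, yoghourts

<en: yogurts
en: strawberry yogurts
fr: yaourts à la fraise
description:en: yogurts flavoured with strawberry
"""

stopwords, synonyms, entries = parse_taxonomy(SAMPLE)
print(stopwords)
print(entries["en:strawberry yogurts"]["parents"])
```

A real parser would also need to normalize identifiers and merge entries that are declared more than once, which this sketch does not attempt.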
Taxonomies
The definitions can be edited on this wiki; they are periodically synchronized with the Open Food Facts database and website.
Taxonomies
- Test taxonomy showing the basic taxonomy definition features
- Global ingredients taxonomy (on GitHub; a GitHub account and version control knowledge are needed)
- Global categories taxonomy
- Global brands and companies taxonomy
- Global labels taxonomy
- Global labels taxonomy logos
- Global languages taxonomy
- Global countries taxonomy
- Global origins taxonomy
- Global additives taxonomy
- Global additives classes taxonomy
- Global vitamins taxonomy
- Global minerals taxonomy
- Global amino acids taxonomy
- Global nucleotides taxonomy
- Global other nutritional substances taxonomy
- Global allergens taxonomy
- Global traces taxonomy
- Global states taxonomy
- Global NOVA groups taxonomy