Ingredients ontology
Revision as of 09:48, 12 August 2018
Introduction
Why?
Why do we need an ingredients ontology? The ontology describes how ingredients are derived from each other and how ingredients can be combined into new ingredients. An ontology might be useful to:
- Normalise ingredients - Producers take a lot of freedom in describing the ingredients they use. An ontology helps to standardise the ingredients.
- Hidden ingredients - an ingredient might contain hidden ingredients, which the ontology can reveal. For example, butter contains butterfat.
- Combined ingredients - an ingredient might appear as a single ingredient. In reality, however, it can consist of several ingredients.
- Processed ingredients - often an ingredient is derived from another ingredient through some process. We can make explicit what these processes are. For example, clarified butter is created from butter by separating the milk solids and water from the butterfat.
- Ingredient incompleteness - often an ingredient is incompletely defined in an ingredient list. For instance, if an ingredient list specifies milk, it should state which mammal the milk comes from, for instance cow's milk.
Theory
What theory can a food ingredients taxonomy be based on? Is there already a food ontology somewhere?
Nodes
Each node in the ontology is an ingredient, as is found on ingredient lists of food products. Producers take a lot of freedom in describing the ingredients they use. This implies that an approach is needed to standardise the ingredients that are found in the ingredients list.
- Main ingredient name - an ingredient might appear under different names, the synonyms. Of these multiple names one will be chosen as main ingredient name. The main ingredient name will be defined in a single language.
- Synonyms - any synonym will be a separate entry
- Translations - an ingredient can be translated into multiple languages. However, one must be careful that an ingredient really is the same in another language. Legislation or actual production processes can differ. It might be possible to alert the user to such cases. Not all ingredients will be translatable into all languages.
- Compound ingredients - sometimes an ingredient list will contain a compound ingredient, i.e. an ingredient (product?) that consists of other ingredients.
Non-ingredient list nodes
It is tempting to add nodes, which do not appear on any ingredients list. Such nodes might help in organizing the ontology. For the moment however we refrain from adding non-ingredient list nodes.
Relationships
The relationships should define how the ingredients are related to each other. An ingredient can be created from another ingredient by applying some transformation process. This transformation process could remove one of the sub-ingredients, or it could change one sub-ingredient into another ingredient.
Formal relationships
- contains - describes if an ingredient contains another ingredient. The relationship could specify the fraction of the ingredient. For example butter contains 80% butterfat, pastry butter contains 99.8% butterfat;
- removes - this transformation process removes a sub-ingredient. For example pastry butter is created from butter by removing 20% water;
- is melted - process whereby the ingredient is made fluid (melted butter)
- isa - describes a detailed specification of an ingredient
- is produced in - describes the location where the ingredient is created. This can be a geographic location (beurre d'isigny aop) or in a type of factory (beurre laitier)
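The relationships above could be held in memory as typed edges between ingredient names. The following is only a minimal sketch under stated assumptions: the `Relation` class, its field names and the `contained_in` helper are hypothetical and not part of OFF; the butterfat fractions are the ones quoted in the list above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Relation:
    """One edge of the ontology (hypothetical structure, not an OFF format)."""
    subject: str                      # ingredient the relation starts from
    kind: str                         # e.g. "contains", "removes", "isa"
    target: str                       # ingredient (or place) the relation points to
    fraction: Optional[float] = None  # optional share, 0.0 to 1.0


# Fractions taken from the examples in the list above.
RELATIONS: List[Relation] = [
    Relation("en:butter", "contains", "en:butterfat", 0.80),
    Relation("en:pastry butter", "contains", "en:butterfat", 0.998),
]


def contained_in(ingredient: str, relations: List[Relation]) -> Dict[str, Optional[float]]:
    """Return the sub-ingredients of `ingredient` with their fraction, if known."""
    return {
        r.target: r.fraction
        for r in relations
        if r.subject == ingredient and r.kind == "contains"
    }


print(contained_in("en:butter", RELATIONS))
```

A lookup like this is one way the hidden ingredients mentioned in the introduction could be revealed.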
Example
Maybe I can make a drawing of a part of the ontology.
Taxonomy
The ontology should be usable as the translations taxonomy. This taxonomy lists all ingredients, their synonyms and their translations. This taxonomy is already in use.
Encoding relations
The relation with other ingredients can be encoded in this file:
Compounds
- relation:isa:INGREDIENT
Compounds
- relation:contains:INGREDIENT
Transformations
A transformation from one ingredient to another can be encoded by:
- relation:transformedFrom:INGREDIENT:byProcess:PROCESS_NAME
For example to make clarified butter from butter one could write:
- relation:transformedFrom:en:butter:byProcess:en:rendering
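As an illustration of how such relation lines could be read back, here is a small parsing sketch. It assumes the three line shapes proposed above (`relation:isa:`, `relation:contains:` and `relation:transformedFrom:...:byProcess:...`); the function name `parse_relation` and the returned dictionary keys are hypothetical, and nothing here reflects an existing OFF parser.

```python
import re
from typing import Dict, Optional

# Ingredient names carry a language prefix such as "en:", so they contain
# colons themselves; ":byProcess:" is used as the separator to split on.
_TRANSFORM = re.compile(
    r"^relation:transformedFrom:(?P<source>.+):byProcess:(?P<process>.+)$"
)
_SIMPLE = re.compile(r"^relation:(?P<kind>isa|contains):(?P<target>.+)$")


def parse_relation(line: str) -> Optional[Dict[str, str]]:
    """Parse one relation line; return None if the line is not a relation."""
    line = line.strip()
    match = _TRANSFORM.match(line)
    if match:
        return {
            "kind": "transformedFrom",
            "source": match.group("source"),
            "process": match.group("process"),
        }
    match = _SIMPLE.match(line)
    if match:
        return {"kind": match.group("kind"), "target": match.group("target")}
    return None


print(parse_relation("relation:transformedFrom:en:butter:byProcess:en:rendering"))
```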
Example
# CLARIFIED BUTTER - en:milk fat rendered from butter to separate the en:milk solids and water from the en:butterfat
<en:butter
en:clarified butter
bxr:Шара тоһон
ca:mantega clarificada
cs:přepuštěné máslo
de:Butterschmalz
Explanation
- The # marks a comment line and can be used to add a definition of the ingredient. In this case the definition is taken from Wikipedia.
- The <en:butter line describes the parent ingredient of this ingredient. The parent ingredient forms the basis of the current ingredient. This line is optional.
- The en:clarified butter line is the main name of the ingredient. Any synonyms appear after the main name, separated by commas. The prefix en: defines the language of the main ingredient.
- The next lines provide translations of the main ingredient in other languages, one language per line. Each line starts with a language prefix. Thus de: means German.
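The line types explained above can be told apart by their first character. The sketch below reads one entry into a dictionary; it is an assumption-laden illustration, not the parser OFF uses. The names `parse_entry` and `EXAMPLE` and the dictionary layout are hypothetical, and the comment line in the sample is shortened.

```python
from typing import Dict, List


def parse_entry(block: str) -> Dict[str, object]:
    """Split one taxonomy entry into comments, parents and names per language."""
    comments: List[str] = []
    parents: List[str] = []
    names: Dict[str, List[str]] = {}
    for raw in block.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):      # comment / definition line
            comments.append(line[1:].strip())
        elif line.startswith("<"):    # parent ingredient line
            parents.append(line[1:].strip())
        else:                         # LANGUAGE_CODE:NAME, SYNONYM, ...
            lang, _, rest = line.partition(":")
            names[lang] = [n.strip() for n in rest.split(",") if n.strip()]
    return {"comments": comments, "parents": parents, "names": names}


EXAMPLE = """\
# CLARIFIED BUTTER
<en:butter
en:clarified butter
ca:mantega clarificada
de:Butterschmalz
"""

entry = parse_entry(EXAMPLE)
# The first name line is the main name; dicts keep insertion order.
main_language = next(iter(entry["names"]))
print(main_language, entry["names"][main_language][0])
```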
Categories Taxonomy
OFF maintains a categories taxonomy to categorise products. This categories taxonomy is related to the ingredients ontology, but subtly different. First, a product is something that is on sale and has an identifier (barcode). A product will never be part of an ingredients list: you will never see a barcode in the ingredients.
The relationship between product categories and ingredients can be described as: A product with a product name and barcode from a brand belongs to a category and has one or more ingredients.
For example the product Beurre Gastronomique Doux of the brand Milbona with barcode 20139315 has one ingredient: Beurre pasteurisé and belongs to the Sweet cream butters category.
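To make the distinction concrete, the relationship could be modelled as a record in which the category and the ingredients are separate fields. This is a hypothetical sketch: the `Product` class and its field names are illustrative only and do not describe the OFF database schema. The values are those of the example above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Product:
    """A product on sale (hypothetical model, not the OFF schema)."""
    barcode: str
    name: str
    brand: str
    category: str  # node in the categories taxonomy
    ingredients: List[str] = field(default_factory=list)  # nodes in the ingredients ontology


example_product = Product(
    barcode="20139315",
    name="Beurre Gastronomique Doux",
    brand="Milbona",
    category="Sweet cream butters",
    ingredients=["Beurre pasteurisé"],
)
print(example_product.category, example_product.ingredients)
```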
Maintenance
This section describes how the taxonomy file should be maintained, i.e. adding ingredients, editing ingredients, etc.
Automatic add
Describes how new ingredients can be added automatically from the ingredients available in the OFF database. @stephane
Add single ingredient
An ingredient is a node in the taxonomy file, which describes a single ingredient as is found in an ingredient list.
Add ingredient name
You can add an ingredient anywhere in the file at first. Keep a blank line between the previous and next ingredient. Add the ingredient in the format: LANGUAGE_CODE:MAIN_INGREDIENT_NAME, SYNONYM, SYNONYM.
Thus, start by defining the ingredient name you want to add. If multiple ingredient names are possible, decide which one will be the MAIN_INGREDIENT_NAME. The other names can be added as SYNONYM. The ingredient names are separated by commas.
Determine the language code to be used. Find your language in this lang list and add the corresponding ISO 639-1 code as LANGUAGE_CODE to your MAIN_INGREDIENT_NAME, separated by a colon (:).
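The format described above can be assembled mechanically. The helper below is a minimal sketch: the function name `format_name_line` is hypothetical, the two-letter check is an assumption based on the ISO 639-1 requirement stated above, and "example synonym" is a placeholder rather than a real synonym.

```python
from typing import Iterable


def format_name_line(language_code: str, main_name: str, synonyms: Iterable[str] = ()) -> str:
    """Build 'LANGUAGE_CODE:MAIN_INGREDIENT_NAME, SYNONYM, SYNONYM'."""
    # Assumption: only two-letter lowercase ISO 639-1 codes are accepted.
    if not (len(language_code) == 2 and language_code.isascii()
            and language_code.isalpha() and language_code.islower()):
        raise ValueError(f"not a two-letter language code: {language_code!r}")
    names = [main_name.strip()] + [s.strip() for s in synonyms]
    if any(not n or "," in n for n in names):
        raise ValueError("names must be non-empty and must not contain commas")
    return f"{language_code}:{', '.join(names)}"


print(format_name_line("en", "clarified butter", ["example synonym"]))
```

Rejecting commas inside a name keeps the comma unambiguous as the synonym separator.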
Add translations
You can add translations for an ingredient. Each translation should appear on a new line.
- Wikipedia translations
If the ingredient exists as an article in Wikipedia, you can add the translations supplied by Wikipedia. You can use the language codes used by Wikipedia (as seen in the article URL).
Warning: you cannot add the three (or more) letter codes used by Wikipedia. OFF does not support those. However, you might add them as comment lines, so we have them for the future.
- Other translations
It is possible to add other languages through an online dictionary. Linguee might help you here if you speak multiple languages. Preferably, though, such translations should be checked by native speakers.
Untranslatable ingredients
Sometimes an item cannot be translated, as it simply does not exist in the other language. If possible, we could add a description instead.
Sort translations
The translations should be sorted in alphabetical order of the language code.
The exception is the first line: it holds the default language, which is used when no translation is present.
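The sorting rule can be sketched as follows, assuming the name lines of one entry are given as a list with the default-language line first. The function name `sort_translations` is hypothetical; the sample lines come from the clarified butter example.

```python
from typing import List


def sort_translations(lines: List[str]) -> List[str]:
    """Keep the first (default language) line, sort the rest by language code."""
    if not lines:
        return []
    first, rest = lines[0], lines[1:]
    # The language code is everything before the first colon.
    return [first] + sorted(rest, key=lambda line: line.partition(":")[0])


unsorted_lines = [
    "en:clarified butter",
    "de:Butterschmalz",
    "cs:přepuštěné máslo",
    "ca:mantega clarificada",
]
for line in sort_translations(unsorted_lines):
    print(line)
```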
Issues
Language codes
OFF seems to use the short ISO 639-1 language codes. As a consequence, not all languages found on Wikipedia can be implemented in OFF. For instance, the language Furlan does not have an ISO 639-1 code, but does have an ISO 639-2 code (fur).
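One possible way to keep translations with longer codes for the future is to turn them into comment lines, as suggested in the maintenance section. The sketch below assumes that a line starting with # is ignored and that any code that is not exactly two letters is unsupported; the function name `comment_out_unsupported` is hypothetical. The sample lines are taken from the clarified butter example.

```python
from typing import List


def comment_out_unsupported(lines: List[str]) -> List[str]:
    """Prefix name lines whose language code is not two letters with '# '."""
    result = []
    for line in lines:
        stripped = line.strip()
        # Leave blank lines, comments and parent lines untouched.
        if not stripped or stripped.startswith(("#", "<")):
            result.append(line)
            continue
        code = stripped.partition(":")[0]
        result.append(line if len(code) == 2 else "# " + line)
    return result


sample = ["<en:butter", "en:clarified butter", "bxr:Шара тоһон", "de:Butterschmalz"]
for line in comment_out_unsupported(sample):
    print(line)
```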