Ingredients/Parsing
Revision as of 20:07, 24 September 2016
This page collects what we know about ingredient parsing.
rietsuiker*, plantaardige olie* (zonnebloem, palm), 13% _hazelnoot_*, 7.5% magere cacaopoeder*, magere _melk_poeder*, emulgator (_soja_lecithine), vanille*. *Van biologische oorsprong.
Asterisks
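Here * is an annotation marker (sometimes there are several, such as **) that indicates a property of one or more of the listed ingredients. The property is spelled out in a footnote after the list; in this case "Van biologische oorsprong", i.e. of organic origin.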
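A minimal sketch of how the asterisk markers could be resolved, assuming the footnotes always follow the list and each starts with its marker right after a full stop. The function names `split_annotations` and `annotate` are hypothetical, chosen for illustration only.

```python
import re

def split_annotations(text):
    """Split 'list. *footnote.' into the list body and a marker -> footnote map."""
    # Assumption: the footnotes start at the first full stop that is followed by an asterisk.
    parts = re.split(r"\.\s+(?=\*)", text.strip(), maxsplit=1)
    body = parts[0]
    legend = {}
    if len(parts) == 2:
        # Each footnote is a run of asterisks followed by its text.
        for marker, note in re.findall(r"(\*+)\s*([^*]+)", parts[1]):
            legend[marker] = note.strip().rstrip(".")
    return body, legend

def annotate(item, legend):
    """Return (name without markers, footnote text or None) for one ingredient."""
    m = re.search(r"\*+", item)
    name = re.sub(r"\*+", "", item).strip()
    return name, (legend.get(m.group()) if m else None)

text = "rietsuiker*, emulgator (_soja_lecithine), vanille*. *Van biologische oorsprong."
body, legend = split_annotations(text)
print(legend)
print(annotate("rietsuiker*", legend))
```

Keeping the legend as a plain dictionary means a second marker such as ** only needs its own footnote entry; nothing else changes.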
Parentheses
Parentheses can indicate the sub-components of an ingredient, as in "plantaardige olie (zonnebloem, palm)".
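One way to handle sub-components is to split the list only on commas that sit outside parentheses, then recurse into each parenthesised group. The sketch below assumes a comma is always a separator (so a decimal comma such as "7,5%" would be split wrongly) and that each item has at most one top-level group; `split_top_level` and `parse_item` are hypothetical names.

```python
def split_top_level(text, sep=","):
    """Split on `sep`, ignoring separators nested inside parentheses."""
    parts, current, depth = [], [], 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(depth - 1, 0)  # tolerate a stray closing parenthesis
        if ch == sep and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts

def parse_item(item):
    """Turn 'name (a, b)' into {'name': 'name', 'sub': [...]} recursively."""
    open_idx, close_idx = item.find("("), item.rfind(")")
    if open_idx == -1 or close_idx < open_idx:
        return {"name": item.strip(), "sub": []}
    name = (item[:open_idx] + item[close_idx + 1:]).strip()
    inner = item[open_idx + 1:close_idx]
    return {"name": name, "sub": [parse_item(s) for s in split_top_level(inner)]}

items = split_top_level("plantaardige olie (zonnebloem, palm), 13% hazelnoot")
print([parse_item(i) for i in items])
```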
Percentage
A percentage indicates the quantity of the ingredient it accompanies, as in "13% hazelnoot".
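Extracting the percentage can be sketched with a single regular expression. This assumes the value is a plain number followed by "%", with either a dot or a comma as decimal separator; `extract_percentage` is a hypothetical helper name.

```python
import re

# A number such as 13, 7.5 or 7,5 followed by an optional space and a percent sign.
PERCENT = re.compile(r"(\d+(?:[.,]\d+)?)\s*%")

def extract_percentage(item):
    """Return (item text without the percentage, percentage as float or None)."""
    m = PERCENT.search(item)
    if not m:
        return item.strip(), None
    value = float(m.group(1).replace(",", "."))
    name = (item[:m.start()] + item[m.end():]).strip()
    return name, value

for raw in ["13% _hazelnoot_*", "7.5% magere cacaopoeder*", "vanille*"]:
    print(extract_percentage(raw))
```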
Order
Ingredients are required to be listed in descending order of quantity, from largest to smallest.
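The ordering rule can be combined with the stated percentages to bound the quantities that are not stated. The sketch below assumes the rule holds without exceptions and takes the percentages in label order, with `None` for ingredients that have no stated value; `is_descending` and `bounds` are hypothetical names.

```python
def is_descending(percentages):
    """Check that the stated percentages do not increase along the list."""
    known = [p for p in percentages if p is not None]
    return all(a >= b for a, b in zip(known, known[1:]))

def bounds(percentages):
    """Return a (lower, upper) bound per ingredient implied by the ordering."""
    n = len(percentages)
    upper, lower = [100.0] * n, [0.0] * n
    cap = 100.0
    for i, p in enumerate(percentages):  # nearest stated value at or before i
        if p is not None:
            cap = p
        upper[i] = cap
    floor = 0.0
    for i in range(n - 1, -1, -1):  # nearest stated value at or after i
        if percentages[i] is not None:
            floor = percentages[i]
        lower[i] = floor
    return list(zip(lower, upper))

# Sample list: only hazelnoot (13%) and cacaopoeder (7.5%) carry a percentage.
stated = [None, None, 13.0, 7.5, None, None, None]
print(is_descending(stated))
print(bounds(stated))
```

For the sample list this places the first two ingredients at 13% or more each and the last three at 7.5% or less each.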
List of ingredients
- http://world.openfoodfacts.org/ingredients (very large page; may crash your browser)
- http://world.openfoodfacts.org/files/ingredients.20151117.txt (lighter text version of the above)
- https://files.slack.com/files-pri/T02KVRT1Q-F192GGZE3/download/top500ingredients.xls (normalised Excel version for top 500 of the above)
- Global ingredients taxonomy (Taxonomisation start)
- Wikidata (we could generate a list from it)