Brands: Difference between revisions

From Open Food Facts wiki
(+ dove)
No edit summary
 
(24 intermediate revisions by 4 users not shown)
Line 1: Line 1:
[[Category:Contribution Guidelines]]
[[Category:Contribution Guidelines]]
This is the brands of the product. The '''main brand''', generally clearly displayed on the front pack, '''should be entered first'''. A product can have other brands:
* These are the brands of the product.  
* when a product is a brand sold by a big company: <code>Actimel</code> is sold by <code>Danone</code>, see https://world.openfoodfacts.org/product/4009700036810/actimel-granatapfel
* In the database, this field is called <code>brands</code>.
* when a product is sold with its brand translated in two languages:
* See [https://github.com/openfoodfacts/openfoodfacts-server/issues?q=is%3Aissue+is%3Aopen+ingredients+label%3Abrands issues related to <code>brands</code>].
** <code>Nature Valley</code> is sometimes written <code>Val Nature</code>; see  https://world.openfoodfacts.org/product/0065633280267/barre-granola-erable-et-cassonade-nature-valley
* See also: https://github.com/openfoodfacts/openfoodfacts-server/pull/4385
** <code>[https://en.wikipedia.org/wiki/No_Name_%28brand%29 No Name]</code> is also written <code>Sans nom</code>; represent [https://world.openfoodfacts.org/brand/no-name 100+ products for No Name] and [https://world-fr.openfoodfacts.org/marque/sans-nom 170+ products for Sans Nom].


When a product has more than one brand, the first brand in the field is taken as the main brand.
=== Help annotate brands using Hunger Games ===
* https://hunger.openfoodfacts.org/brandinator
=== Suggested solution for the taxonomy ===


There's no taxonomy for brands for the moment, so just do your best and don't waste too much time to enter brands.
==== Short-term solution ====
Make brands language-less


=== Data issues ===
If a brand has a translated name in some countries, then we consider it to be a different brand (in many cases the products sold under those translated names will be different).
Some brands are difficult to read:
* is it <code>Coop</code>, <code>coop</code>, <code>COOP</code>, <code>CO OP</code>?


Some brands can be related to different companies in different countries:
For products that display the brand in only one language, we just record that brand, and we display it as-is.
* [https://world.openfoodfacts.org/brand/san-miguel San Miguel]
* [https://world.openfoodfacts.org/brand/star Star]
* [https://world.openfoodfacts.org/brand/walkers Walkers] seems to represent 3 different brands ([https://world.openfoodfacts.org/product/5000328123509/choux-de-bruxelle-walkers 1], [https://world.openfoodfacts.org/product/0039047003569/luxury-rich-fruit-cake-walkers 2] and [https://world.openfoodfacts.org/product/5011555031222/chocolate-gingers-walkers 3])


Some brands can be found in the same country but for different types of products:
* [https://world.openfoodfacts.org/brand/dove Dove], which is related to both food and beauty products


Some brands are related to a common name or an ingredient, which can be confusing:
==== Mid-term solution (to be defined/work in progress) ====
* [https://world.openfoodfacts.org/brand/racines Racines] (means "roots" in French)
The taxonomy for brands should have following features:
* [https://world.openfoodfacts.org/brand/la-truffe La Truffe] (means "the truffle" in French)
* each brand is defined by a block of text, and blocks are separated by blank lines;
* [https://world.openfoodfacts.org/brand/pure-protein Pure Protein]
* use a four-letter code, two letters on each side of an underscore - '''<[https://en.wikipedia.org/wiki/List_of_ISO_639_language_codes iso_639_language_code]>_<[https://en.wikipedia.org/wiki/List_of_ISO_3166_country_codes iso_3166_country_code]>''' - for the '''key'''. Examples: fr_be, de_at.
* [https://world.openfoodfacts.org/brand/best-choice Best Choice] is a brand, not a tagline
* a single line defines the brand in a specific language and country or an attribute. Each line starts with the '''key''' name followed by parameters and separated by a '''colon'''.
* [https://world.openfoodfacts.org/brand/great-value Great Value] also
* a block can contain (mind the order):
* [https://world.openfoodfacts.org/brand/the-belgian The Belgian], often entered <code>Belgian</code>; it produces many false positives from our AI.
** (optional) a reference to a parent defined by "<" + the key (as described above) and the first parameter of an existing brand. For example "<ab_cd:brand1" refers to the parent key "ab_cd:brand1, synonym_of_brand1". '''Remark:''' any attributes shared between the parent and the child only have to be specified in the parent, i.e., they are inherited by the child. '''Remark''': if a reference to a parent is provided it should be the first line of the block.
** (required) a key  (as described above)''.'' For example: "''mn_op:''brand 1". It  should be unique, so that we can distinguish between brands with the same name.
** (optional) a default key (xx_xx:) to be used for any language that is not specifically listed. It is always xx_xx. '''Remark''': if it is provided it should be the last line after all other languages.
** (optional) attributes - each brand can have one or more attributes. '''Remark:''' if provided attribute(s) should be the last line after all other languages and the default key (xx_xx):
*** wikipedia:en: - the full link to a page that explains the brand. Note that brands are not well covered on Wikipedia;
*** wikidata:en: - the identifier of the Wikidata entry for the brand. Note that brands are not well covered on Wikidata;
 
===== Example =====
<blockquote>de_de:lidl
 
de_at:lidl
 
fr_fr:lidl
 
fr_be:lidl


Brands containing an apostrophe are often missing this one:
hr_hr:lidl
* [https://world.openfoodfacts.org/brand/lay-s Lay's] often written <code>Lays</code>
* [https://world.openfoodfacts.org/brand/kellogg-s Kellogg's] often written <code>Kelloggs</code> or something else
* [https://world.openfoodfacts.org/brand/sar-ocean Sar'Ocean].


Brands are changing sometimes. How to deal with that?
xx_xx:lidl


=== Some particular cases for a brand ===
wikipedia:en:<nowiki>https://en.wikipedia.org/wiki/Lidl</nowiki>
* A brand can contain only numbers, such as <code>1664</code>, <code>1883</code> or <code>365</code>.
* A brand can contain quotes, such as <code>The Muffin "Mam" Inc</code>
* A brand can contain &, such as <code>M&M's</code>
* A brand can contain commas, such as <code>Williams, West & Witt's Products</code>; https://world.openfoodfacts.org/brand/williams-west-witt-s-prods
* A brand can have a sub-brand containing its own name: <code>Sainsbury's</code> use a brand called [https://en.wikipedia.org/wiki/Sainsbury%27s#Product_ranges <code>By Sainsbury's</code>].


=== Some particular cases for a product ===
wikidata:Q151954
* A product can have more than 2 brands; eg:
** Coop, in Switzerland, can add up to three brands on a product: <code>Betty Bossi</code>, <code>Karma</code> and <code>Coop</code> in [https://world.openfoodfacts.org/product/7624841023290/smokey-tofu-marroni-betty-bossi this product]; they justify it: "We offer a wide range of own-label brands and brand worlds." ([https://www.coop.ch/en/inspiration-gifts/labels/c/m_0788 source])
** In [https://world.openfoodfacts.org/product/5900497611503/peach-ice-tea-lipton this product], <code>Lipton</code> belongs to <code>Unilever</code> but this product is distributed by <code>Pepsico</code>


=== Implementation in Open Food Facts ===
In the database, this field is called <code>brands</code>.


See [https://github.com/openfoodfacts/openfoodfacts-server/issues?q=is%3Aissue+is%3Aopen+ingredients+label%3Abrands issues related to <code>brands</code>].
<de_de:lidl


To prevent the OFF AI from detecting some false brands, see [https://github.com/openfoodfacts/robotoff/blob/master/data/ocr/brand_taxonomy_blacklist.txt brand_taxonomy_blacklist.txt].
de_de:snack day


=== Help to collect brands ===
xx_xx:lidl</blockquote>
The AI of Open Food Facts, called [[Artificial Intelligence|Robotoff]], tries to identify brands. The annotations made by Robotoff are presented to users as simple questions to answer. They are also used by [https://hunger.openfoodfacts.org/questions?type=brand Hunger Game]. Everyone can use Hunger Game, but be mindful of the issues mentioned on this page.
* on 2020-10-29, there were 43200 annotations and 22748 resting
* on 2020-11-17, there were 46911 annotations and 22720 resting


=== Observations summary ===
Summarising the observations noted above, we see the following kinds of brands:
*1 universal brand, exact same name used in all countries and languages. e.g. "Nutella"
*2 brand that is translated in different languages or scripts "The Laughing Cow", "La vache qui rit" ([https://nl.openfoodfacts.org/product/6971674070030/精制辣汤-唯滋亲 example])
*3 brands that have the same name, but used in different languages
*4 brands that have the same name, but used in different countries
*5 brands that have the same name, and used in the same country. (e.g. "Ferrero" in Italy: there's also a pasta brand).
*6 brands in non-latin scripts, which can not be latinised
*7 parent brands are sometimes shown on packaging


== Use cases ==
In this example, we put de_de as the first language because Lidl is a German company. Nevertheless, it should work the same starting with a different language, provided that the children refer to the parent accordingly.
The brands taxonomy has multiple applications within OFF. These are:
*1 Display the brands of a product, in the language requested by the user;
*2 Have a way to list all products of a brand;
*3 Let users enter brands for a product, as they appear on the package (as free text);
*4 Let the user select the correct brand from a list of existing brands. If the same brand text occurs multiple times, the user must be able to select the applicable one;
*5 Suggests a brand to the user based on the manufacturer part of the barcode and other information;
*6 Infer category and labels from brand - some brands are only used for specific products. This means that the product category and possible labels can be implied (suggested);
*7 Infer brand from manufacturer code within the barcode;
*8 Barcode/Brand quality check - if the manufacturer part of the barcode does not match the specified brand, there is an error in either of them;


== Design considerations ==
==== Long-long-term solution (to be defined/work in progress) ====
The observations and use cases lead to several design considerations:
Same as before, adding:
# Unique brand key - as the same brand (string) can exist in multiple geographic areas or within the same geographic area, there must be a way to uniquely distinguish between the various brands. Otherwise a user can not enter the correct brand (UC1), nor can we list all brands (UC2);
# User-selectable brand - there must be a brand name in the same language/script as the package. If that name occurs multiple times in the taxonomy, it must be qualified by product category, country sold, etc. in order to make it unique. For instance the label '''Taste''' occurs in France and Argentina, but is used for different categories. So the user should have the choice between '''Taste (category 1 - France)''' and '''Taste (category 2 - Argentina)'''. Maybe this can be mixed with the key. (UC4)
# Language/script specific brands - a way to code a single brand in multiple scripts and/or languages, for example in Chinese, Arabic and English (UC1)
# Language independent brand - a way to encode a brand that is valid for multiple languages (UC1);


== Taxonomy encoding ==
* (optional) attributes
Taking the observations, use cases and design considerations into account, it is possible to specify how this can be encoded in a taxonomy. The same approach as for all the other taxonomies will be used for this.
** barcodeprefix:en: - the first 8(?) numbers of the barcode that belong to the brand. With this brands can be automatically assigned to a barcode. Also the barcodes of existing products can be checked;
** brand_owner_opencorporates:xx: - an identification of the probable brand owner/distributor on [https://opencorporates.com/ opencorporates]. It is not always clear what should be written down here.
** idea: local customer service address (one for each country where the product is distributed, and a generic one)
** idea: eu trademark - maybe only useful for name trademarks? (not logo's). But not relevant to the consumer.
** idea: stores - the stores where the brand is sold. Maybe better to create a store taxonomy and link from there to brands?


An overview of the encoding used in the brands taxonomy:
===== Example =====
* blocks/blank lines - each brand is defined by a block of text, and blocks are separated by blank lines;
<blockquote>zz:laespanola<br>
* definition - a single line defines the brand in a specific language, an attribute, etc. Each line starts with a name followed by parameters and separated by a colon
* parameters:
** parent (<zz:) - a reference to another (parent) brand. Thus any attributes shared between parent and child only have to be specified in the parent;
** key (''zz:'') - a unique key is required, so that we can distinguish between brands with the same name;
** default (xx:) - a default value to be used for any language that is not specifically listed;
** language (e.g. ''ru:'') - the brand-name in the language ''ru.'' A brand might be defined in multiple languages as needed, with a single line for each language. For instance the ru: might be used to specify a brand in Cyrillic. Sometimes also country specific brands are required.
** attributes - each brand can have one or more attributes
*** barcodeprefix:en: - the first 8(?) numbers of the barcode that belong to the brand. With this brands can be automatically assigned to a barcode. Also the barcodes of existing products can be checked;
*** wikipedia:en: - the full link to a page that explains the brand. Note that brands are not well covered on Wikipedia;
*** wikidata:en: - the identifier of the Wikidata entry for the brand. Note that brands are not well covered on Wikidata;
*** brand_owner_opencorporates:xx: - an identification of the probable brand owner/distributor on [https://opencorporates.com/ opencorporates]. It is not always clear what should be written down here.
*** idea: local customer service address (one for each country where the product is distributed, and a generic one)
*** idea: eu trademark - maybe only useful for name trademarks? (not logo's). But not relevant to the consumer.
*** idea: stores - the stores where the brand is sold. Maybe better to create a store taxonomy and link from there to brands?
=== Example ===
zz:laespanola<br>
xx:La Española<br>
xx:La Española<br>
barcodeprefix:en:8410226<br>
barcodeprefix:en:8410226<br>
Line 125: Line 85:
brand_owner_opencorporates:BE:0838355558<br>
brand_owner_opencorporates:BE:0838355558<br>
wikidata:en:Q590921<br>
wikidata:en:Q590921<br>
<nowiki>#</nowiki>11 products @2022-04-23<br>
<nowiki>#</nowiki>11 products @2022-04-23</blockquote>
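The block format in the example above can be read with a few lines of code. The following is only a minimal sketch, not the parser used by Open Food Facts: the function name <code>parse_brand_block</code>, the returned dictionary layout and the list of attribute names are assumptions made for illustration, based on the line types described on this page (parent, key, language lines, attributes, comments).

```python
# Minimal sketch of a reader for one brand block, as described on this page.
# Assumptions (not taken from the Open Food Facts code base): the function
# name, the dict layout, and the set of attribute names below.

ATTRIBUTES = {
    "barcodeprefix",
    "wikipedia",
    "wikidata",
    "brand_owner_opencorporates",
}

SAMPLE = """\
zz:laespanola
xx:La Española
barcodeprefix:en:8410226
brand_owner_opencorporates:BE:0838355558
wikidata:en:Q590921
#11 products @2022-04-23
"""


def parse_brand_block(text):
    """Parse a single block (no blank lines inside) into a dict."""
    brand = {"parents": [], "key": None, "names": {}, "attributes": {}, "comments": []}
    for raw in text.strip().splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Comment line, e.g. a product count.
            brand["comments"].append(line[1:].strip())
        elif line.startswith("<"):
            # Parent reference such as "<zz:otherbrand": keep the part after the colon.
            brand["parents"].append(line[1:].split(":", 1)[1].strip())
        else:
            name, rest = line.split(":", 1)
            if name in ATTRIBUTES:
                # Attribute lines look like "name:qualifier:value".
                _qualifier, value = rest.split(":", 1)
                brand["attributes"][name] = value.strip()
            elif name == "zz":
                # The unique key of the brand.
                brand["key"] = rest.strip()
            else:
                # Language (or default "xx") line with comma-separated synonyms.
                brand["names"][name] = [s.strip() for s in rest.split(",")]
    return brand


brand = parse_brand_block(SAMPLE)
print(brand["key"], brand["names"], brand["attributes"])
```

Splitting a whole taxonomy file on blank lines and calling this function on each block would give one dictionary per brand. Inheritance of attributes from a parent is left out of the sketch.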
 
=== Q&A ===
 
* Product has more than a single brand.
** Product is part of a bigger brand or group.
*** For [https://world.openfoodfacts.org/product/4009700036810/actimel-granatapfel example], <code>Actimel</code> is sold by <code>Danone</code>.
*** For [https://en.wikipedia.org/wiki/Sainsbury%27s#Product_ranges example], <code>By Sainsbury's</code> is sold by <code>Sainsbury's</code>.
***<u>'''Solution'''</u>: make <code>Actimel</code>  a child of <code>Danone</code> in this example.
** Synonyms. Product is sold with 2 brands, each for different language/countries.
*** For [https://world.openfoodfacts.org/cgi/product.pl?type=edit&code=0065633280267 example], both <code>Nature Valley</code> and <code>Val Nature</code> can be written on the same product, but not always.
***<u>'''Solution'''</u>: make <code>Val Nature</code> a child of <code>Nature Valley</code> in this example because <code>Nature Valley</code> is the original U.S. brand ([https://en.wikipedia.org/wiki/Nature_Valley reference]).
*** For [https://world.openfoodfacts.org/brand/sans-nom example], <code>No Name</code> and <code>Sans nom</code> can be written on the same product, but not always.
*** <u>'''Solution'''</u>: make <code>No Name</code>, <code>No Name Sans nom</code> and <code>Sans nom</code> three separate children of <code>Generic brand</code> in this example because they are all lines of <code>Generic brand</code>, which is a Canadian brand ([https://en.wikipedia.org/wiki/No_Name_(brand) reference]). Alternatively, make the three of them synonyms of each other and a child of <code>Generic brand</code>.
*** For [https://world.openfoodfacts.org/product/7624841023290/smokey-tofu-marroni-betty-bossi example], <code>Coop</code>, in Switzerland, can add up to three brands on a product: <code>Betty Bossi</code>, <code>Karma</code> and <code>Coop</code>. They justify it: "We offer a wide range of own-label brands and brand worlds." ([https://www.coop.ch/en/inspiration-gifts/labels/c/m_0788 source])
*** '''<u>Solution:</u>'''
*** For [https://world.openfoodfacts.org/product/5900497611503/peach-ice-tea-lipton example], <code>Lipton</code> belongs to <code>Unilever</code> but this product is distributed by <code>Pepsico</code>. <code>Lipton</code> used to belong to <code>Unilever</code>, and now belongs to <code>CVC Capital Partners</code>. <code>Lipton</code>'s ready-to-drink beverages belong to both <code>Unilever</code> and <code>PepsiCo</code> (which is the distributor).
*** '''<u>Solution:</u>'''
*** For [https://nl.openfoodfacts.org/product/6971674070030/精制辣汤-唯滋亲 example], <code>The Laughing Cow</code> and <code>La vache qui rit</code>.
*** '''<u>Solution:</u>''' they both belong to the same parent (<code>Bel Group</code>). They can be en_us and fr_fr.
*** For [https://en.wikipedia.org/wiki/Wall%27s_(ice_cream) example], <code>Miko/wall's ice</code> creams, called <code>Frigo</code> in Spain but <code>Bresler</code> in Chile and Bolivia.
***<u>'''Solution'''</u>: es_es:frigo and es_cl:bresler / es_bo:bresler will be on the same block
*** For example, The Coca-Cola Company sells <code>Fanta</code> in Thailand under <code>Fanta</code> and <code>แฟนต้า</code> names.
*** For example, <code>Danone</code> sells mineral water in Morocco under both <code>عين سايس</code> and <code>aïn Saïss</code> names.
*** For examples, Japanese/English (like <code>味の素</code>/<code>Ajinomoto</code>) or Chinese/English (<code>乐虎</code>/<code>Hi-Tiger</code>)
***'''<u>Solution:</u>''' th_th:แฟนต้า/fanta, แฟนต้า, fanta in the same block as en_us:fanta
** Also synonyms. Product brand has changed over time.
***<u>'''Solution'''</u>: use synonyms in the taxonomy.
** One brand is owned by two companies.
*** For example, <code>Aldi Nord</code> and <code>Aldi Süd</code> use the name <code>Aldi</code> but they operate in different areas (between countries, and inside Germany itself)
*** <u>'''Solution'''</u>: make <code>Aldi Nord</code> and <code>Aldi Süd</code> children of <code>Aldi</code>. In general, people will input <code>Aldi</code> only. However, if someone put <code>Aldi Nord</code>, it should include its parent, <code>Aldi</code> automatically.
** The name of the brand is different from the name of the shop for that brand.
*** For example, the <code>Aldi Nord</code> and <code>Aldi Süd</code> brands use the international (now unified) brand <code>Aldi</code> (https://aldi.com/), which is also the commonly used shop name for both <code>Aldi</code> brands in Germany. So the <code>Aldi Nord</code> and <code>Aldi Süd</code> brands, which are only used as shop indication brands on products (at least in Germany), have to be resolved to the common shop name <code>Aldi</code> instead of <code>Aldi Nord</code> or <code>Aldi Süd</code>. For the brand <code>Aldi Süd</code>, the shops could also be named <code>Aldi Süd</code>, as [https://www.aldi-sued.de/de/homepage.html written on the logo used inside Germany], but as the shop name of the <code>Aldi Nord</code> brand has been <code>Aldi</code> from the beginning, no one in Germany uses <code>Aldi Süd</code> as the name for these shops.
** One brand is distributed/produced locally by another brand
*** For [https://www.diffordsguide.com/beer-wine-spirits/2003/san-miguel-cerveza-uk-brewed example], the licensing of beer brands: the same brand may be exploited under a licensing agreement by different companies depending on the country (and this can change with time). My understanding is that this is because, for beer, it is a lot cheaper to produce locally than to transport internationally. In the UK, San Miguel would probably be under Carlsberg Marstons, but in Spain it would be under Mahou San Miguel, and in the Philippines or Hong Kong under the (original) San Miguel Brewery.
*** For [https://www.asahiinternational.com/stories/people/asahi-europe-international-statement/ example], <code>Asahi</code>, manufactured for the European market by its local subsidiaries (<code>Peroni</code> in Italy, for instance) but used to be licensed to <code>Carlsberg Marstons</code> or <code>AB Inbev</code> depending on the country. Some <code>Asahi</code> brands are still licensed to <code>AB InBev</code> at least for some European markets. It used to be very opaque, but recent events made that obvious.
*** In Algeria, many brands are produced under licence because local laws forbid a company that is not 51% Algerian-owned from producing locally, while the import of many products is forbidden or very limited.
***'''<u>Solution:</u>'''
* Same name shared by different brands.
** For [https://world.openfoodfacts.org/brand/san-miguel example], <code>San Miguel</code> is a beer in Spain [cervezas san miguel], food in Mexico [grupo agroindustrial san miguel], honey in Spain/France [Ramros Trading Company])
** For [https://world.openfoodfacts.org/brand/star example], <code>Star</code> is starfinefood in the U.S., Star S.p.A. in Italy, stardrinks in the United Arab Emirates, star from Ghadawat Indian airlines company,
** For [https://world.openfoodfacts.org/brand/walkers example], <code>Walkers</code> represents 3 different brands ([https://world.openfoodfacts.org/product/5000328123509/choux-de-bruxelle-walkers 1], [https://world.openfoodfacts.org/product/0039047003569/luxury-rich-fruit-cake-walkers 2] and [https://world.openfoodfacts.org/product/5011555031222/chocolate-gingers-walkers 3])
** For [https://world.openfoodfacts.org/brand/dove example], <code>Dove</code> is either a cosmetics brand (from Unilever) or a chocolate brand (from Mars)
**<u>'''Solution'''</u>: In the long term we could have a set of rules, such as: if the brand is San Miguel and the category is Beers, then rename it to Cervezas San Miguel. The same applies to food vs cosmetics.
** For [https://world.openfoodfacts.org/product/8712566099559/cornetto-enigma-chocolat-coeur-au-caramel-miko example], <code>Cornetto</code> is a sub-brand of both <code>Frigo</code> and <code>Miko</code> (see previous discussion)
** <u>'''Solution'''</u>: make <code>Cornetto</code> a child of both brands.
* Products without brands written.
** For [https://world-de.openfoodfacts.org/produkt/5711044014353/kulturchampignons-braun-limax-spolka example]. No brand. I used the company name of the packager as brand to make search preview more useful.
** For [https://world.openfoodfacts.org/product/5711044049218/cocktail-rispentomaten-harvest-house example], unbranded product of the brand <code>Harvest House</code>. The producer has a brand <code>Harvest House</code> used for other fresh tomato products, but its products also appear under [https://world.openfoodfacts.org/product/2009010223322/pemium-mini-cherry-rispentomaten-natur-lieblinge other brands].
* Brand spelling and formatting.
** Is it <code>Coop</code>, <code>coop</code>, <code>COOP</code>, <code>CO OP</code>?
**<u>'''Solution'''</u>: in the taxonomy, there is no difference between lower and upper case, and no difference between a space and a hyphen. We can make <code>coop</code> a synonym of <code>co op</code>.
** Brands containing an apostrophe or quotes are often entered without them.
*** Examples with apostrophes: [https://world.openfoodfacts.org/brand/lay-s Lay's] vs <code>Lays</code>, [https://world.openfoodfacts.org/brand/kellogg-s Kellogg's] vs <code>Kelloggs</code>, [https://world.openfoodfacts.org/brand/sar-ocean Sar'Ocean].
*** Example with quotes: <code>The Muffin "Mam" Inc</code>
***<u>'''Solution'''</u>: use synonyms in the taxonomy.
** Brands with special characters like &
*** For example for &, <code>M&M's</code>
***<u>'''Solution'''</u>: special characters should be recognized. If needed, use synonyms in the taxonomy.
** Brands with commas
*** For [https://world.openfoodfacts.org/brand/williams-west-witt-s-prods example], <code>Williams, West & Witt's Products</code>
***<u>'''Solution'''</u>: the brand should be written without the comma. It is not possible to use commas in the tags: a comma starts a new tag, so this example would result in two tags, <code>Williams</code> and <code>West & Witt's Products</code>. Alternatively, use ";" as the separator, as OpenStreetMap does.
** Brand with only numbers
*** For examples, <code>1664</code>, <code>1883</code> or <code>365</code>.
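The spelling and formatting rules discussed above (case-insensitive, space equals hyphen, commas split tags, synonyms for missing apostrophes) can be sketched as follows. This is an illustration only, not the Open Food Facts server code: the function names and the synonym table are hypothetical, and the tag format is inferred from the brand URLs quoted on this page (for example <code>lay-s</code> and <code>kellogg-s</code>).

```python
import re

# Hypothetical synonym table: maps a variant tag to the tag we treat as canonical.
# Which spelling is canonical is an assumption made for this sketch.
SYNONYMS = {
    "lays": "lay-s",
    "kelloggs": "kellogg-s",
    "coop": "co-op",
}


def brand_tag(name):
    """Lowercase the name and turn every run of non-alphanumerics into one hyphen."""
    lowered = name.strip().lower()
    return re.sub(r"[\W_]+", "-", lowered).strip("-")


def canonical_tag(name):
    """Resolve a brand name to its canonical tag using the synonym table."""
    tag = brand_tag(name)
    return SYNONYMS.get(tag, tag)


def split_brands(field):
    """Split a comma-separated brands field into individual brand names."""
    return [part.strip() for part in field.split(",") if part.strip()]


print(brand_tag("Lay's"))          # apostrophe becomes a hyphen
print(canonical_tag("Kelloggs"))   # resolved through the synonym table
print(split_brands("Williams, West & Witt's Products"))  # the comma splits the brand
```

With this kind of normalization, <code>COOP</code> and <code>coop</code> collapse to the same tag, and <code>CO OP</code> and <code>co-op</code> do as well. Linking the two groups still needs an explicit synonym, and a brand name that contains a comma is split in two unless a different separator is used.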


== Questions / Issues ==
* company structure - do we want to list (and research) all the relationships between owners, marketing companies, etc. I would suggest that we do NOT and limit ourselves to the brands and brand owner (and maybe the production company);
* company structure - do we want to list (and research) all the relationships between owners, marketing companies, etc. I would suggest that we do NOT and limit ourselves to the brands and brand owner (and maybe the production company);
* overkill - it is very tempting to add all kinds of details for owners, etc. The actual owners of the product, brand, etc. do not seem useful to the consumer. We should just provide links to third parties for this kind of information, i.e. wikipedia, wikidata, opencorporates, ipo europe, ...
** <u>'''comment'''</u>: overkill - it is very tempting to add all kinds of details for owners, etc. The actual owners of the product, brand, etc. do not seem useful to the consumer. We should just provide links to third parties for this kind of information, i.e. wikipedia, wikidata, opencorporates, ipo europe, ...
** <u>'''Decision:'''</u>
* parent brand - when should the parent brand be added? Only if the parent brand is available on the front of the packaging, or also when it is shown on the back of the packaging, or when we can find out the legal final parent owner of a brand? This choice might have an implication for how we structure the data.
* parent brand - when should the parent brand be added? Only if the parent brand is available on the front of the packaging, or also when it is shown on the back of the packaging, or when we can find out the legal final parent owner of a brand? This choice might have an implication for how we structure the data.
** '''<u>comment @benbenben:</u>''' at least having sub-brands/lines of a brand would be beneficial - for quality checks as well as monitoring number of products from the same distributor - even if the brand is not written on the package.
*** For example, <code>S-budget</code> is a child of <code>Spar</code>.
*** For example, <code>Pilos</code> is a child of <code>Lidl</code>.
** <u>'''Decision:'''</u>
* EAN manufacturer codes - is there an open database which we could use?
* EAN manufacturer codes - is there an open database which we could use?
==== Observations summary ====
Summarizing the observations noted above, we see the following kinds of brands:
*1 universal brand, exact same name used in all countries and languages. e.g. "Nutella"
*2 brand that is translated in different languages or scripts "The Laughing Cow", "La vache qui rit" ([https://nl.openfoodfacts.org/product/6971674070030/精制辣汤-唯滋亲 example]) EDIT they both belong to the same parent (Bel Group)
*3 brands that have the same name, but used in different languages
*4 brands that have the same name, but used in different countries
*5 brands that have the same name, and used in the same country. (e.g. "Ferrero" in Italy: there's also a pasta brand. EDIT pasta brand in Italy named [https://it.openfoodfacts.org/brands Industria-alimentare-'''ferraro''']).
*6 brands in non-latin scripts, which can not be latinised
*7 parent brands are sometimes shown on packaging
==== Use cases (UC below) ====
The brands taxonomy has multiple applications within Open Food Facts. These are:
*1 Display the brands of a product, in the language requested by the user;
*2 Have a way to list all products of a brand;
*3 Let users enter brands for a product, as they appear on the package (as free text);
*4 Let the user select the correct brand from a list of existing brands. If the same brand text occurs multiple times, the user must be able to select the applicable one;
*5 Suggests a brand to the user based on the manufacturer part of the barcode and other information;
*6 Infer category and labels from brand - some brands are only used for specific products. This means that the product category and possible labels can be implied (suggested);
*7 Infer brand from manufacturer code within the barcode;
*8 Barcode/Brand quality check - if the manufacturer part of the barcode does not match the specified brand, there is an error in either of them;
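Use cases 7 and 8 can be illustrated with a short sketch built on the <code>barcodeprefix</code> attribute. It is not an existing Open Food Facts function: the names <code>infer_brand</code> and <code>check_brand</code> and the sample barcode are hypothetical. Only the prefix <code>8410226</code> and the key <code>laespanola</code> come from the example on this page. Since the page leaves the prefix length open ("8(?)"), the sketch tries the longest prefix first.

```python
# Prefix table built from barcodeprefix attributes in the taxonomy.
# Only this one entry comes from the page; a real table would be much larger.
PREFIXES = {"8410226": "laespanola"}


def infer_brand(barcode, prefixes):
    """Return the brand key whose prefix matches the barcode, longest prefix first."""
    for length in range(len(barcode), 0, -1):
        key = prefixes.get(barcode[:length])
        if key is not None:
            return key
    return None


def check_brand(barcode, declared_key, prefixes):
    """Compare the declared brand with the one inferred from the barcode."""
    inferred = infer_brand(barcode, prefixes)
    if inferred is None:
        return "unknown"   # no prefix known for this barcode
    return "ok" if inferred == declared_key else "mismatch"


# Hypothetical barcode that starts with the prefix from the example.
sample_barcode = "8410226000017"
print(infer_brand(sample_barcode, PREFIXES))
print(check_brand(sample_barcode, "laespanola", PREFIXES))
print(check_brand(sample_barcode, "someotherbrand", PREFIXES))
```

A "mismatch" result only says that either the barcode or the brand is wrong, as noted in use case 8. Deciding which one needs a human check.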
==== Design considerations ====
The '''observations''' and '''use cases''' lead to several design considerations:
# Unique brand key - as the same brand (string) can exist in multiple geographic areas or within the same geographic area, there must be a way to uniquely distinguish between the various brands. Otherwise, a user can not enter the correct brand (UC1), nor can we list all brands (UC2);
# User-selectable brand - there must be a brand name in the same language/script as the package. If that name occurs multiple times in the taxonomy, it must be qualified by product category, country sold, etc. in order to make it unique. For instance, the label '''Taste''' occurs in France and Argentina, but is used for different categories. So the user should have the choice between '''Taste (category 1 - France)''' and '''Taste (category 2 - Argentina)'''. Maybe this can be mixed with the key. (UC4). EDIT: there are no '''Taste''' brands in either [https://fr.openfoodfacts.org/brands France] or [https://ar.openfoodfacts.org/brands Argentina].
# Language/script specific brands - a way to code a single brand in multiple scripts and/or languages, for example in Chinese, Arabic and English (UC1)
# Language independent brand - a way to encode a brand that is valid for multiple languages (UC1);
=== Brands and Robotoff (AI) ===
To prevent the OFF AI from detecting some false brands, see [https://github.com/openfoodfacts/robotoff/blob/master/data/ocr/brand_taxonomy_blacklist.txt brand_taxonomy_blacklist.txt].
==== Help to collect brands ====
The AI of Open Food Facts, called [[Artificial Intelligence|Robotoff]], tries to identify brands in several ways. We use OCR in combination with known values. We also have another technique that uses pure computer vision and annotations to create clusters of brands by visual similarity. The annotations made by Robotoff are presented to users as simple questions to answer. They are also used by [https://hunger.openfoodfacts.org/questions?type=brand Hunger Game]. Everyone can use Hunger Game, but be mindful of the issues mentioned on this page.
* on 2020-10-29, there were 43200 annotations and 22748 remaining
* on 2020-11-17, there were 46911 annotations and 22720 remaining
* on 2020-11-17, there were ? and 57879 remaining
==== Known challenges ====
Some brand names are common nouns, expressions or ingredients, which produces false positives with the AI tool. For example:
* [https://world.openfoodfacts.org/brand/racines Racines] (means "roots" in French)
*[https://world.openfoodfacts.org/brand/la-truffe La Truffe] (means "the truffle" in French)
*[https://world.openfoodfacts.org/brand/pure-protein Pure Protein]
*[https://world.openfoodfacts.org/brand/best-choice Best Choice] is a brand, not a tagline
*[https://world.openfoodfacts.org/brand/great-value Great Value] also
*[https://world.openfoodfacts.org/brand/the-belgian The Belgian], often entered <code>Belgian</code>; it produces many false positives from our AI.
=== External brands databases ===
* https://branddb.wipo.int/en/similarname/brand/FR501998098726080?sort=score%20desc&start=0&rows=30&asStructure=%7B%22_id%22:%22de0a%22,%22boolean%22:%22AND%22,%22bricks%22:%5B%7B%22_id%22:%22de0b%22,%22key%22:%22brandName%22,%22value%22:%22miko%22,%22strategy%22:%22Simple%22%7D,%7B%22_id%22:%22de0f%22,%22key%22:%22niceClass%22,%22value%22:%5B%7B%22value%22:%2243%22,%22label%22:%2243%22,%22label2%22:%2243%20-%20Services%20for%20providing%20food%20and%20drink;%20temporary%20accommodation%22,%22score%22:72,%22highlighted%22:%2243%20-%20Services%20for%20providing%20%3Cem%3Efood%3C%2Fem%3E%20and%20drink;%20temporary%20accommodation%22%7D%5D,%22strategy%22:%22all_of%22%7D%5D%7D&fg=_void_&_=1724426189605&i=18
[[Category:Fields]]
[[Category:Brands]]

Latest revision as of 15:19, 23 August 2024

Help annotate brands using Hunger Games

Suggested solution for the taxonomy

Short-term solution

Make brands language-less

If a brand has a translated name in some countries, then we consider it to be a different brand (in a lot of cases the products sold under those translated names will be different).

For products that display the brand in only one language, we just record that brand, and we display it as-is.


Mid-term solution (to be defined/work in progress)

The taxonomy for brands should have the following features:

  • each brand is defined by a block of text; blocks are separated by blank lines;
  • use a four-letter code joined by an underscore - <iso_639_language_code>_<iso_3166_country_code> - for the key. Examples: fr_be, de_at.
  • each line defines either the brand in a specific language and country, or an attribute. Each line starts with the key name, followed by a colon and the parameters.
  • a block can contain (mind the order):
    • (optional) a reference to a parent defined by "<" + the key (as described above) and the first parameter of an existing brand. For example "<ab_cd:brand1" refers to the parent key "ab_cd:brand1, synonym_of_brand1". Remark: any attributes shared between the parent and the child only have to be specified in the parent, i.e., they are inherited by the child. Remark: if a reference to a parent is provided it should be the first line of the block.
    • (required) a key (as described above). For example: "mn_op:brand 1". It should be unique, so that we can distinguish between brands with the same name.
    • (optional) a default key (xx_xx:) to be used for any language that is not specifically listed. It is always xx_xx. Remark: if it is provided it should be the last line after all other languages.
    • (optional) attributes - each brand can have one or more attributes. Remark: if provided, attributes should be the last lines, after all other languages and the default key (xx_xx):
      • wikipedia:en: - the full link to a page that explains the brand(!!!). Note that brands are not well covered on Wikipedia;
      • wikidata:en: - the identification of the Wikidata entry for the brand(!!). Note that brands are not well covered on Wikidata;
Example

de_de:lidl
de_at:lidl
fr_fr:lidl
fr_be:lidl
hr_hr:lidl
xx_xx:lidl
wikipedia:en:https://en.wikipedia.org/wiki/Lidl
wikidata:Q151954

<de_de:lidl
de_de:snack day
xx_xx:lidl


In this example, we put de_de first because Lidl is a German company. Nevertheless, it should work the same when starting with a different language, provided that the children refer to the parent accordingly.
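The block format described above could be read with a small parser. The following is only a sketch of one possible reading of the proposal, not an existing Open Food Facts tool: the names `BrandBlock`, `parse_brand_blocks`, `ATTRIBUTE_NAMES` and `SAMPLE` are hypothetical, and it assumes that the first non-default language line of a block acts as its key.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Attribute prefixes named in the proposal; every other prefix is
# treated as a <language>_<country> key (assumption of this sketch).
ATTRIBUTE_NAMES = {"wikipedia", "wikidata"}
DEFAULT_KEY = "xx_xx"


@dataclass
class BrandBlock:
    key: Optional[str] = None      # e.g. "de_de:lidl"
    parent: Optional[str] = None   # content of the "<..." line, if any
    names: Dict[str, List[str]] = field(default_factory=dict)
    attributes: Dict[str, str] = field(default_factory=dict)


def parse_brand_blocks(text: str) -> List[BrandBlock]:
    """Split the text on blank lines and parse each block line by line."""
    if not text.strip():
        return []
    blocks = []
    for chunk in re.split(r"\n\s*\n", text.strip()):
        block = BrandBlock()
        for line in chunk.splitlines():
            line = line.strip()
            if not line:
                continue
            if line.startswith("<"):
                # Reference to the parent brand.
                block.parent = line[1:]
                continue
            prefix, _, rest = line.partition(":")
            if prefix in ATTRIBUTE_NAMES:
                block.attributes[prefix] = rest
            else:
                # Name followed by optional comma-separated synonyms.
                block.names[prefix] = [n.strip() for n in rest.split(",")]
                if block.key is None and prefix != DEFAULT_KEY:
                    block.key = line
        blocks.append(block)
    return blocks


SAMPLE = """\
de_de:lidl
fr_fr:lidl
xx_xx:lidl
wikidata:Q151954

<de_de:lidl
de_de:snack day
xx_xx:lidl
"""

blocks = parse_brand_blocks(SAMPLE)
```

Here `blocks[1].parent` holds the key of the first block, which is how a child such as snack day would be linked back to Lidl.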

Long-long-term solution (to be defined/work in progress)

Same as before, adding:

  • (optional) attributes
    • barcodeprefix:en: - the first 8(?) digits of the barcode that belong to the brand. With this, brands can be automatically assigned to a barcode. The barcodes of existing products can also be checked;
    • brand_owner_opencorporates:xx: - an identification of the probable brand owner/distributor on opencorporates. It is not always clear what should be written down here.
    • idea: local customer service address (one for each country where the product is distributed, and a generic one)
    • idea: eu trademark - maybe only useful for name trademarks? (not logos). But not relevant to the consumer.
    • idea: stores - the stores where the brand is sold. Maybe better to create a store taxonomy and link from there to brands?
Example

zz:laespanola

xx:La Española
barcodeprefix:en:8410226
barcodeprefix:en:8410660
category:en en:Olive tree products
brand_owner_opencorporates:ES:80245129
#94 products @2022-04-23

zz:latrappe
xx:La Trappe
barcodeprefix:en:8711406
category:en en:belgian-beers
label:en: en:authentic-trappist-product
website:nl:https://nl.latrappetrappist.com/nl/nl.html
brand_owner_opencorporates:BE:0838355558
wikidata:en:Q590921

#11 products @2022-04-23

Q&A

  • A product has more than one brand.
    • Product is part of a bigger brand or group.
      • For example, Actimel is sold by Danone.
      • For example, By Sainsbury's is sold by Sainsbury's.
      • Solution: make Actimel a child of Danone in this example.
    • Synonyms. Product is sold with 2 brands, each for different language/countries.
      • For example, both Nature Valley and Val Nature can be written on the same product, but not always.
      • Solution: make Val Nature a child of Nature Valley in this example because Nature Valley is the original U.S. brand (reference).
      • For example, No Name and Sans nom can be written on the same product, but not always.
      • Solution: make No Name, No Name Sans nom and Sans nom three separate children of Generic brand in this example because they are all lines of Generic brand, which is a Canadian brand (reference). Or make the three of them synonyms of each other and children of Generic brand.
      • For example, Coop, in Switzerland, can add up to three brands on a product: Betty Bossi, Karma and Coop. They justify it: "We offer a wide range of own-label brands and brand worlds." (source)
      • Solution:
      • For example, Lipton belongs to Unilever but this product is distributed by PepsiCo. Lipton used to belong to Unilever, and now belongs to CVC Capital Partners. Lipton's ready-to-drink beverages belong to both Unilever and PepsiCo (which is the distributor).
      • Solution:
      • For example, The Laughing Cow and La vache qui rit.
      • Solution: they both belong to the same parent (Bel Group). They can be en_us and fr_fr.
      • For example, Miko/Wall's ice creams, called Frigo in Spain but Bresler in Chile and Bolivia.
      • Solution: es_es:frigo and es_cl:bresler / es_bo:bresler will be in the same block.
      • For example, The Coca-Cola Company sells Fanta in Thailand under the names Fanta and แฟนต้า.
      • For example, Danone sells mineral water in Morocco under both the عين سايس and Aïn Saïss names.
      • For example, Japanese/English (like 味の素/Ajinomoto) or Chinese/English (乐虎/Hi-Tiger).
      • Solution: th_th:แฟนต้า/fanta, แฟนต้า, fanta in the same block as en_us:fanta
    • Also synonyms. Product brand has changed over time.
      • Solution: use synonyms in the taxonomy.
    • One brand is owned by two companies.
      • For example, Aldi Nord and Aldi Süd use the name Aldi but they operate in different areas (between countries, and inside Germany itself)
      • Solution: make Aldi Nord and Aldi Süd children of Aldi. In general, people will input Aldi only. However, if someone put Aldi Nord, it should include its parent, Aldi automatically.
    • The name of the brand is different from the name of the shops of that brand.
      • For example, the Aldi Nord and Aldi Süd brands use the international (now unified) brand Aldi (https://aldi.com/), which is also the commonly used shop name for both Aldi brands in Germany. So the Aldi Nord and Aldi Süd brands, which are only used as shop indication brands on products (at least in Germany), have to be resolved to the common shop name Aldi instead of Aldi Nord or Aldi Süd. For the brand Aldi Süd, the shops could also be named Aldi Süd, as written on the logo used inside Germany, but as the shop name of the Aldi Nord brand was always Aldi from the beginning, no one in Germany uses Aldi Süd as a name for these shops.
    • One brand is distributed/produced locally by another brand
      • For example, the licensing of beer brands: the same brand may be exploited under a licensing agreement by different companies depending on the country (and this can change with time). My understanding is that this is because, for beer, it is a lot cheaper to produce locally than to transport internationally. In the UK San Miguel would probably be under Carlsberg Marstons, but in Spain it would be under Mahou San Miguel and in the Philippines or Hong Kong under the (original) San Miguel Brewery.
      • For example, Asahi, manufactured for the European market by its local subsidiaries (Peroni in Italy, for instance) but used to be licensed to Carlsberg Marstons or AB Inbev depending on the country. Some Asahi brands are still licensed to AB InBev at least for some European markets. It used to be very opaque, but recent events made that obvious.
      • In Algeria, many brands are produced under license because local laws forbid a company that is not 51% Algerian-owned from producing locally, while imports of many products are forbidden or very limited.
      • Solution:
  • Same name shared by different brands.
    • For example, San Miguel is a beer in Spain [cervezas san miguel], food in Mexico [grupo agroindustrial san miguel], honey in Spain/France [Ramros Trading Company])
    • For example, Star is starfinefood in the U.S., Star S.p.A. in Italy, stardrinks in the United Arab Emirates, and star from the Ghadawat Indian airlines company.
    • For example, Walkers represents 3 different brands (1, 2 and 3)
    • For example, Dove is either cosmetic (from Unilever) or chocolates brand (from Mars)
    • Solution: in the long term, we could have a set of rules such as: if the brand is San Miguel and the category is Beers, then rename it to Cervezas San Miguel. Same for food vs cosmetics.
    • For example, Cornetto is a sub-brand of both Frigo and Miko (see previous discussion)
    • Solution: make Cornetto child of both brands.
  • Products without brands written.
    • For example, a product with no brand: I used the company name of the packager as the brand to make the search preview more useful.
    • For example, an unbranded product of the brand Harvest House. The producer has a brand, Harvest House, used for other fresh tomato products, but its products also appear under other brands.
  • Brand spelling and formatting.
    • Is it Coop, coop, COOP, CO OP?
    • Solution: in the taxonomy, there is no difference between lower and upper case, and no difference between space and hyphen. We can make coop a synonym of co op.
    • Brands containing an apostrophe or quotes are often entered without them.
      • For example, for apostrophes: Lay's vs Lays, Kellogg's vs Kelloggs, Sar'Ocean.
      • For example, for quotes: The Muffin "Mam" Inc
      • Solution: use synonyms in the taxonomy.
    • Brands with special characters like &
      • For example, for &: M&M's
      • Solution: special characters should be recognized. Eventually, use synonyms in the taxonomy.
    • Brands with commas
      • For example, for commas: Williams, West & Witt's Products
      • Solution: the brand should be written without the comma. It is not possible to use commas in the tags (if you write a comma, it starts a new tag; this example would result in two tags: Williams and West & Witt's Products). Alternatively, use ";" as the separator, as OpenStreetMap does.
    • Brand with only numbers
      • For example, 1664, 1883 or 365.
  • company structure - do we want to list (and research) all the relationships between owners, marketing companies, etc.? I would suggest that we do NOT, and limit ourselves to the brands and brand owner (and maybe the production company);
    • comment: overkill - it is very tempting to add all kinds of details for owners, etc. The actual owners of the product, brand, etc., do not seem useful to the consumer. We should just provide links to third parties for this kind of information, i.e. wikipedia, wikidata, opencorporates, ipo europe, ...
    • Decision:
  • parent brand - when should the parent brand be added? Only if the parent brand is available on the front of the packaging, or also when it is shown on the back of the packaging, or when we can find out the legal final parent owner of a brand? This choice might have an implication for how we structure the data.
    • comment @benbenben: at least having sub-brands/lines of a brand would be beneficial - for quality checks as well as monitoring number of products from the same distributor - even if the brand is not written on the package.
      • For example, S-budget is a child of Spar.
      • For example, Pilos is a child of Lidl.
    • Decision:
  • EAN manufacturer codes - is there an open database which we could use?
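
The spelling rules proposed above (no difference between upper and lower case, no difference between space and hyphen, synonyms for variants such as Lays or co op) can be sketched as follows. This is an illustration under assumptions, not the actual taxonomy code: `normalize_brand`, `canonical_brand` and the `SYNONYMS` table are hypothetical names, and the synonym entries are taken from the examples in this section.

```python
import re

# Hypothetical synonym table: normalized variant -> canonical entry.
SYNONYMS = {
    "co-op": "coop",
    "lays": "lay's",
    "kelloggs": "kellogg's",
}


def normalize_brand(name: str) -> str:
    """Ignore case and treat runs of spaces and hyphens as one hyphen."""
    text = name.strip().casefold()
    return re.sub(r"[\s\-]+", "-", text)


def canonical_brand(name: str) -> str:
    """Normalize, then resolve known synonyms to their canonical entry."""
    key = normalize_brand(name)
    return SYNONYMS.get(key, key)
```

With this table, Coop, COOP and CO OP all resolve to the same entry, and Lays resolves to lay's, while special characters such as the & in M&M's are kept as entered.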

Observations summary

Summarizing the observations noted above, we see the following kinds of brands:

  • 1 universal brand, exact same name used in all countries and languages. e.g. "Nutella"
  • 2 brand that is translated in different languages or scripts "The Laughing Cow", "La vache qui rit" (example) EDIT they both belong to the same parent (Bel Group)
  • 3 brands that have the same name, but used in different languages
  • 4 brands that have the same name, but used in different countries
  • 5 brands that have the same name, and used in the same country. (e.g. "Ferrero" in Italy: there's also a pasta brand. EDIT pasta brand in Italy named Industria-alimentare-ferraro).
  • 6 brands in non-Latin scripts, which cannot be latinised
  • 7 parent brands are sometimes shown on packaging

Use cases (UC below)

The brands taxonomy has multiple applications within Open Food Facts. These are:

  • 1 Display the brands of a product, in the language requested by the user;
  • 2 Have a way to list all products of a brand;
  • 3 Let users enter brands for a product, as they appear on the package (as free text);
  • 4 Let the user select the correct brand from a list of existing brands. If the same brand text occurs multiple times, the user must be able to select the applicable one;
  • 5 Suggest a brand to the user based on the manufacturer part of the barcode and other information;
  • 6 Infer category and labels from brand - some brands are only used for specific products. This implies that the product category and possible labels can be implied (suggested);
  • 7 Infer brand from manufacturer code within the barcode;
  • 8 Barcode/Brand quality check - if the manufacturer part of the barcode does not match the specified brand, there is an error in either of them;
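
Use cases 5, 7 and 8 could be served by a simple prefix lookup. The sketch below assumes a table built from the barcodeprefix attribute proposed earlier, using the La Española and La Trappe prefixes from the example; the function names and the sample barcodes in the usage lines are made up for illustration.

```python
from typing import Dict, List

# Prefixes copied from the barcodeprefix lines of the example above.
BRAND_PREFIXES: Dict[str, List[str]] = {
    "La Española": ["8410226", "8410660"],
    "La Trappe": ["8711406"],
}


def infer_brands(barcode: str) -> List[str]:
    """Return every brand whose known prefix matches the barcode (UC5, UC7)."""
    return [
        brand
        for brand, prefixes in BRAND_PREFIXES.items()
        if any(barcode.startswith(prefix) for prefix in prefixes)
    ]


def brand_matches_barcode(barcode: str, brand: str) -> bool:
    """Quality check (UC8): False only when the prefix is known and
    points to a different brand. Unknown prefixes cannot be checked,
    so they pass."""
    candidates = infer_brands(barcode)
    return not candidates or brand in candidates


# Hypothetical barcodes, used only to exercise the functions.
suggested = infer_brands("8711406000001")
consistent = brand_matches_barcode("8410660000001", "La Trappe")
```

A failed check only says that either the barcode or the brand is wrong; it does not say which one.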

Design considerations

The observations and use cases lead to several design considerations:

  1. Unique brand key - as the same brand (string) can exist in multiple geographic areas or within the same geographic area, there must be a way to uniquely distinguish between the various brands. Otherwise, a user can not enter the correct brand (UC1), nor can we list all brands (UC2);
  2. User-selectable brand - there must be a brand name in the same language/script as the package. If that name occurs multiple times in the taxonomy, it must be qualified by product category, country sold, etc. in order to make it unique. For instance, suppose the brand Taste occurs in France and Argentina, but is used for different categories. The user should then have the choice between Taste (category 1 - France) and Taste (category 2 - Argentina). Maybe this can be mixed with the key. (UC4). EDIT: there are no Taste brands in either France or Argentina.
  3. Language/script specific brands - a way to code a single brand in multiple scripts and/or languages, for example in Chinese, Arabic and English (UC1)
  4. Language independent brand - a way to encode a brand that is valid for multiple languages (UC1);
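
The disambiguation described in the second consideration could look like the sketch below. The entries are invented for illustration (the page itself notes that no Taste brand exists in France or Argentina), and `selectable_labels` is a hypothetical helper, not part of any existing Open Food Facts code.

```python
from collections import Counter
from typing import List, Tuple

# (brand name, category, country); hypothetical data.
Entry = Tuple[str, str, str]


def selectable_labels(entries: List[Entry]) -> List[str]:
    """Qualify a name with category and country only when it is ambiguous."""
    counts = Counter(name.casefold() for name, _, _ in entries)
    labels = []
    for name, category, country in entries:
        if counts[name.casefold()] > 1:
            labels.append(f"{name} ({category} - {country})")
        else:
            labels.append(name)
    return labels


entries = [
    ("Taste", "Snacks", "France"),
    ("Taste", "Beverages", "Argentina"),
    ("Nutella", "Spreads", "Italy"),
]
labels = selectable_labels(entries)
```

Names that occur only once stay unqualified, so the selection list remains short for the common case.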

