Multilingual products
This page discusses how to handle products whose labels carry text in more than one language.
Problems addressed
- Single products with more than one language on the label
- Bilingual products
- Products where all the texts on the label are in two (or more) languages
- Common in bilingual countries (Belgium, Canada...)
- e.g. most Cora products have all the text in French and in Dutch: http://world.openfoodfacts.org/brand/cora
- Partially multilingual products
- Products with some information in multiple languages
- e.g. ingredients and nutrition facts in many languages
Note that the information can differ between languages (this happens on some beer labels).
Problems not addressed
- Products that have different labels in different languages, with the same barcode
- will be addressed by Project:Product versions and history
- Products that have different labels in different languages, with a different barcode
- we simply store them as different products
- we might consider linking the products in some way in a future project
- Translating ingredients
- will be addressed by ingredients taxonomy
Current status
Data entry
- We currently have a field to indicate the "main" language of a product
- The intention of this field was to put the language that is the most prominent on the label.
- There are a few products where the split is 50% / 50% but those are not very common.
- Values entered in the product edit form are considered to be in the "main" language of the product
- Values entered in fields for which we have a taxonomy (categories, countries, labels, traces) are mapped according to the "main" language of the product
- e.g. entering "jus de fruits" in the categories field when the main language is set to French results in en:fruit-juices being assigned.
- Values for nutrition facts are global and assigned to a canonical field
- Except when the nutrient's name is unknown (i.e. not in our current nutrient taxonomy)
- The default nutrients shown depend on the country (EU nutrition table vs US/CA nutrition table)
- Images are selected/cropped for the "main" language of the product (product front, ingredients and nutrition facts)
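The taxonomy mapping at data entry can be sketched as follows. This is a minimal illustration, not the actual implementation: the `SYNONYMS` table, the function names, the normalization rule, and the handling of unknown values are all assumptions made for the example. Only the "jus de fruits" to en:fruit-juices mapping comes from this page.

```python
# Hypothetical synonym table: (language, normalized name) -> canonical tag id.
SYNONYMS = {
    ("fr", "jus-de-fruits"): "en:fruit-juices",
    ("en", "fruit-juices"): "en:fruit-juices",
}


def normalize(value: str) -> str:
    """Trim, lowercase, and join words with hyphens (assumed normalization rule)."""
    return "-".join(value.strip().lower().split())


def canonicalize_tag(value: str, main_lang: str) -> str:
    """Map a value typed in the product's main language to a canonical tag id.

    Assumption: values missing from the table are kept, prefixed with the
    main language code, so that no user input is lost.
    """
    key = normalize(value)
    return SYNONYMS.get((main_lang, key), f"{main_lang}:{key}")


def canonicalize_field(raw: str, main_lang: str) -> list[str]:
    """Split a comma-separated tag field and canonicalize each non-empty entry."""
    return [canonicalize_tag(v, main_lang) for v in raw.split(",") if v.strip()]


print(canonicalize_field("Jus de fruits, Boissons", "fr"))
```

The same input typed with a different main language would map differently, which is why the "main" language has to be known before tag fields are processed.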
Data display
- Fields for which we have a taxonomy (categories, countries, labels, traces) are displayed in the target language. The target language is set by the subdomain: on http://world.openfoodfacts.org categories are shown in English even for French products, while on http://es.openfoodfacts.org they are shown in Spanish.
- Only if the category exists in the taxonomy
- Only if the target language exists for this category in the taxonomy
- If the target language doesn't exist, English is used to display the field value
- Nutrition facts are displayed in the target language
- The nutrition facts are displayed according to the country (different order and presentation for EU nutrition tables vs US/CA nutrition tables)
- Fields without a taxonomy are displayed in the language they were entered in
- common name, quantity, packaging, brands, origin of ingredients, manufacturing or processing places, city/state/country, stores, link to the product page, best before date
- ingredients
- Images for product front, ingredients and nutrition facts are for the main language of the product (and not the target language)
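The display rules for taxonomized fields amount to a fallback chain: target language, then English, then the value as stored. A minimal sketch, assuming a hypothetical `TAXONOMY` dictionary and the simplification that the subdomain is itself a language code (apart from `world`, which maps to English):

```python
# Hypothetical taxonomy excerpt: canonical tag id -> display names per language.
TAXONOMY = {
    "en:fruit-juices": {"en": "Fruit juices", "fr": "Jus de fruits"},
}


def target_lang_from_host(host: str) -> str:
    """Derive the target language from the subdomain (simplified assumption)."""
    sub = host.split(".")[0].lower()
    return "en" if sub == "world" else sub


def display_tag(tag: str, target_lang: str) -> str:
    """Return the name to show for a tag.

    Order: target language, then English, then the tag exactly as stored.
    """
    names = TAXONOMY.get(tag)
    if names is None:
        # The tag is not in the taxonomy: show the stored value unchanged.
        return tag
    return names.get(target_lang) or names.get("en") or tag


lang = target_lang_from_host("es.openfoodfacts.org")
print(lang, display_tag("en:fruit-juices", lang))
```

Here the taxonomy entry has no Spanish name, so the English name is used, as described in the list above.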
Solution
Data entry
- Taxonomize as many fields as possible
- Taxonomized fields need to be completed in only one language
- Packaging, origins of ingredients, purchase places
- Note: packaging and purchase places do not correspond to text on the label
- Enable users to enter data for more than one language:
- Select different images, or different parts of the images, for product front, ingredients and nutrition facts
- Enter data (text) for more than one language
- We keep the "main language" field
- Indicates the most prominent language on the label
- When the split is 50% / 50% (e.g. some products sold in Belgium with one side in Dutch and another side in French), picking either is fine.
Data display
- Taxonomized fields will be displayed in the target language
- Display other fields and pictures in the target language if the product has data in that language
- Indicate in which languages the product data is available, and provide a way to see it
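The proposed display behaviour for free-text fields could look like the sketch below. It relies on the per-language hashes (`*_lang`) and the `languages` hash proposed under Technical design. The `lang` key for the main language and both function names are assumptions made for illustration.

```python
def pick_text(product: dict, field: str, target_lang: str) -> tuple[str, str]:
    """Return (text, language) for a free-text field.

    Preference order: target language, main language, legacy single-language field.
    """
    by_lang = product.get(f"{field}_lang", {})
    main_lang = product.get("lang", "")  # assumed name of the "main language" field
    for lang in (target_lang, main_lang):
        if by_lang.get(lang):
            return by_lang[lang], lang
    return product.get(field, ""), main_lang


def available_languages(product: dict) -> list[str]:
    """List the languages for which the product has data, for a language switcher."""
    return sorted(product.get("languages", {}))


product = {
    "lang": "fr",
    "ingredients_text": "Chocolat, lait",
    "ingredients_text_lang": {"en": "Chocolate, milk", "fr": "Chocolat, lait"},
    "languages": {"en": "1", "fr": "1"},
}
print(pick_text(product, "ingredients_text", "en"))
print(available_languages(product))
```

Returning the language alongside the text lets the page tell the reader when the text shown is not in the target language.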
Interface design
- How can we keep data entry from becoming overwhelming?
- Solution 1: tabs to switch between languages
- + button to add a language tab
- tab displays only the fields that need to be completed in other languages:
- images
- fields like generic name, ingredients
- Solution 2: repeat each field once per language, and display all languages of a field together
- + button to dynamically add a new language
Technical design
Taxonomize all or most multilingual tag fields
- Packaging, origins of ingredients and purchase places are tag fields (they contain comma separated values) that are not yet taxonomized
- The existing taxonomy system will take care of the mapping between different languages
- The taxonomies need to be created or completed
- To be determined: what to do with brands?
- Typically not translated
- But exceptions exist
- Could benefit from being taxonomized in order to have a hierarchy
Selection/crop of images in more than one language
- For product front, ingredients and nutrition facts
Entry of fields in more than one language
- Keep all existing fields as-is
- including the "main language" field
- e.g. ingredients, generic_name etc.
- "ingredients_text":"Chocolate, milk"
- For compatibility and to enable incremental implementation and deployment
- Create new fields suffixed with _lang, each holding a hash keyed by language code
- "ingredients_text_lang":{"en":"Chocolate, milk", "fr":"Chocolat, lait"}
- Create a new languages field
- Hash that contains languages for which we have values for at least one field
- "languages":{"en":"1", "fr":"1"}
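A minimal sketch of how a write could keep the three structures consistent. It assumes that the legacy field mirrors the main-language value (one way to preserve compatibility, not a decision recorded on this page). The function names are hypothetical.

```python
LANG_SUFFIX = "_lang"


def compute_languages(product: dict) -> dict:
    """Collect every language with a non-empty value in at least one *_lang field."""
    found = set()
    for key, value in product.items():
        if key.endswith(LANG_SUFFIX) and isinstance(value, dict):
            found.update(lang for lang, text in value.items() if text)
    return {lang: "1" for lang in sorted(found)}


def set_text(product: dict, field: str, lang: str, text: str, main_lang: str) -> None:
    """Store text for one language and refresh the derived fields.

    Assumption: the legacy field (e.g. "ingredients_text") is updated only
    when the edited language is the product's main language.
    """
    product.setdefault(field + LANG_SUFFIX, {})[lang] = text
    if lang == main_lang:
        product[field] = text
    product["languages"] = compute_languages(product)


product = {}
set_text(product, "ingredients_text", "en", "Chocolate, milk", main_lang="en")
set_text(product, "ingredients_text", "fr", "Chocolat, lait", main_lang="en")
print(product)
```

Deriving `languages` from the `*_lang` hashes on every write avoids the two getting out of sync. Existing readers that only know `ingredients_text` keep working unchanged.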