Open Food Facts API Version 2
Latest revision as of 15:31, 7 March 2024
IMPORTANT: for more up-to-date information, see https://openfoodfacts.github.io/openfoodfacts-server/api/
Introduction
Version 1 of the Open Food Facts API has been developed organically since 2012, often according to specific needs (in particular the OFF app), and it has been modeled on the structure of the MongoDB database (which itself has evolved organically) and on the Web site (e.g. reusing or mimicking existing CGI forms).
As a result, version 1 of the API is not as standard / simple / easy / powerful / intuitive / documented etc. as it could be.
This page is to discuss the design of a better version 2 of the API.
Desired properties wishlist
What standards should the new API follow, what overall properties should it have, etc.?
You can add what you want, and we can discuss it.
- Documentation that matches the implementation
- Non-breaking changes
- A direct mapping to the MongoDB format
- Consistent types for all fields - see https://github.com/openfoodfacts/openfoodfacts-server/issues/4389
- Consistent call parameters across all APIs
- Call and result independence - no need to know the call to interpret the result
- Multilingual support
- Results that make clear what is original, what is processed, and how
Related Issues
Inspiration for a new API can be found in the issues with the API label.
- https://github.com/openfoodfacts/openfoodfacts-server/issues/4389
- https://github.com/openfoodfacts/openfoodfacts-server/pull/4436
Generic requirements
These requirements apply to all OFF APIs, whether read or write.
Field names
For the naming of the fields of the result JSON the following principles are used:
- Typing - the expected type of the values is encoded in the field names, by a suffix at the end:
  - _tags - expect a String array;
  - _n - expect an Integer;
  - _t - expect a time in seconds since 1970 (Unix time), encoded as Integer;
  - _f - expect a float;
  - none - the default will be a String.
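The suffix convention above can be checked mechanically. The following is a minimal sketch, assuming a client wants to validate a decoded result against it; the names `SUFFIX_TYPES`, `expected_type` and `matches_convention`, and the sample field names in the usage note, are hypothetical and not part of any OFF API.

```python
# Hypothetical helper: infer the expected Python type from a field-name suffix.
SUFFIX_TYPES = {
    "_tags": list,  # String array
    "_n": int,      # Integer
    "_t": int,      # time in seconds since 1970, encoded as Integer
    "_f": float,    # float
}


def expected_type(field_name: str) -> type:
    """Return the type implied by the suffix; String is the default."""
    for suffix, expected in SUFFIX_TYPES.items():
        if field_name.endswith(suffix):
            return expected
    return str


def matches_convention(field_name: str, value: object) -> bool:
    """Check whether a value has the type its field name promises."""
    expected = expected_type(field_name)
    if expected is list:
        return isinstance(value, list) and all(isinstance(v, str) for v in value)
    if expected is int:
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, int) and not isinstance(value, bool)
    return isinstance(value, expected)
```

For example, `matches_convention("additives_n", "3")` is false because the suffix promises an Integer while the value is a String, which is the kind of inconsistency the convention is meant to rule out.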
Multilingual support
Multilingual support is a confusing subject, as language plays a role in several places, and each can be different:
- The user interface of the user (application or computer);
- The desired language of the user;
- The main product language;
- Other languages mentioned on the product.
Parameters
As it is undesirable (too large) to present all possible languages, there must be a way to filter the output, by setting parameters:
- lc= could specify the language desired by the user. This could match their interface, the language they speak, the language they want to show to friends, etc. It should be independent of the actual product. The default is English. Any field that OFF can translate will be translated; otherwise the main language of the product is used. The output provided by OFF should make clear which language is provided.
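The fallback behaviour described for lc= could look roughly like the sketch below. It assumes the available texts of a field are held in a dictionary keyed by language code; the function name `resolve_field` and its arguments are hypothetical, and the returned language code is what lets the output state which language is actually provided.

```python
def resolve_field(values_by_language: dict, main_language: str, lc: str = "en") -> tuple:
    """Return (language_code, text) for one language-dependent field.

    The requested language is used when a value exists for it; otherwise
    the main language of the product is used (assumed to be present).
    """
    if lc in values_by_language:
        return lc, values_by_language[lc]
    return main_language, values_by_language[main_language]
```

A caller asking for Dutch on a product that only has French text would get `("fr", ...)` back, so an application can show the text while still telling the user it is not in the requested language.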
It is also possible to limit the product languages provided, e.g. only products whose main language is Dutch, or products that include Dutch. This falls under either search or filtering.
Parameters are also needed when writing a field in a specific language, say for updating a product name in Spanish. This is especially an issue when multiple languages are involved. The options are:
1. Use a default set by another parameter (lc="en"), so that any language-dependent field is taken to be in English;
2. Use adapted field names, i.e. specify ingredients_en, ingredients_es, as needed;
3. Use an explicit encoding for fields that can comprise multiple languages, e.g. labels="en:bottle, fr:bouteille à recycler, nl:glasbak".
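To illustrate the explicit encoding in option 3, here is a minimal sketch of how a server or client might split such a value into (language, text) pairs. The function name `parse_multilingual` is hypothetical, and the sketch assumes that the texts themselves contain no commas, which the proposal does not address.

```python
def parse_multilingual(value: str) -> list:
    """Split 'en:bottle, fr:bouteille à recycler' into (language, text) pairs."""
    pairs = []
    for item in value.split(","):
        # Only the first colon separates the language code from the text.
        code, _, text = item.strip().partition(":")
        pairs.append((code, text.strip()))
    return pairs
```

An item without a language prefix ends up with an empty text here; a real implementation would probably fall back to the lc= default in that case, as in option 1.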
Json field names
A related question is how languages should be encoded in the resulting JSON. For each language-dependent field the corresponding language must be known. There are three options:
- Adapt field names - add a suffix to a field name to indicate the language, e.g. name_fr. The drawback of this approach is that the relevant suffixes are not known beforehand. A list of possible languages in the JSON circumvents this, but it requires more extensive programming for the API user. (This approach is currently used.)
- Simple dictionary - encode a language-dependent field as a dictionary, e.g. { "en":"in english", "fr":"en français" }. This keeps the drawback that the field names are not known beforehand;
- Structured dictionary - any field that can be in a specific language can be encoded as a standard "language" dictionary: [ {"languageCode":"en", "languageValue":"in english"}, {"languageCode":"fr", "languageValue":"en français"} ]
This also applies to fields that will be in the interface language, as a translation is not always available.
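The three encodings carry the same information, which the following sketch tries to show by converting between them. The function names are hypothetical, and the structured form is rendered as a JSON array of objects, which is an assumption about the intended shape.

```python
def collect_suffixed(product: dict, base: str, languages: list) -> dict:
    """Option 1 -> option 2: gather name_en, name_fr, ... into a simple dictionary.

    The caller has to supply the candidate language codes, which is exactly
    the extra work the suffix approach puts on the API user.
    """
    return {
        lc: product[f"{base}_{lc}"]
        for lc in languages
        if f"{base}_{lc}" in product
    }


def to_structured(simple: dict) -> list:
    """Option 2 -> option 3: one entry per language with fixed key names."""
    return [
        {"languageCode": lc, "languageValue": text}
        for lc, text in simple.items()
    ]
```

With the structured form every key name is known in advance, so a typed decoder only needs one fixed record type for all language-dependent fields.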
Nutritional values
Units
The units used for each nutritional value often lead to confusion. This could be avoided by defining each nutritional item as a (value, unit) tuple.
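A (value, unit) tuple could be modelled as in the sketch below. The type name `NutrientAmount`, the conversion table and the set of units are assumptions for illustration; the page does not specify which units the API would accept.

```python
from typing import NamedTuple


class NutrientAmount(NamedTuple):
    """One nutritional item: a number together with the unit it is expressed in."""
    value: float
    unit: str


# Hypothetical conversion table; only a few mass units are covered.
GRAMS_PER_UNIT = {"g": 1.0, "mg": 0.001, "kg": 1000.0}


def to_grams(amount: NutrientAmount) -> float:
    """Convert an amount to grams, raising KeyError for an unknown unit."""
    return amount.value * GRAMS_PER_UNIT[amount.unit]
```

Because the unit travels with the value, a client never has to guess whether a number such as 500 means grams or milligrams.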
Product read API
call
result json
platform specific naming issues
- Swift: it should be as easy as possible to decode the JSON with the standard Swift libraries.
  - no "-" in field names, as hyphens are not allowed in Swift variable names.
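As a rough illustration of the hyphen issue, the sketch below rewrites the keys of a decoded JSON document so that none contains "-". It is written in Python only for illustration, the function name `without_hyphens` and the sample keys are hypothetical, and an API that never emits hyphens would make such a step unnecessary.

```python
def without_hyphens(data):
    """Recursively replace '-' with '_' in dictionary keys; values are left untouched."""
    if isinstance(data, dict):
        return {
            key.replace("-", "_"): without_hyphens(value)
            for key, value in data.items()
        }
    if isinstance(data, list):
        return [without_hyphens(item) for item in data]
    return data
```

Note that two keys differing only in "-" versus "_" would collide after renaming, which is one more reason to settle the naming on the API side.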
comments
- The JSON seems to have two user groups: OFF itself and end users (apps). Can the JSON be split into two sections that correspond to these groups as well? Not sure whether this is a good approach.
- The JSON can be more structured, so that it is more readable for a viewer, but also from a parsing and interpretation point of view. For example, everything that corresponds to packaging goes together.
- Improved field names, which clearly reflect the origin or processing of the data. This should reduce the number of questions around fields and their interpretation.
- Version 1 of the API used a language code suffix (e.g. _en) to indicate the language used for the strings. Which language codes to look at depended on the content of the JSON itself. This implied that the field names needed to be constructed, adding extra complexity to the JSON decoder. Another solution is to use a dictionary to encode these strings, like { "languageCode" : "en", "string": "stringValue" }.
Product write API
Images read API
Images write API
Robotoff read API
There is now minimal documentation (link). The documentation needs some more explanation.
call
result json
The result JSON contains fields that seem more useful to the internal workings of Robotoff than to the end user. Can these be split/structured in the JSON as well?
Robotoff write API
This API allows writing responses to insight questions provided by the Robotoff read API.
Search API
- See Open Food Facts Search API Version 2