Open Food Facts API Version 2

From Open Food Facts wiki

Introduction

Version 1 of the Open Food Facts API has been developed organically since 2012, often according to specific needs (in particular the OFF app), and it has been modeled on the structure of the MongoDB database (which itself has evolved organically) and on the Web site (e.g. reusing or mimicking existing CGI forms).

As a result, version 1 of the API is not as standard / simple / easy / powerful / intuitive / documented etc. as it could be.

This page is to discuss the design of a better version 2 of the API.

Desired properties wishlist

What standards should the new API follow, what overall properties should it have, etc.?

You can add what you want, and we can discuss it.

  • Documentation that matches the implementation
  • Non-breaking changes
  • A direct mapping to the MongoDB format
  • Consistent types for all fields - see https://github.com/openfoodfacts/openfoodfacts-server/issues/4389
  • Consistent call parameters across all APIs
  • Call and result independence - no need to know the call to interpret the result.
  • Multilingual support
  • In the results, make clear what is original, what is processed, and how.

Related Issues

Inspiration for a new API can be found in the issues with the API label.

Generic requirements

These requirements apply to all OFF APIs, whether read or write.

Multilingual support

Multilingual support is a confusing subject, because language plays a role in several places, and each can be different:

  • The language of the user's interface (application or computer);
  • The language desired by the user;
  • The main product language;
  • Other languages mentioned on the product.

Read parameters

Because presenting all possible languages is undesirable (the output would be too large), there must be a way to filter the output by setting parameters:

  • lc= could specify the language desired by the user. This could match their interface, the language they speak, the language they want to show to friends, etc. It should be independent of the actual product. The default is English. Any field that OFF can translate will be translated; otherwise the main language of the product is used. The output provided by OFF should make clear which language is provided.
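
The fallback rule described for lc= could be sketched as follows. This is only an illustration in Python, not the OFF implementation: the function name `resolve_field` and the input shape (a mapping from language code to text, plus the code of the product's main language) are assumptions made for the example.

```python
from typing import Dict, Tuple


def resolve_field(values: Dict[str, str], main_lc: str, lc: str = "en") -> Tuple[str, str]:
    """Return (language_code, text) for one translatable field.

    Hypothetical helper: prefer the requested language `lc`, otherwise
    fall back to the product's main language `main_lc`. The language that
    was actually used is returned alongside the text, so the caller can
    always tell which language it received.
    Assumes `values` contains an entry for `main_lc`.
    """
    if lc in values:
        return lc, values[lc]
    return main_lc, values[main_lc]


# Illustrative data: a product whose main language is French.
names = {"fr": "Confiture de fraises", "nl": "Aardbeienjam"}
print(resolve_field(names, main_lc="fr", lc="nl"))  # requested language is available
print(resolve_field(names, main_lc="fr"))           # default "en" is missing, falls back to "fr"
```

Returning the language code together with the text is one way to satisfy the requirement that the output makes clear which language is provided.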

It is also possible to limit the product languages provided, e.g. only products whose main language is Dutch, or products that include Dutch. This falls under either search or filtering.
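
The two kinds of language filter mentioned here (main language versus any language on the product) differ as in the sketch below. The record fields `main_lc` and `languages` are hypothetical names chosen for the example, not actual OFF fields, and the filtering is done client-side purely for illustration.

```python
from typing import Dict, List

# Hypothetical product records; "main_lc" and "languages" are
# illustrative field names only.
products: List[Dict] = [
    {"code": "1", "main_lc": "nl", "languages": ["nl", "fr"]},
    {"code": "2", "main_lc": "fr", "languages": ["fr", "nl"]},
    {"code": "3", "main_lc": "en", "languages": ["en"]},
]


def with_main_language(items: List[Dict], lc: str) -> List[Dict]:
    """Keep products whose main language is `lc`."""
    return [p for p in items if p["main_lc"] == lc]


def with_language(items: List[Dict], lc: str) -> List[Dict]:
    """Keep products that have `lc` among their languages."""
    return [p for p in items if lc in p["languages"]]


print([p["code"] for p in with_main_language(products, "nl")])  # ['1']
print([p["code"] for p in with_language(products, "nl")])       # ['1', '2']
```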

Json field names

A related question is how languages should be encoded in the resulting json. For each language-dependent field, the corresponding language must be known. There are three options:

  1. Adapt field names - add a suffix to the field name to indicate the language, e.g. name_fr. The drawback of this approach is that the relevant suffixes are not known beforehand. Including a list of the available languages in the json works around this, but it requires more extensive programming by the API user. (This approach is currently used.)
  2. Simple dictionary - encode a language-dependent field as a dictionary, e.g. { "en":"in english", "fr":"en français" }. The drawback remains that the key names are not known beforehand;
  3. Structured dictionary - any field that can be in a specific language is encoded as a list of standard "language" dictionaries: [ {"languageCode":"en", "languageValue":"in english"}, {"languageCode":"fr", "languageValue":"en français"} ]

This also applies to fields that are returned in the interface language, as a translation is not always available.
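
To make the trade-off between the three options concrete, the sketch below looks up the same French value in each encoding. The field names `name` and `languages` are illustrative, and option 3 is modeled here as a list of language dictionaries, which is one possible reading of the proposal.

```python
from typing import Optional

# The same product name in the three encodings discussed above.
suffixed = {"languages": ["en", "fr"], "name_en": "in english", "name_fr": "en français"}
simple = {"name": {"en": "in english", "fr": "en français"}}
structured = {"name": [
    {"languageCode": "en", "languageValue": "in english"},
    {"languageCode": "fr", "languageValue": "en français"},
]}


def from_suffixed(doc: dict, field: str, lc: str) -> Optional[str]:
    # Option 1: the key has to be constructed from the language code.
    return doc.get(f"{field}_{lc}")


def from_simple(doc: dict, field: str, lc: str) -> Optional[str]:
    # Option 2: fixed field name; the language codes are dictionary keys.
    return doc.get(field, {}).get(lc)


def from_structured(doc: dict, field: str, lc: str) -> Optional[str]:
    # Option 3: fixed field name and fixed keys; the language codes are values.
    for entry in doc.get(field, []):
        if entry["languageCode"] == lc:
            return entry["languageValue"]
    return None


for reader, doc in ((from_suffixed, suffixed), (from_simple, simple), (from_structured, structured)):
    print(reader.__name__, reader(doc, "name", "fr"))
```

Only option 3 has a schema in which every key is known in advance, which matters for decoders that map json keys onto statically declared fields.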

Write parameters

These parameters apply when writing a field in a specific language, for example when updating a product name in Spanish. This is especially an issue when multiple languages are involved. There are three options:

  1. Use a default set by another parameter (lc="en"), so that any language-dependent field is in English;
  2. Adapt the field names, i.e. specify ingredients_en, ingredients_es, as needed;
  3. Use explicit encoding for fields that can comprise multiple languages, e.g. labels="en:bottle, fr:bouteille à recycler, nl:glasbak".
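
As a rough sketch of what the three write options would look like as request parameters, the example below builds one parameter set per option and parses the explicit encoding of option 3. The helper `parse_tagged_values` and the sample ingredient values are assumptions for illustration; no actual OFF endpoint or parser is implied.

```python
def parse_tagged_values(raw: str) -> dict:
    """Split 'en:bottle, fr:bouteille' into {'en': 'bottle', 'fr': 'bouteille'}.

    Hypothetical parser for option 3. It assumes that the values themselves
    contain no commas and that every item starts with a language code
    followed by a colon.
    """
    result = {}
    for item in raw.split(","):
        lc, _, text = item.strip().partition(":")
        result[lc] = text
    return result


# Option 1: one default language applies to every language-dependent field.
option_1 = {"lc": "en", "ingredients": "water, sugar"}
# Option 2: the language is part of each field name.
option_2 = {"ingredients_en": "water, sugar", "ingredients_es": "agua, azúcar"}
# Option 3: the language is encoded inside the value itself.
option_3 = {"labels": "en:bottle, fr:bouteille à recycler, nl:glasbak"}

print(parse_tagged_values(option_3["labels"]))
```

Note that the comma-separated form of option 3 cannot carry values that contain commas (such as an ingredient list) without an additional escaping rule.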

Product read API

call

result json

field names

For the naming of the fields of the results json the following principles are used:

  • Typing - the expected type of a value is encoded in the field name, by a suffix:
    • _tags - expect a String array;
    • _n - expect an Integer;
    • _t - expect a time in seconds since ?, encoded as Integer;
    • _f - expect a float;
    • none - the default is a String
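
A client or test suite could check a result against this suffix convention along the lines of the sketch below. The function names and the sample field names are hypothetical; only the suffix-to-type mapping is taken from the list above.

```python
from typing import Any


def expected_type(field_name: str) -> type:
    """Map a field name to the Python type implied by its suffix."""
    if field_name.endswith("_tags"):
        return list   # String array
    if field_name.endswith("_n"):
        return int    # Integer
    if field_name.endswith("_t"):
        return int    # time in seconds, encoded as Integer
    if field_name.endswith("_f"):
        return float
    return str        # no suffix: String


def matches_convention(field_name: str, value: Any) -> bool:
    """Check one decoded json value against the type its field name implies."""
    expected = expected_type(field_name)
    if expected is int and isinstance(value, bool):
        return False  # bool is a subclass of int in Python; reject it
    if expected is list:
        return isinstance(value, list) and all(isinstance(v, str) for v in value)
    return isinstance(value, expected)


# Illustrative field names and values.
sample = {"labels_tags": ["en:organic"], "additives_n": 3, "created_t": 1600000000,
          "energy_f": 12.5, "product_name": "Jam"}
print({name: matches_convention(name, value) for name, value in sample.items()})
```

A json number without a fractional part decodes to `int` in Python, so a real check might want to accept integers for `_f` fields as well.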

platform specific naming issues

  • Swift: it should be as easy as possible to decode a json with the standard Swift-libraries.
    • no "-" in field names, as it is not allowed in Swift variable names;
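
One way to enforce the no-hyphen rule is to scan sample responses for offending keys. The checker below is an assumption-laden sketch in Python (the examples in this page are not tied to Swift tooling); the sample keys are made up for illustration.

```python
from typing import Any, Iterator


def keys_with_hyphen(node: Any, path: str = "") -> Iterator[str]:
    """Yield the path of every key containing '-' in a decoded json value."""
    if isinstance(node, dict):
        for key, value in node.items():
            here = f"{path}.{key}" if path else key
            if "-" in key:
                yield here
            # Descend into the value regardless, to catch nested keys.
            yield from keys_with_hyphen(value, here)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from keys_with_hyphen(item, f"{path}[{index}]")


# Illustrative document with three offending keys at different depths.
sample = {
    "energy-kcal": 10,
    "nutriments": {"fat_100g": 1, "saturated-fat": 2},
    "items": [{"a-b": 1}],
}
print(list(keys_with_hyphen(sample)))
# ['energy-kcal', 'nutriments.saturated-fat', 'items[0].a-b']
```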

comments

  • The json seems to have two user groups: OFF itself and end users (apps). Can the json be split into two sections that correspond to these groups? Not sure whether this is a good approach.
  • The json can be more structured, so that it is more readable for a viewer and easier to parse. E.g. everything that corresponds to packaging goes together.
  • Improved field names, which clearly reflect the origin or processing of the data. This should reduce the number of questions about fields and their interpretation.
  • Version 1 of the API used a language code suffix (e.g. _en) to indicate the language of a string. Which language codes to look for depended on the content of the json itself. This meant that field names had to be constructed, adding extra complexity to the json decoder. Another solution is to use a dictionary to encode these strings, like { "languageCode": "en", "string": "stringValue" }.
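
The last comment suggests replacing suffixed field names with language dictionaries. A conversion from the one form to the other might look like the sketch below; the function name, the sample fields, and the restriction to two-letter lowercase language codes are all assumptions of this example.

```python
import re


def collect_language_entries(doc: dict, field: str) -> list:
    """Gather `<field>_<lc>` keys into a list of language dictionaries.

    Assumes two-letter lowercase language codes. That is a simplification
    made for this sketch, not a statement about the codes OFF uses.
    """
    pattern = re.compile(rf"^{re.escape(field)}_([a-z]{{2}})$")
    entries = []
    for key, value in doc.items():
        match = pattern.match(key)
        if match:
            entries.append({"languageCode": match.group(1), "string": value})
    return entries


# "name_n" is deliberately included: it shares the prefix but is not a
# language variant, so it must not be picked up.
v1_style = {"name_en": "Strawberry jam", "name_fr": "Confiture de fraises", "name_n": 2}
print(collect_language_entries(v1_style, "name"))
```

The need for a pattern match to tell `name_fr` apart from `name_n` illustrates the decoding complexity that the comment attributes to the suffix approach.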

Product write API

Images read API

Images write API

Robotoff read API

There is now minimal documentation (link). It needs more explanation.

call

result json

The result json contains fields that seem more useful to the internal workings of Robotoff than to the end user. Can these be split/structured in the json as well?

Robotoff write API

This API allows writing responses to the insight questions provided by the Robotoff read API.

Search API