Metrics: Difference between revisions

From Open Food Facts wiki
[[File:Metrics-up.png|thumb]]
== Why ? ==
== What ? ==
== 🎯 Roadmap for metrics harvesting ==


== How ? ==
=== Real-life metrics ===
* Are people changing their behaviour thanks to Open Food Facts?
* Are producers improving their products thanks to the Producer Platform?


=== Matomo ===
* We run our own Matomo instance to avoid third-party analytics tools, which raise privacy concerns.
* Generic instrumentation is in place in Matomo, but Events instrumentation is still sparse (e.g. category addition, clicking on "Extract ingredients…").
* Our metrics are not yet correlated with product data (e.g. it is currently hard to tell exactly how many Nutri-Scores we displayed in a year).


=== Reporting dashboard ===
An InfluxDB/Grafana stack has been set up so that the Open Food Facts team can follow essential metrics. The dashboards can be viewed at https://metrics.openfoodfacts.org.


A read-only account is available; the credentials can be found on Slack.


Metrics are distinct from monitoring (https://monitoring.openfoodfacts.org): no infrastructure-related data is stored on metrics.openfoodfacts.org.
 
There is a single [https://en.wikipedia.org/wiki/InfluxDB InfluxDB] bucket (similar to an SQL database), ''<code>off_metrics</code>''. It holds two measurements (similar to SQL tables): ''<code>facets</code>'' and ''<code>insights</code>''.
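
As an illustration, a Flux query against this bucket could be assembled as shown here. This is a minimal sketch: it assumes the instance runs InfluxDB 2.x with Flux enabled (suggested by the use of buckets, but not confirmed on this page), and the helper name <code>build_flux_query</code> and the seven-day default are hypothetical. Only the bucket and measurement names come from this page.

```python
def build_flux_query(measurement: str, days: int = 7, bucket: str = "off_metrics") -> str:
    """Return a Flux query selecting recent points from one measurement.

    Hypothetical helper: only the bucket name and the two measurement
    names are taken from this page; everything else is an assumption.
    """
    if measurement not in ("facets", "insights"):
        raise ValueError(f"unknown measurement: {measurement!r}")
    if days <= 0:
        raise ValueError("days must be positive")
    return "\n".join([
        f'from(bucket: "{bucket}")',
        f"  |> range(start: -{days}d)",
        f'  |> filter(fn: (r) => r._measurement == "{measurement}")',
    ])


# Print the query text; it can then be pasted into a Flux-capable query editor.
print(build_flux_query("insights"))
```

The function only builds the query text, so it can be checked without network access or credentials.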
 
==== <code>insights</code> ====
 
Stores metrics about the Robotoff <code>product_insight</code> PostgreSQL table. Robotoff performs the export every night.
 
Columns:
* <code>annotation</code>: annotation status of the insight, either 0, -1, 1 or <nil>.
* <code>automatic_processing</code>: whether the insight will be (or has been) processed automatically, either "True" or "False".
* <code>predictor</code>: the predictor of the insight, i.e. the model or method that generated it.
* <code>reserved_barcode</code>: either "True" or "False"; if "True", the product barcode is a reserved barcode, mostly used for variable-weight products.
* <code>type</code>: the insight type; see the [https://github.com/openfoodfacts/robotoff/blob/9598d29cc2a23b802ef6eff1c7c154607a6ccc93/robotoff/types.py#L51 InsightType] class in the Robotoff codebase for a complete list.
* <code>count</code>: the number of insights with the characteristics above.
* <code>percent</code>: the percentage of insights with the characteristics above, over the total number of insights.
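
To make <code>count</code> and <code>percent</code> concrete, the sketch below aggregates a handful of made-up insight rows in the way the columns above describe. It is an illustration only, not Robotoff's export code: the function name, the predictor value and the sample rows are all hypothetical.

```python
from collections import Counter

# The characteristics that define one group of insights.
KEYS = ("annotation", "automatic_processing", "predictor", "reserved_barcode", "type")


def aggregate_insights(rows):
    """Group rows by their characteristics and compute count and percent."""
    counts = Counter(tuple(row[key] for key in KEYS) for row in rows)
    total = sum(counts.values())
    return [
        {**dict(zip(KEYS, group)), "count": count, "percent": 100 * count / total}
        for group, count in counts.items()
    ]


# Made-up sample rows (not real Robotoff data).
annotated = {"annotation": 1, "automatic_processing": "True",
             "predictor": "hypothetical-model", "reserved_barcode": "False",
             "type": "label"}
pending = {"annotation": None, "automatic_processing": "False",
           "predictor": "hypothetical-model", "reserved_barcode": "False",
           "type": "category"}
rows = [annotated, annotated, annotated, pending]

for point in aggregate_insights(rows):
    print(point["type"], point["count"], point["percent"])
```

Each resulting dictionary corresponds to one datapoint: the five characteristics plus the <code>count</code> and <code>percent</code> values for that combination.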
 
==== <code>facets</code> ====
Stores metrics about Product Opener facets, collected through the public facet API. Robotoff performs the export every night by calling the Product Opener API.
 
Currently saved facets:
 
* <code>ingredients-analysis</code>
* <code>data-quality</code>
* <code>ingredients</code>
* <code>states</code>
* <code>misc</code>


Columns:
* <code>country</code>: the two-letter ISO code of the country (ISO 3166-1 alpha-2), or "world" for metrics on the full database. Only a selected subset of countries is available.
* <code>facet</code>: the name of the facet, in lower case with '-' replaced by '_' (e.g. <code>data_quality</code> instead of <code>data-quality</code>).
* <code>tag_name</code>: name of the tag.
* <code>tag_id</code>: identifier of the tag (e.g. <code>en:alcoholic-beverages-category-without-alcohol-value</code>); this is the field you will probably need to use.
* <code>products</code>: number of products with the given tag.
* <code>percent</code>: percentage of products with the given tag, over the total number of products for the country.
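
The <code>facet</code> naming rule and the <code>percent</code> definition for this measurement can be sketched in a few lines. The function names and the figures are hypothetical, and this is not the exporter's actual code.

```python
def facet_column_value(facet_name: str) -> str:
    """Lower-case the facet name and replace '-' with '_'."""
    return facet_name.lower().replace("-", "_")


def tag_percent(products_with_tag: int, total_products: int) -> float:
    """Share of a country's products carrying a tag, as a percentage."""
    if total_products <= 0:
        raise ValueError("total_products must be positive")
    return 100 * products_with_tag / total_products


# Facet names as listed on this page.
SAVED_FACETS = ["ingredients-analysis", "data-quality", "ingredients", "states", "misc"]

print([facet_column_value(name) for name in SAVED_FACETS])
print(tag_percent(250, 1000))  # made-up figures, not real data
```

When filtering on <code>facet</code>, use the converted form (for example <code>ingredients_analysis</code>), not the hyphenated facet name.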
 
 
== Relevant links ==
* https://github.com/openfoodfacts/recipe-estimator-metrics
* https://github.com/openfoodfacts/openfoodfacts-metrics/issues
* [[Ingredients_Analysis_Quality]]
 
[[Category:Metrics]]

Latest revision as of 11:38, 18 August 2024
