
Ingredients Analysis Quality Evaluation - August 2023: Difference between revisions

(7 intermediate revisions by 3 users not shown)
Line 8: Line 8:


* The input ingredient list is incorrect (misspellings etc.)
* The input ingredient list is incorrect (misspellings etc.)
* The ingredients list contains other things than ingredients and we have not been able to split it at the right place.
* The ingredients list contains other things than ingredients, and we have not been able to split it at the right place.
* We have not been able to parse a particular sentence structure or formatting (e.g. "a drop of delicious honey with a shower of powder sugar")
* We have not been able to parse a particular sentence structure or formatting (e.g., "a drop of delicious honey with a shower of powder sugar")
* The ingredient is not yet present in our taxonomy (or not with the right synonym or in the right language)
* The ingredient is not yet present in our taxonomy (or not with the right synonym or in the right language)


In the table below,
In the table below,
* the first column of numbers corresponds to the '''number of unique ingredients''' across all products.
* the first column of numbers corresponds to the '''number of unique ingredients''' across all products.
* and the second column of numbers corresponds to the '''number of occurrences''' of those ingredients, e.g. if a specific ingredient appears in 5 products, it is counted 5 times.
* and the second column of numbers corresponds to the '''number of occurrences''' of those ingredients, e.g., if a specific ingredient appears in 5 products, it is counted 5 times.
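
The difference between the two columns can be illustrated with a minimal Python sketch. The product names and ingredient tags below are made-up sample data, and counting an ingredient only once per product (even if it is repeated within that product's list) is an assumption, not something this page specifies.

```python
from collections import Counter

# Hypothetical sample data: each product maps to its parsed ingredient tags.
products = {
    "product-a": ["en:sugar", "en:water", "en:honey"],
    "product-b": ["en:sugar", "en:water"],
    "product-c": ["en:sugar", "xx:shower-of-powder-sugar"],
}

def count_ingredients(products):
    """Return (unique, occurrences) for a mapping of product -> ingredient tags."""
    counts = Counter()
    for tags in products.values():
        # Assumption: an ingredient repeated inside one product counts once for it.
        counts.update(set(tags))
    # unique = number of distinct tags; occurrences = one count per product per tag.
    return len(counts), sum(counts.values())

unique, occurrences = count_ingredients(products)
print(unique, occurrences)  # 4 7
```

Here "en:sugar" is one unique ingredient but contributes three occurrences, because it appears in three products.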




Line 56: Line 56:
{| class="wikitable"
{| class="wikitable"
|-
|-
!  !! 2023-07-30 || 2022-07-31
!  !! 2023-07-30 || 2023-07-31 || 2023-09-20
|-
|-
| [https://hr.openfoodfacts.org/ingredients?stats=1&no_cache=1 All products:] ||
| [https://hr.openfoodfacts.org/ingredients?stats=1&no_cache=1 All products:] ||
Line 75: Line 75:
unknown 4057 (77.22%) 4735 (15.31%)
unknown 4057 (77.22%) 4735 (15.31%)
all 5254 (100.00%) 30929 (100.00%)
all 5254 (100.00%) 30929 (100.00%)
</pre>
||
<pre>
3420 sastojci:
Type Unique tags Occurrences
known 1413 (41.32%) 29483 (92.97%)
unknown 2007 (58.68%) 2229 (7.03%)
all 3420 (100.00%) 31712 (100.00%)
</pre>
</pre>
|-
|-
Line 82: Line 91:
</pre>
</pre>
||
||


|}
|}
Line 89: Line 99:
{| class="wikitable"
{| class="wikitable"
|-
|-
!  !! 2023-08-15 || 2022-08-16
!  !! 2022-08-16 !! 2023-08-15 !! 2023-12-01
|-
|-
| [https://pl.openfoodfacts.org/ingredients?stats=1&no_cache=1 All products:] ||
| [https://pl.openfoodfacts.org/ingredients?stats=1&no_cache=1 All products:]  
||
<pre>
14121 składniki:
 
Type Unique tags Occurrences
known 1918 (13.58%) 127042 (89.13%)
unknown 12203 (86.42%) 15488 (10.87%)
all 14121 (100.00%) 142530 (100.00%)
</pre>||
<pre>
<pre>
16160 składniki:
16160 składniki:
Line 100: Line 120:
all 16160 (100.00%) 136980 (100.00%)
all 16160 (100.00%) 136980 (100.00%)
</pre>
</pre>
||
|<pre>
16478 składniki:
 
Type Unique tags Occurrences
known 1957 (11.88%) 135344 (87.87%)
unknown 14521 (88.12%) 18689 (12.13%)
all 16478 (100.00%) 154033 (100.00%)
</pre>
|-
| [https://pl.openfoodfacts.org/popularity/top-10000-pl-scans-2022/ingredients?stats=1&no_cache=1 Top 10k most scanned products:]
||
<pre>
<pre>
14121 składniki:
3627 składniki:


Type Unique tags Occurrences
Type Unique tags Occurrences
known 1918 (13.58%) 127042 (89.13%)
known 1290 (35.57%) 34703 (92.74%)
unknown 12203 (86.42%) 15488 (10.87%)
unknown 2337 (64.43%) 2718 (7.26%)
all 14121 (100.00%) 142530 (100.00%)
all 3627 (100.00%) 37421 (100.00%)
</pre>
</pre>||  
|-
| [https://pl.openfoodfacts.org/popularity/top-10000-pl-scans-2022/ingredients?stats=1&no_cache=1 Top 10k most scanned products:] ||  
<pre>
<pre>
4406 składniki:
4406 składniki:
Line 118: Line 146:
unknown 3110 (70.59%) 4189 (11.70%)
unknown 3110 (70.59%) 4189 (11.70%)
all 4406 (100.00%) 35794 (100.00%)
all 4406 (100.00%) 35794 (100.00%)
</pre>
|<pre>
3809 składniki:
Type Unique tags Occurrences
known 1309 (34.37%) 35540 (92.39%)
unknown 2500 (65.63%) 2926 (7.61%)
all 3809 (100.00%) 38466 (100.00%)
</pre>
|}
=== UK + English ===
{| class="wikitable"
|-
!  !! 2023-09-14 !! 2023-12-01
|-
| [https://uk.openfoodfacts.org/ingredients?stats=1&no_cache=1 All products:] ||
<pre>
60330 ingredients:
Type Unique tags Occurrences
known 2840 (4.71%) 649282 (88.62%)
unknown 57490 (95.29%) 83403 (11.38%)
all 60330 (100.00%) 732685 (100.00%)
</pre>
||
<pre>
63758 ingredients:
Type Unique tags Occurrences
known 2916 (4.57%) 723156 (89.04%)
unknown 60842 (95.43%) 89015 (10.96%)
all 63758 (100.00%) 812171 (100.00%)
</pre>
|-
| [https://uk.openfoodfacts.org/popularity/top-10000-gb-scans-2022/ingredients?stats=1&no_cache=1 Top 10k most scanned products:] ||
<pre>
7243 ingredients:
Type Unique tags Occurrences
known 1728 (23.86%) 79991 (92.23%)
unknown 5515 (76.14%) 6740 (7.77%)
all 7243 (100.00%) 86731 (100.00%)
</pre>
</pre>
||
||
3627 składniki:
<pre>
7223 ingredients:


Type Unique tags Occurrences
Type Unique tags Occurrences
known 1290 (35.57%) 34703 (92.74%)
known 1760 (24.37%) 83819 (92.64%)
unknown 2337 (64.43%) 2718 (7.26%)
unknown 5463 (75.63%) 6661 (7.36%)
all 3627 (100.00%) 37421 (100.00%)
all 7223 (100.00%) 90480 (100.00%)</pre>
|}
|}
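
The percentages in the statistics blocks above can be cross-checked with a short script. This is only a sketch: the helper names <code>parse_stats</code> and <code>shares</code> are hypothetical, and the row layout is assumed from the blocks pasted on this page, not from any documented export format.

```python
import re

# Matches rows such as "known 1760 (24.37%) 83819 (92.64%)".
ROW = re.compile(
    r"^(known|unknown|all)\s+(\d+)\s+\(([\d.]+)%\)\s+(\d+)\s+\(([\d.]+)%\)"
)

def parse_stats(block):
    """Return {type: (unique_tags, occurrences)} for each row found in the block."""
    stats = {}
    for line in block.splitlines():
        m = ROW.match(line.strip())
        if m:
            stats[m.group(1)] = (int(m.group(2)), int(m.group(4)))
    return stats

def shares(stats, kind):
    """Share of unique tags and of occurrences for one row type, in percent."""
    total_unique, total_occ = stats["all"]
    unique, occ = stats[kind]
    return round(100 * unique / total_unique, 2), round(100 * occ / total_occ, 2)

# Block copied from the UK top 10k column dated 2023-12-01.
block = """7223 ingredients:
Type Unique tags Occurrences
known 1760 (24.37%) 83819 (92.64%)
unknown 5463 (75.63%) 6661 (7.36%)
all 7223 (100.00%) 90480 (100.00%)"""

stats = parse_stats(block)
print(shares(stats, "known"))  # (24.37, 92.64)
```

The same check also confirms that the known and unknown rows add up to the "all" row, which is a quick way to spot a copy-paste error in a column.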
=== Observations ===
=== Observations ===


Line 133: Line 206:
[[Category:Project:Personalized_Search]]
[[Category:Project:Personalized_Search]]
[[Category:Data quality]]
[[Category:Data quality]]
[[Category:Ingredients]]
[[Category:Metrics]]