Ingredients Analysis Quality Evaluation - August 2023
Revision as of 08:25, 31 July 2023
Evaluation of the quality of Ingredients Extraction and Analysis so that we can measure the improvements.
Ingredients parsing and recognition
For each product, the ingredients list is parsed to separate each ingredient. Ingredients that we can match to our multilingual ingredients taxonomy are marked as "known", the others as "unknown".
There are different reasons an ingredient can be marked as unknown:
- The input ingredient list is incorrect (misspellings etc.)
- The ingredients list contains text other than ingredients, and we have not been able to split it at the right place.
- We have not been able to parse a particular sentence structure or formatting (e.g. "a drop of delicious honey with a shower of powder sugar")
- The ingredient is not yet present in our taxonomy (or not with the right synonym or in the right language)
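As a rough illustration of the known/unknown split described above, the sketch below splits an ingredients list on commas and looks each piece up in a small dictionary. The `TAXONOMY` dictionary, the `classify` function and the comma-only splitting are assumptions made for this example; the actual parser and taxonomy are far more elaborate.

```python
# Hypothetical, tiny stand-in for the multilingual ingredients taxonomy:
# maps a normalized synonym to a canonical tag.
TAXONOMY = {
    "sugar": "en:sugar",
    "honey": "en:honey",
    "wheat flour": "en:wheat-flour",
}


def classify(ingredients_text):
    """Split a list on commas and mark each ingredient known or unknown."""
    results = []
    for raw in ingredients_text.split(","):
        name = raw.strip().lower()
        if not name:
            continue  # skip empty pieces such as a trailing comma
        status = "known" if name in TAXONOMY else "unknown"
        results.append((name, status))
    return results


for name, status in classify("Wheat flour, sugar, a drop of delicious honey"):
    print(f"{status:8} {name}")
```

In this toy version, "a drop of delicious honey" comes out as unknown because the whole phrase is looked up as a single ingredient, which mirrors the sentence-structure case in the list above.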
In the tables below:
- The first column of numbers is the number of unique ingredients across all products.
- The second column of numbers is the number of occurrences of those ingredients, e.g. if a specific ingredient appears in 5 products, it is counted 5 times.
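The difference between the two columns can be sketched in a few lines of Python. This is only an illustration of the counting rule, not the script used to produce the figures; the `summarize` and `fmt` helpers and the input shape (one list of `(tag, is_known)` pairs per product) are assumptions.

```python
from collections import Counter


def summarize(products):
    """Count unique tags and total occurrences, split by known/unknown.

    products: one list of (tag, is_known) pairs per product (assumed shape).
    Returns {"known": (unique, occurrences), "unknown": ..., "all": ...}.
    """
    occurrences = Counter()
    known = {}
    for ingredients in products:
        for tag, is_known in ingredients:
            occurrences[tag] += 1  # counted once per appearance
            known[tag] = is_known

    rows = {}
    for label, wanted in (("known", True), ("unknown", False)):
        tags = [t for t in occurrences if known[t] is wanted]
        rows[label] = (len(tags), sum(occurrences[t] for t in tags))
    rows["all"] = (len(occurrences), sum(occurrences.values()))
    return rows


def fmt(count, total):
    """Format a count with its share of the total, e.g. '2 (66.67%)'."""
    return f"{count} ({100 * count / total:.2f}%)"


products = [
    [("sugar", True), ("honey", True)],
    [("sugar", True), ("xyz", False)],
]
rows = summarize(products)
total_unique, total_occurrences = rows["all"]
for label, (unique, occ) in rows.items():
    print(label, fmt(unique, total_unique), fmt(occ, total_occurrences))
```

Here "sugar" appears in two products, so it contributes one unique tag but two occurrences.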
JP - Japanese

All products, 2023-07-30 (2822 原材料):

Type | Unique tags | Occurrences |
---|---|---|
known | 726 (25.73%) | 7191 (69.84%) |
unknown | 2096 (74.27%) | 3105 (30.16%) |
all | 2822 (100.00%) | 10296 (100.00%) |

All products, 2022-07-31 (2593 原材料):

Type | Unique tags | Occurrences |
---|---|---|
known | 766 (29.54%) | 8466 (77.51%) |
unknown | 1827 (70.46%) | 2456 (22.49%) |
all | 2593 (100.00%) | 10922 (100.00%) |

Top 10k most scanned products, 2023-07-30 (16 原材料):

Type | Unique tags | Occurrences |
---|---|---|
known | 15 (93.75%) | 15 (93.75%) |
unknown | 1 (6.25%) | 1 (6.25%) |
all | 16 (100.00%) | 16 (100.00%) |

Top 10k most scanned products, 2022-07-31 (16 原材料):

Type | Unique tags | Occurrences |
---|---|---|
known | 15 (93.75%) | 15 (93.75%) |
unknown | 1 (6.25%) | 1 (6.25%) |
all | 16 (100.00%) | 16 (100.00%) |
HR - Croatian

All products, 2023-07-30 (5700 sastojci):

Type | Unique tags | Occurrences |
---|---|---|
known | 1304 (22.88%) | 25105 (83.02%) |
unknown | 4396 (77.12%) | 5134 (16.98%) |
all | 5700 (100.00%) | 30239 (100.00%) |

All products, 2022-07-31 (5254 sastojci):

Type | Unique tags | Occurrences |
---|---|---|
known | 1197 (22.78%) | 26194 (84.69%) |
unknown | 4057 (77.22%) | 4735 (15.31%) |
all | 5254 (100.00%) | 30929 (100.00%) |

Top 10k most scanned products, 2023-07-30 (45 sastojci):

Type | Unique tags | Occurrences |
---|---|---|
known | 36 (80.00%) | 46 (83.64%) |
unknown | 9 (20.00%) | 9 (16.36%) |
all | 45 (100.00%) | 55 (100.00%) |

Top 10k most scanned products, 2022-07-31 (45 sastojci):

Type | Unique tags | Occurrences |
---|---|---|
known | 36 (80.00%) | 46 (83.64%) |
unknown | 9 (20.00%) | 9 (16.36%) |
all | 45 (100.00%) | 55 (100.00%) |