Line 1: | Line 1: | ||
This project aims to investigate all non-standard codes, usually machine-printed on the packaging, such as batch numbers and quality numbers, and to extract as much information from them as possible. | This project aims to investigate packaging codes and all non-standard codes, usually machine-printed on the packaging, such as batch numbers and quality numbers, and to extract as much information from them as possible. | ||
==Codes== | |||
=== EMB Codes === | === EMB Codes === | ||
Line 51: | Line 51: | ||
**There is a general csv file with the general link to the data repository for each country: https://github.com/openfoodfacts/eu-food-data/blob/master/list-eu-and-partner-countries.csv | **There is a general csv file with the general link to the data repository for each country: https://github.com/openfoodfacts/eu-food-data/blob/master/list-eu-and-partner-countries.csv | ||
**There are individual folders for each target country (specific url-list for each European Agreement Section and data files) | **There are individual folders for each target country (specific url-list for each European Agreement Section and data files) | ||
* A Google Sheet document is used to map all files available in the target countries. It also maps the section name for every country in its own language (or its English translation) to the related European Section, which is used as a general taxonomy. | * A Google Sheet document is used to map all files available in the target countries. It also maps the section name for every country in its own language (or its English translation) to the related European Section, which is used as a general taxonomy. This Google Sheet can be found here: https://docs.google.com/spreadsheets/d/1egdo58Ds8PNi5G_4F2UtWOWC1V0k3tXBgPhZXs5FRqM/edit?usp=sharing | ||
This Google Sheet can be found here: https://docs.google.com/spreadsheets/d/1egdo58Ds8PNi5G_4F2UtWOWC1V0k3tXBgPhZXs5FRqM/edit?usp=sharing | *We can usually find txt or csv files, but for some countries the data is only available as PDF. Those files need OCR processing before data extraction. | ||
===France=== | ===France=== | ||
*I just built a script that takes all the French agreement info from the Agriculture Ministry and concatenates it into one file. The next step is to do the same for the UK. The step after that is to cleverly aggregate the duplicates (some companies have several health agreements under the same agreement number) | *I just built a script that takes all the French agreement info from the Agriculture Ministry and concatenates it into one file. The next step is to do the same for the UK. The step after that is to cleverly aggregate the duplicates (some companies have several health agreements under the same agreement number) |
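The concatenate-then-deduplicate steps described for France could be sketched roughly as follows. This is only an illustration, not the actual script: the column names (<code>agreement_number</code>, <code>company</code>, <code>section</code>), the helper names and the sample rows are all assumptions made up for the example, and the real ministry files will very likely use different headers and encodings.

```python
import csv
import io


def concatenate_rows(files):
    """Yield every row (as a dict) from several open CSV text handles, in order."""
    for handle in files:
        yield from csv.DictReader(handle)


def aggregate_by_agreement(rows, key="agreement_number", merge_field="section"):
    """Merge rows sharing the same agreement number.

    The first row seen for a number is kept; the values of `merge_field`
    from later duplicates are collected into a list without repeats.
    """
    merged = {}
    for row in rows:
        number = row[key].strip()
        if number not in merged:
            entry = dict(row)
            entry[key] = number
            entry[merge_field] = [row[merge_field]]
            merged[number] = entry
        elif row[merge_field] not in merged[number][merge_field]:
            merged[number][merge_field].append(row[merge_field])
    return list(merged.values())


# Invented sample data standing in for two downloaded files (hypothetical values).
file_a = io.StringIO(
    "agreement_number,company,section\n"
    "FR 01.001.001 CE,Example Dairy,Section IX\n"
    "FR 02.002.002 CE,Example Meats,Section I\n"
)
file_b = io.StringIO(
    "agreement_number,company,section\n"
    "FR 01.001.001 CE,Example Dairy,Section VIII\n"
)

merged = aggregate_by_agreement(concatenate_rows([file_a, file_b]))
for entry in merged:
    print(entry["agreement_number"], entry["company"], entry["section"])
```

Keeping the sections as a list per agreement number is one possible way to handle companies holding several agreements under the same number; other choices (for example one row per company and section pair) may fit the data better.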