(→CSV daily export: adding about importing CSV to SQLite)
(Add jsonl export and examples with jq)
Line 71:
==== R stat ====
For people who have R stat skills, there are [https://www.kaggle.com/openfoodfacts/world-food-facts/kernels?sortBy=hotness&group=everyone&pageSize=20&datasetId=20&language=R more than 50 notebooks from Kaggle community].
=== jsonl export ===
The jsonl export is a huge file! It cannot be opened with common editors or common tools. However, some command-line tools can still do interesting things with it, such as [https://stedolan.github.io/jq/manual/v1.6/ jq].
==== jq ====
* start by decompressing the file (be careful: about 17GB after decompression):
$ gunzip openfoodfacts-products.jsonl.gz
* work on a small subset to test, e.g. the first 100 products:
$ head -n 100 openfoodfacts-products.jsonl > small.jsonl
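If disk space is tight, the two steps above can probably be combined so that the 17GB file is never written to disk. The following is only a sketch: <code>first_products</code> is a hypothetical helper name, not something provided by Open Food Facts. It relies on <code>gunzip -c</code>, which writes the decompressed data to standard output and leaves the archive untouched.

```shell
# Hypothetical helper: print the first N products of the compressed export
# without decompressing the whole archive to disk.
# usage: first_products ARCHIVE N
first_products() {
    # gunzip -c streams the decompressed data to stdout;
    # head stops reading after N lines, which also stops gunzip.
    gunzip -c "$1" | head -n "$2"
}
```

For example, <code>first_products openfoodfacts-products.jsonl.gz 100 > small.jsonl</code> should produce the same small.jsonl as above.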
You can now start playing with jq. Here are some examples:
$ jq . small.jsonl # pretty-print the whole file as JSON
$ jq -r .code small.jsonl # print the code of every product
$ jq -r '[.code,.product_name] | @csv' small.jsonl # output CSV rows containing code,product_name
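Going one step further than the examples above, jq can also filter products and add a header row to the CSV output. This is a minimal sketch that assumes jq 1.5 or later (needed for <code>inputs</code>) and uses only the <code>code</code> and <code>product_name</code> fields already shown; the function names <code>named_products</code> and <code>products_csv</code> are made up for illustration.

```shell
# Hypothetical helper: keep only products with a non-empty product_name and
# reduce each one to two fields. Output is one compact JSON object per line.
named_products() {
    # (.product_name // "") turns a missing or null name into "".
    jq -c 'select((.product_name // "") != "") | {code, product_name}' "$@"
}

# Hypothetical helper: CSV with a single header row.
# -n stops jq from running the filter once per input line; "inputs" then
# reads every product explicitly, so the header is printed only once.
products_csv() {
    jq -rn '["code","product_name"], (inputs | [.code, .product_name]) | @csv' "$@"
}
```

For example, <code>products_csv small.jsonl > small.csv</code> writes the CSV to a file, and <code>named_products small.jsonl | wc -l</code> counts how many of the 100 products have a name.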