Reusing Open Food Facts Data: Difference between revisions

760 bytes added , 14 May 2020

Add jsonl export and examples with jq

@@ Line 71: / Line 71: @@
 ==== R stat ====
 For people who have R stat skills, there are [https://www.kaggle.com/openfoodfacts/world-food-facts/kernels?sortBy=hotness&group=everyone&pageSize=20&datasetId=20&language=R more than 50 notebooks from the Kaggle community].
+=== jsonl export ===
+The jsonl export is a huge file! It cannot be opened with common editors or handled by common tools. However, some command-line tools make interesting things possible, such as [https://stedolan.github.io/jq/manual/v1.6/ jq].
+==== jq ====
+* start by decompressing the file (be careful: 17GB after decompression):
+ $ gunzip openfoodfacts-products.jsonl.gz
+* work on a small subset to test. E.g. for 100 products:
+ $ head -n 100 openfoodfacts-products.jsonl > small.jsonl
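If disk space is tight, a subset can also be taken without decompressing the whole export first. The following is a minimal sketch, assuming <code>gzip</code>, <code>gunzip -c</code> and <code>head</code> are available. It builds a tiny stand-in archive so the commands can be tried without the real file: the names <code>sample.jsonl.gz</code> and <code>small-sample.jsonl</code> and the three records are invented for illustration.

```shell
# Build a tiny stand-in archive; these three records are invented for illustration.
printf '%s\n' \
  '{"code":"0001","product_name":"Sample A"}' \
  '{"code":"0002","product_name":"Sample B"}' \
  '{"code":"0003","product_name":"Sample C"}' | gzip > sample.jsonl.gz

# Decompress to stdout and keep only the first 2 lines;
# the fully decompressed data is never written to disk.
gunzip -c sample.jsonl.gz | head -n 2 > small-sample.jsonl
```

With the real export, the same pipeline would read <code>openfoodfacts-products.jsonl.gz</code> and use <code>head -n 100</code>, as in the step above.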
+You can now start playing with jq. Here are some examples.
+ $ cat small.jsonl | jq . # pretty-print the whole file as JSON
+ $ cat small.jsonl | jq -r .code # print the code of every product
+ $ cat small.jsonl | jq -r '[.code,.product_name] | @csv' # output CSV rows containing code,product_name
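Some products may lack a <code>product_name</code>, in which case the CSV row has an empty second column. As a hedged sketch, jq's <code>select</code> filter can drop such records before the CSV is built. The sample file <code>products-sample.jsonl</code>, its three records and the output name <code>named.csv</code> are invented for illustration; only the fields <code>code</code> and <code>product_name</code> come from the examples above.

```shell
# Create a small sample file; the records are invented, and the third has no product_name.
printf '%s\n' \
  '{"code":"0001","product_name":"Sample A"}' \
  '{"code":"0002","product_name":"Sample B"}' \
  '{"code":"0003"}' > products-sample.jsonl

# Keep only records whose product_name is present, then emit one CSV row per record.
jq -r 'select(.product_name != null) | [.code, .product_name] | @csv' \
  products-sample.jsonl > named.csv
```

The same filter should work unchanged on <code>small.jsonl</code>; redirecting the output to a file, as here, is what actually produces a CSV file rather than rows on the terminal.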