Johnliquid
Johnliquid

Reputation: 423

Converting JSON to .csv

I've found some data that someone is downloading into a JSON file (I think! - I'm a newb!). The file contains data on nearly 600 football players.

Here's the file: https://raw.githubusercontent.com/llimllib/fantasypl_stats/f944410c21f90e7c5897cd60ecca5dc72b5ab619/data/players.1426687570.json

Is there a way I can grab some of the data and convert it to .csv? Specifically the 'Fixture History'?

Thanks in advance for any help :)

Upvotes: 0

Views: 20707

Answers (3)

jq170727
jq170727

Reputation: 14665

Here is a solution using jq

If the file filter.jq contains

  .[]
| {first_name, second_name, all:.fixture_history.all[]}
| [.first_name, .second_name, .all[]]
| @csv

and data.json contains the sample data then the command

jq -M -r -f filter.jq data.json 

will produce the output (note only 10 rows shown here)

"Wojciech","Szczesny","16 Aug 17:30",1,"CRY(H) 2-1",90,0,0,0,1,0,0,0,0,0,1,0,13,7,0,55,2
"Wojciech","Szczesny","23 Aug 17:30",2,"EVE(A) 2-2",90,0,0,0,2,0,0,0,0,0,0,0,5,9,-9306,55,1
"Wojciech","Szczesny","31 Aug 16:00",3,"LEI(A) 1-1",90,0,0,0,1,0,0,0,1,0,2,0,7,15,-20971,55,1
"Wojciech","Szczesny","13 Sep 12:45",4,"MCI(H) 2-2",90,0,0,0,2,0,0,0,0,0,6,0,12,17,-39686,55,3
"Wojciech","Szczesny","20 Sep 15:00",5,"AVL(A) 3-0",90,0,0,1,0,0,0,0,0,0,2,0,14,22,-15931,55,6
"Wojciech","Szczesny","27 Sep 17:30",6,"TOT(H) 1-1",90,0,0,0,1,0,0,0,0,0,4,0,10,13,-5389,55,3
"Wojciech","Szczesny","05 Oct 14:05",7,"CHE(A) 0-2",90,0,0,0,2,0,0,0,0,0,1,0,3,9,-8654,55,1
"Wojciech","Szczesny","18 Oct 15:00",8,"HUL(H) 2-2",90,0,0,0,2,0,0,0,0,0,2,0,7,9,-824,54,1
"Wojciech","Szczesny","25 Oct 15:00",9,"SUN(A) 2-0",90,0,0,1,0,0,0,0,0,0,3,0,16,22,-11582,54,7

Upvotes: 6

brennan
brennan

Reputation: 61

Python has some good libraries for doing this. If you copy the following code into a file and save it as fix_hist.py or something, then save your JSON file as file.json in the same directory, it will create a csv file with the fixture histories each saved as a row. Just run python fix_hist.py in your command prompt (or terminal for mac):

import csv
import json

json_data = open("file.json")
data = json.load(json_data)

f = csv.writer(open("fix_hists.csv","wb+"))

for i in data:
    fh = data[i]["fixture_history"]
    array = fh["all"]
    for j in array:
        f.writerow(j)

json_data.close()

To add additional data to the fixture history, you can add insert statements before writing the rows:

import csv
import json

json_data = open("file.json")
data = json.load(json_data)

f = csv.writer(open("fix_hists.csv","wb+"))

arr = []

for i in data:
    fh = data[i]["fixture_history"]
    array = fh["all"]
    for j in array:

        try:
            j.insert(0,str(data[i]["first_name"]))
        except:
            j.insert(0,'error')

        try:
            j.insert(1,data[i]["web_name"])
        except:
            j.insert(1,'error')

        try:
            f.writerow(j)
        except:
            f.writerow(['error','error'])

json_data.close()

With insert(), just indicate the position in the row you want the data point to occupy as the first argument.

Upvotes: 0

Sobrique
Sobrique

Reputation: 53478

JSON is a more detailed data format than CSV - it allows for more complex data structures. Inevitably if you do this, you 'lose detail'.

If you want to fetch it automatically - that's doable, but I've skipped it because 'doing' https URLs is slightly more complicated.

So assuming you've downloaded your file, here's a possible solution in Perl (You've already got one for Python - both are very powerful scripting languages, but can pretty much cover the same ground - so it's as much a matter of taste as to which you use).

#!/usr/bin/perl

use strict;
use warnings;
use JSON;

my $file = 'players.json';

open( my $input, "<", $file ) or die $!;

my $json_data = decode_json(
    do { local $/; <$input> }
);
foreach my $player_id ( keys %{$json_data} ) {
    foreach my $fixture (
        @{ $json_data->{$player_id}->{fixture_history}->{all} } )
    {
        print join( ",",
            $player_id, $json_data->{$player_id}->{web_name},
            @{$fixture}, "\n", );
    }

}

Hopefully you can see what's going on here - you load the file $input, and decode_json to create a data structure.

This data structure is a nested hash (perl's term for the type of data structure). hashes are key-value pairs.

So we extract the keys from this hash - which is the ID number right at the beginning of each entry.

Then we loop through each of them - extracting the the fixture_history array. And for each element in that array, we print the player ID, their web_name and then the data from fixture_history.

This gives output like:

1,Szczesny,10 Feb 19:45,25,LEI(H) 2-1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-2413,52,0,
1,Szczesny,21 Feb 15:00,26,CRY(A) 2-1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-2805,52,0,
1,Szczesny,01 Mar 14:05,27,EVE(H) 2-0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-1862,52,0,
1,Szczesny,04 Mar 19:45,28,QPR(A) 2-1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-1248,52,0,
1,Szczesny,14 Mar 15:00,29,WHU(H) 3-0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-1897,52,0, 

Does this make sense?

Upvotes: 1

Related Questions