Reputation: 149
I have a CSV file with the following data:
Time,site_name,cell_name,RRC_attempts,rrc_succ_rate
2018-01-12T08:37:00-06:00,910536_ARPIN,910536-24,1,100.0
2018-01-12T08:37:00-06:00,910536_ARPIN,910536-34,0,0.0
2018-01-12T08:37:00-06:00,910536_ARPIN,910536-14,5,100.0
I am using the json module in Python to convert this CSV to JSON:
import json
import csv
csvfile_ind = open("test.csv",'r')
reader_ind = csv.DictReader(csvfile_ind)
json_file_ind = open("test_json.json", 'w')
for row in reader_ind:
    json_file_ind.write(json.dumps(row,sort_keys=False, indent=4, separators=(',', ': ')))
My current output is -
[
{
"Time": "2018-01-12T08:37:00-06:00",
"site_name": "910536_ARPIN",
"cell_name": "910536-24",
"RRC_attempts": "1",
"rrc_succ_rate": "100.0"
},
{
"Time": "2018-01-12T08:37:00-06:00",
"site_name": "910536_ARPIN",
"cell_name": "910536-34",
"RRC_attempts": "0",
"rrc_succ_rate": "0.0"
},
{
"Time": "2018-01-12T08:37:00-06:00",
"site_name": "910536_ARPIN",
"cell_name": "910536-14",
"RRC_attempts": "5",
"rrc_succ_rate": "100.0"
}
]
My desired output is -
[
{
"Time": "2018-01-12T08:37:00-06:00",
"site_name": "910536_ARPIN",
"cell_name": "910536-24",
"RRC_attempts": 1,
"rrc_succ_rate": 100
},
{
"Time": "2018-01-12T08:37:00-06:00",
"site_name": "910536_ARPIN",
"cell_name": "910536-34",
"RRC_attempts": 0,
"rrc_succ_rate": 0
},
{
"Time": "2018-01-12T08:37:00-06:00",
"site_name": "910536_ARPIN",
"cell_name": "910536-14",
"RRC_attempts": 5,
"rrc_succ_rate": 100
}
]
How can I tell json to write the numbers as int or float and not as strings? Please advise. Note: while writing my CSV file I explicitly converted my values to int or float using int() or float().
Upvotes: 1
Views: 13045
Reputation: 1
If you want ints as ints and floats as floats, you can adjust Rob's code as follows:
def numerify(row):
    for k, v in list(row.items()):
        try:
            row[k] = int(v)
        except ValueError:
            try:
                row[k] = float(v)
            except ValueError:
                pass  # not numeric, keep the string
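To see what this does to one row, here is a small self-contained sketch that applies the function to the first data row from the question. The sample dict is copied from the question's CSV; everything else is as above. One caveat to keep in mind: int() and float() accept more than plain digits (for example "1_000" on Python 3.6+, "1e5", "nan" or "inf"), so columns holding such strings would be converted too.

```python
def numerify(row):
    """Convert values in place: int where possible, else float, else keep str."""
    for k, v in list(row.items()):
        try:
            row[k] = int(v)
        except ValueError:
            try:
                row[k] = float(v)
            except ValueError:
                pass  # not numeric, keep the original string


# Sample row taken from the CSV in the question
row = {
    "Time": "2018-01-12T08:37:00-06:00",
    "site_name": "910536_ARPIN",
    "cell_name": "910536-24",
    "RRC_attempts": "1",
    "rrc_succ_rate": "100.0",
}
numerify(row)
print(row)
```

After the call, RRC_attempts holds an int and rrc_succ_rate holds a float, while the three text columns stay strings.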
Upvotes: 0
Reputation: 168626
Here is a way that works when you don't know a priori which values are numeric:
import json
import csv
def numerify(row):
    for k, v in list(row.items()):
        try:
            row[k] = float(v)  # raises ValueError for non-numeric strings
            row[k] = int(v)    # succeeds only for integer strings; otherwise the float is kept
        except ValueError:
            pass
csvfile_ind = open("test.csv",'r')
reader_ind = csv.DictReader(csvfile_ind)
json_file_ind = open("test_json.json", 'w')
for row in reader_ind:
    numerify(row)
    json_file_ind.write(json.dumps(row,sort_keys=False, indent=4, separators=(',', ': ')))
Upvotes: 0
Reputation: 51653
Parse the CSV yourself; when creating the dictionary, apply int() and float() where desired, then output to JSON:
import json
with open("test.csv",'r') as f:
    # read lines, strip newlines, split at ,
    lines = [ x.strip('\n').split(',') for x in f.readlines()]
listDic = []
for lineIndex in range(1,len(lines)):
    row = lines[lineIndex] # get data row
    row[3] = int(row[3]) # convert data
    row[4] = float(row[4]) # convert data
    # zip to tuples of (key,value) and append to result list of dicts
    listDic.append( dict( zip(lines[0],row)))
with open("test_json.json", 'w') as json_file_ind:
    for row in listDic:
        json_file_ind.write(json.dumps(row,sort_keys=False,
                            indent=4, separators=(',', ': ')))
Output (file content as created by your json_file_ind call):
{
"Time": "2018-01-12T08:37:00-06:00",
"site_name": "910536_ARPIN",
"cell_name": "910536-24",
"RRC_attempts": 1,
"rrc_succ_rate": 100.0
}{
"Time": "2018-01-12T08:37:00-06:00",
"site_name": "910536_ARPIN",
"cell_name": "910536-34",
"RRC_attempts": 0,
"rrc_succ_rate": 0.0
}{
"Time": "2018-01-12T08:37:00-06:00",
"site_name": "910536_ARPIN",
"cell_name": "910536-14",
"RRC_attempts": 5,
"rrc_succ_rate": 100.0
}
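Note that objects written back to back like this do not form a single JSON document, so a standard parser will reject the file as a whole. The sketch below illustrates the difference with two made-up minimal rows (the rows list is a placeholder, not the real data): it tries to parse the concatenated form, then parses the same rows dumped as one list.

```python
import json

# Placeholder rows, standing in for the converted CSV rows
rows = [{"RRC_attempts": 1}, {"RRC_attempts": 0}]

# One json.dumps() call per row, written back to back
concatenated = "".join(json.dumps(row, indent=4) for row in rows)
try:
    json.loads(concatenated)
    parsed_ok = True
except json.JSONDecodeError:
    # The parser stops after the first object and reports extra data
    parsed_ok = False

# A single json.dumps() call on the whole list gives one valid document
as_list = json.dumps(rows, indent=4)
round_trip = json.loads(as_list)

print(parsed_ok)
print(round_trip == rows)
```

If the file has to be read back with json.load(), collecting the rows in a list and dumping once is the simpler route.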
Upvotes: 0
Reputation: 781068
Don't write each row with a separate call to json.dumps(). Collect all the rows into a list, and dump that all at once.
To convert the string fields to integers, call int() on those entries in the dict.
import json
import csv
with open("test.csv", 'r') as csvfile_ind:
    reader_ind = csv.DictReader(csvfile_ind)
    rows = []
    for row in reader_ind:
        row["RRC_attempts"] = int(row["RRC_attempts"])
        # values like "100.0" must go through float() first; int("100.0") raises ValueError
        row["rrc_succ_rate"] = int(float(row["rrc_succ_rate"]))
        rows.append(row)
with open("test_json.json", 'w') as json_file_ind:
    json.dump(rows, json_file_ind, sort_keys=False, indent=4, separators=(',', ': '))
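If the numeric columns are not known in advance, the two ideas in this thread can be combined: convert every value that looks numeric, then dump the whole list once. The following is only a sketch under a few assumptions: to_number and csv_to_json are hypothetical helper names, the CSV is read from an in-memory string via io.StringIO instead of test.csv so the example runs on its own, and whole floats such as 100.0 are collapsed to 100 to match the desired output in the question.

```python
import csv
import io
import json


def to_number(value):
    """Return an int or float for numeric strings, otherwise the value unchanged."""
    try:
        return int(value)
    except (ValueError, TypeError):
        pass
    try:
        number = float(value)
    except (ValueError, TypeError):
        return value
    # Collapse whole floats such as 100.0 to 100, as in the desired output
    return int(number) if number.is_integer() else number


def csv_to_json(csv_text):
    """Parse CSV text and return one JSON document containing a list of rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [{key: to_number(value) for key, value in row.items()} for row in reader]
    return json.dumps(rows, indent=4)


# Header and first two data rows from the question
sample = """Time,site_name,cell_name,RRC_attempts,rrc_succ_rate
2018-01-12T08:37:00-06:00,910536_ARPIN,910536-24,1,100.0
2018-01-12T08:37:00-06:00,910536_ARPIN,910536-34,0,0.0
"""
result = csv_to_json(sample)
print(result)
```

For real files, replacing io.StringIO(csv_text) with an open file object and json.dumps with json.dump on the output file should work the same way. Dropping the is_integer() branch keeps 100.0 as a float if that distinction matters.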
Upvotes: 2