rfguy

Reputation: 149

Format numbers as float or int using json in python

I have a CSV file that has the data -

 Time,site_name,cell_name,RRC_attempts,rrc_succ_rate
 2018-01-12T08:37:00-06:00,910536_ARPIN,910536-24,1,100.0
 2018-01-12T08:37:00-06:00,910536_ARPIN,910536-34,0,0.0
 2018-01-12T08:37:00-06:00,910536_ARPIN,910536-14,5,100.0

I am using the json module in python to convert this csv to json

import json
import csv

csvfile_ind = open("test.csv",'r')

reader_ind = csv.DictReader(csvfile_ind)
json_file_ind = open("test_json.json", 'w')
for row in reader_ind:
    json_file_ind.write(json.dumps(row,sort_keys=False, indent=4, separators=(',', ': ')))

My current output is -

        [
          {
            "Time": "2018-01-12T08:37:00-06:00",
            "site_name": "910536_ARPIN",
            "cell_name": "910536-24",
            "RRC_attempts": "1",
            "rrc_succ_rate": "100.0"
          },
          {
            "Time": "2018-01-12T08:37:00-06:00",
            "site_name": "910536_ARPIN",
            "cell_name": "910536-34",
            "RRC_attempts": "0",
            "rrc_succ_rate": "0.0"
          },
          {
            "Time": "2018-01-12T08:37:00-06:00",
            "site_name": "910536_ARPIN",
            "cell_name": "910536-14",
            "RRC_attempts": "5",
            "rrc_succ_rate": "100.0"
          }
        ]

My desired output is -

        [
          {
            "Time": "2018-01-12T08:37:00-06:00",
            "site_name": "910536_ARPIN",
            "cell_name": "910536-24",
            "RRC_attempts": 1,
            "rrc_succ_rate": 100
          },
          {
            "Time": "2018-01-12T08:37:00-06:00",
            "site_name": "910536_ARPIN",
            "cell_name": "910536-34",
            "RRC_attempts": 0,
            "rrc_succ_rate": 0
          },
          {
            "Time": "2018-01-12T08:37:00-06:00",
            "site_name": "910536_ARPIN",
            "cell_name": "910536-14",
            "RRC_attempts": 5,
            "rrc_succ_rate": 100
          }
        ]

How can I tell json to parse the numbers as int or float and not as strings? Please advise. Note: while writing my CSV file I explicitly converted my values to int or float using int() or float().

Upvotes: 1

Views: 13045

Answers (4)

Nas

Reputation: 1

If you want ints as ints and floats as floats, you can adjust Rob's code as follows:

def numerify(row):
    for k, v in list(row.items()):
        try:
            row[k] = int(v)
        except ValueError:
            try:
                row[k] = float(v)
            except ValueError:
                pass
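
As a quick check of how this int-first, float-fallback conversion treats the columns from the question, here is a minimal sketch run on one in-memory row. The `sample` dict is made up from the question's data, and the function is repeated only so the snippet runs on its own.

```python
def numerify(row):
    """Convert values in place: int if possible, else float, else keep the string."""
    for k, v in list(row.items()):
        try:
            row[k] = int(v)
        except ValueError:
            try:
                row[k] = float(v)
            except ValueError:
                pass  # not numeric, leave the string untouched


# Hypothetical sample row, shaped like what csv.DictReader yields.
sample = {
    "cell_name": "910536-24",
    "RRC_attempts": "1",
    "rrc_succ_rate": "100.0",
}
numerify(sample)
print(sample)
# {'cell_name': '910536-24', 'RRC_attempts': 1, 'rrc_succ_rate': 100.0}
```

Note that `rrc_succ_rate` comes out as `100.0`, not `100`. Also be aware that float() accepts strings such as `nan` and `inf`, which json.dumps() would then write as `NaN` or `Infinity`, and those are not valid JSON.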

Upvotes: 0

Robᵩ

Reputation: 168626

Here is a way that works when you don't know a priori which values are numeric:

import json
import csv

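def numerify(row):
    for k, v in list(row.items()):
        try:
            # float() accepts every numeric string; if int() then fails
            # (e.g. on "100.0"), the float assigned just before is kept.
            row[k] = float(v)
            row[k] = int(v)
        except ValueError:
            pass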

csvfile_ind = open("test.csv",'r')

reader_ind = csv.DictReader(csvfile_ind)
json_file_ind = open("test_json.json", 'w')
for row in reader_ind:
    numerify(row)
    json_file_ind.write(json.dumps(row,sort_keys=False, indent=4, separators=(',', ': ')))

Upvotes: 0

Patrick Artner

Reputation: 51653

Parse the CSV yourself: when creating each dictionary, apply int() and float() where desired, then output to JSON:

import json 

with open("test.csv",'r') as f:
    # read lines, strip newlines, split at ,
    lines = [ x.strip('\n').split(',') for x in f.readlines()]    


listDic = []
for lineIndex in range(1,len(lines)):
    row = lines[lineIndex]     # get data row
    row[3] = int(row[3])       # convert data
    row[4] = float(row[4])     # convert data

    # zip to tuples of (key,value) and append to result list of dicts 
    listDic.append( dict( zip(lines[0],row)))  


with open("test_json.json", 'w') as json_file_ind:
    for row in listDic:
        json_file_ind.write(json.dumps(row,sort_keys=False, 
                            indent=4, separators=(',', ': ')))

Output (file content as created by your json_file_ind call):

{
    "Time": "2018-01-12T08:37:00-06:00",
    "site_name": "910536_ARPIN",
    "cell_name": "910536-24",
    "RRC_attempts": 1,
    "rrc_succ_rate": 100.0
}{
    "Time": "2018-01-12T08:37:00-06:00",
    "site_name": "910536_ARPIN",
    "cell_name": "910536-34",
    "RRC_attempts": 0,
    "rrc_succ_rate": 0.0
}{
    "Time": "2018-01-12T08:37:00-06:00",
    "site_name": "910536_ARPIN",
    "cell_name": "910536-14",
    "RRC_attempts": 5,
    "rrc_succ_rate": 100.0
}

Upvotes: 0

Barmar

Reputation: 781068

Don't write each row as a separate call to json.dumps(). Collect all the rows into a list, and dump that all at once.

To convert the string fields to integers, call int() on those entries in the dict.

import json
import csv

with open("test.csv", 'r') as csvfile_ind:
    reader_ind = csv.DictReader(csvfile_ind)
    rows = []
    for row in reader_ind:
        row["RRC_attempts"] = int(row["RRC_attempts"])
        # The CSV holds values like "100.0", which int() rejects directly,
        # so go through float() first (this truncates any fractional part).
        row["rrc_succ_rate"] = int(float(row["rrc_succ_rate"]))
        rows.append(row)

with open("test_json.json", 'w') as json_file_ind:
    json.dump(rows, json_file_ind, sort_keys=False, indent=4, separators=(',', ': '))
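
If you would rather not name the numeric columns, the two ideas in this thread (convert whatever looks numeric, then dump the whole list once) can be combined. The following is only a sketch under some assumptions: `to_number` and `csv_to_json` are hypothetical helper names, every row is assumed to have all columns, and the CSV is read from an in-memory string via io.StringIO purely so the example runs without a file. It also collapses whole-number floats such as `100.0` to `100`, which matches the desired output in the question.

```python
import csv
import io
import json


def to_number(text):
    """Return an int or float for numeric text, otherwise the original string."""
    try:
        return int(text)
    except ValueError:
        pass
    try:
        value = float(text)
    except ValueError:
        return text
    # "100.0" becomes 100; values with a fractional part stay floats.
    return int(value) if value.is_integer() else value


def csv_to_json(csv_text):
    """Read CSV text and return one JSON array containing all rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [{k: to_number(v) for k, v in row.items()} for row in reader]
    return json.dumps(rows, indent=4)


# Two rows taken from the question's sample data.
csv_text = """Time,site_name,cell_name,RRC_attempts,rrc_succ_rate
2018-01-12T08:37:00-06:00,910536_ARPIN,910536-24,1,100.0
2018-01-12T08:37:00-06:00,910536_ARPIN,910536-34,0,0.0
"""

result = csv_to_json(csv_text)
print(result)
```

One caveat with any automatic conversion: an identifier column that happens to contain only digits (say a site name of `910536`) would be turned into a number as well, so naming the columns explicitly is the safer choice when the schema is known.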

Upvotes: 2
