sim

Reputation: 488

How to rectify the index error while converting CSV to JSON

I am converting a CSV file to JSON.

The CSV file is below:

Country,Email,Flag
Germany,[email protected],Y
France,[email protected],Y
Germany,[email protected],Y

The code is below:

import csv
data= []
with open('a.csv') as obj:
    csv_content = obj.read().split('\n')
    csv_file = csv.reader(csv_content)
    csv_file_list = list(csv_file)
    csv_file_list_n = [x for x in csv_file_list if x != []]
    for entry in csv_file_list_n[1:]:
        row = {key: entry[idx] for idx, key in enumerate(csv_file_list_n[0])}
#         print(row)
        merge_flag = False
        for item in data:
            if item['Country'] == row['Country']:
                merge_flag = True
                item['Email'] = [item['Email'] , row['Email']]
                break
        if not merge_flag:
            data.append(row)
data           

My output is below

[{'Country': 'Germany',
  'Email': ['[email protected]', '[email protected]'],
  'Flag': 'Y'},
 {'Country': 'France', 'Email': '[email protected]', 'Flag': 'Y'}]

The CSV below is an example with incomplete rows:

Country,Email,Flag
Germany,[email protected],Y
England,
France,[email protected],Y
Germany,[email protected],Y
Wales

You can see that the England row ends with a comma and the Wales row has no comma at all.

The expected output is below; the default flag should be N:

[{'Country': 'Germany',
  'Email': ['[email protected]', '[email protected]'],
  'Flag': 'Y'},
 {'Country': 'France', 'Email': '[email protected]', 'Flag': 'Y'},
{'Country': 'England', 'Flag': 'N'},
{'Country': 'Wales', 'Flag': 'N'}]

With the above code I am getting an IndexError.

Upvotes: 1

Views: 252

Answers (1)

Martin Evans

Reputation: 46759

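How about the following approach? Use a csv.DictReader to read the CSV in as dictionary rows. This will automatically fill in None for any fields missing from a short row (a field that is present but empty, such as the one after "England,", is read as an empty string). Use a defaultdict(list) to group rows by country, then build your required output format: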

from collections import defaultdict
import csv

countries = defaultdict(list)

with open('a.csv', newline='') as f_input:
    csv_input = csv.DictReader(f_input)

    # Group [email, flag] pairs by country; a missing or empty flag becomes 'N'
    for row in csv_input:
        countries[row['Country']].append([row['Email'], row['Flag'] if row['Flag'] else 'N'])

data = []

# Build one output dictionary per country
for country, entries in countries.items():
    row = {'Country' : country}

    for email, flag in entries:
        # Skip emails that are missing (None) or empty
        if email:
            if 'Email' in row:
                row['Email'].append(email)
            else:
                row['Email'] = [email]
        # The flag of the last row seen for a country wins
        if flag:
            row['Flag'] = flag

    data.append(row)

print(data)

This would give you:

[
    {'Country': 'Germany', 'Email': ['[email protected]', '[email protected]'], 'Flag': 'Y'}, 
    {'Country': 'England', 'Flag': 'N'}, 
    {'Country': 'France', 'Email': ['[email protected]'], 'Flag': 'Y'}, 
    {'Country': 'Wales', 'Flag': 'N'}
]
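
To see why the short rows no longer raise an IndexError, here is a minimal sketch of what DictReader returns for them. It reads the sample from an in-memory string rather than a.csv so that it runs on its own, and the example.com addresses are hypothetical placeholders for the real emails. The last step, which drops empty values, applies the default flag and calls json.dumps(), is only one possible way to get an actual JSON string; it does not group rows by country.

```python
import csv
import io
import json

# Sample rows from the question, held in memory instead of a.csv.
# The example.com addresses are hypothetical placeholders.
sample = """Country,Email,Flag
Germany,de1@example.com,Y
England,
France,fr@example.com,Y
Germany,de2@example.com,Y
Wales
"""

rows = list(csv.DictReader(io.StringIO(sample)))

# "England," has an empty Email field and no Flag field at all,
# so Email is '' and Flag is filled with the default restval (None).
print(rows[1])

# "Wales" has neither field, so both are filled with None.
print(rows[4])

# Drop empty/missing values, default the flag to 'N', then serialise.
cleaned = []
for row in rows:
    item = {key: value for key, value in row.items() if value}
    item.setdefault('Flag', 'N')
    cleaned.append(item)

json_text = json.dumps(cleaned, indent=2)
print(json_text)
```

Because every header is always present as a key, lookups such as row['Email'] are safe even on incomplete rows; only the values need checking.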

Upvotes: 1
