Lulzsec

Reputation: 261

Convert CSV file into list of dictionaries with precise type and order

Everything is in the title. I've found this code, which almost does what I want: https://stackoverflow.com/a/21572244/5455842

import csv

with open('test.csv') as f:
    a = [{k: int(v) for k, v in row.items()}
        for row in csv.DictReader(f, skipinitialspace=True)]

My CSV is made of 5 columns and 6 lines (the first one being fieldnames).

My problems are the following :

Firstname,Lastname,Birthdate,Answers(good/middle/bad),Comments
Mark,Tolonen,12/10/1986,"3154,0",The first one
John,Travolta,02/18/1954,"42,21",Would grease again
Albert,Einstein,03/14/1879,"18,19,20",This guy is not stupid
Isaac,Newton,12/25/1642,"2000,20,20", Should eat apple
Alan,Turing,06/23/1912,"42,42,42",Hey what's up

And here's a sample of the desired output:

[{'Birthdate': datetime.date(1986, 12, 10),
  'Comments': 'The first one',
  'Firstname': 'Mark',
  'Lastname': 'Tolonen',
  'Answers(good/middle/bad)': [3154, 0]},
 {'Birthdate': datetime.date(1954, 2, 18),
  'Comments': 'Would grease again',
  'Firstname': 'John',
  'Lastname': 'Travolta',
  'Answers(good/middle/bad)': [42, 21]},
...
]

Upvotes: 0

Views: 685

Answers (1)

Mark Tolonen

Reputation: 177971

Given your sample data and desired output, if you use Python 3.7 or later, the dictionary order will be as desired:

import csv
from datetime import datetime
from pprint import pprint

order = 'Birthdate','Comments','Firstname','Lastname','Answers(good/middle/bad)'

with open('input.csv', newline='') as f:  # newline='' is what the csv docs recommend
    reader = csv.DictReader(f)
    data = []

    for row in reader:

        # Post-process the non-string columns.
        row['Birthdate'] = datetime.strptime(row['Birthdate'],'%m/%d/%Y').date()
        row['Answers(good/middle/bad)'] = [int(x) for x in row['Answers(good/middle/bad)'].split(',')]

        # Re-write the dict with the desired key order.
        # Python 3.7 made insertion-order preservation part of the language spec;
        # CPython 3.6 already preserves it, but only as an implementation detail.
        # For older versions use collections.OrderedDict instead.
        data.append({k:row[k] for k in order})
        
pprint(data)

Output:

[{'Birthdate': datetime.date(1986, 12, 10),
  'Comments': 'The first one',
  'Firstname': 'Mark',
  'Lastname': 'Tolonen',
  'Answers(good/middle/bad)': [3154, 0]},
 {'Birthdate': datetime.date(1954, 2, 18),
  'Comments': 'Would grease again',
  'Firstname': 'John',
  'Lastname': 'Travolta',
  'Answers(good/middle/bad)': [42, 21]},
 {'Birthdate': datetime.date(1879, 3, 14),
  'Comments': 'This guy is not stupid',
  'Firstname': 'Albert',
  'Lastname': 'Einstein',
  'Answers(good/middle/bad)': [18, 19, 20]},
 {'Birthdate': datetime.date(1642, 12, 25),
  'Comments': ' Should eat apple',
  'Firstname': 'Isaac',
  'Lastname': 'Newton',
  'Answers(good/middle/bad)': [2000, 20, 20]},
 {'Birthdate': datetime.date(1912, 6, 23),
  'Comments': "Hey what's up",
  'Firstname': 'Alan',
  'Lastname': 'Turing',
  'Answers(good/middle/bad)': [42, 42, 42]}]
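
For interpreters older than 3.7, the comment in the code above points to collections.OrderedDict. A minimal sketch of that variant follows. It is an illustration under assumptions, not part of the tested answer: the helper name load_rows is hypothetical, and two rows of the sample data are inlined through io.StringIO so the snippet runs without a file on disk.

```python
import csv
import io
from collections import OrderedDict
from datetime import datetime

# Two rows of the question's sample data, inlined so the sketch is self-contained.
SAMPLE = '''Firstname,Lastname,Birthdate,Answers(good/middle/bad),Comments
Mark,Tolonen,12/10/1986,"3154,0",The first one
John,Travolta,02/18/1954,"42,21",Would grease again
'''

ORDER = ('Birthdate', 'Comments', 'Firstname', 'Lastname',
         'Answers(good/middle/bad)')


def load_rows(f):
    """Hypothetical helper: convert each CSV row and fix its key order."""
    rows = []
    for row in csv.DictReader(f):
        # Same post-processing as in the answer's loop.
        row['Birthdate'] = datetime.strptime(row['Birthdate'], '%m/%d/%Y').date()
        row['Answers(good/middle/bad)'] = [
            int(x) for x in row['Answers(good/middle/bad)'].split(',')]
        # OrderedDict keeps the keys in the order they are supplied,
        # regardless of the Python version.
        rows.append(OrderedDict((k, row[k]) for k in ORDER))
    return rows


data = load_rows(io.StringIO(SAMPLE))
for record in data:
    print(list(record.items()))
```

With a real file, passing the object returned by open('input.csv', newline='') to load_rows should behave the same way.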

Upvotes: 1
