IslandPatrol
IslandPatrol

Reputation: 263

How to read data from CSV into nested key-value pairs for future retrieval?

I have a CSV file (data.csv) that looks like:

Time,Data1,Data2,Data3
0,10,25,100
1,20,30,120
2,25,35,125
3,30,50,150

I want to be able to access the data for a given column at a specified time (for example: inputs of Data1 and Time: 1 should return 20).

I tried:

import csv

dataFile = 'data.csv'

with open(dataFile) as inputData:
    csv_input = csv.reader(inputData)
    headers = next(csv_input)
    data = list(zip(*[map(int, row) for row in csv_input]))

my_list = list(zip(data[0], data[1:]))
my_dictionary = dict(zip(headers, my_list))

print(my_dictionary['Data1'][1])

However, this returns:

(25, 30, 35, 50)

I would instead like to return:

20

How can I update my code to achieve this?

Upvotes: 3

Views: 449

Answers (4)

MasterOfTheHouse
MasterOfTheHouse

Reputation: 1339

You do not need to load the immense Panda library to read a CSV file.Python provides modules for this:


import csv
import collections

filename = "data_80.csv"


def read_csv(filename):
    columns = collections.defaultdict(list)
    with open(filename, 'rt') as file:
        rows = csv.DictReader(file)
        for row in rows:
            for key, val in row.items():
                columns[key].append(val)
    return dict(columns)  

data = read_csv(filename)

print(data) # data is a dictionary of list
{
'Time': ['0', '1', '2', '3'], 
'Data1': ['10', '20', '25', '30'], 
'Data2': ['25', '30', '35', '50'], 
'Data3': ['100', '120', '125', '150']
}

# You just can do 
print(data['Data1'][0])

Upvotes: 2

DarrylG
DarrylG

Reputation: 17156

Keeping part posters original software

import csv

dataFile = 'data.csv'

with open(dataFile) as inputData:
    csv_input = csv.reader(inputData)
    headers = next(csv_input)
    data = list(zip(*[map(int, row) for row in csv_input]))

Modification, dicitonary is simply:

my_dictionary = dict(zip(headers, data))

print(my_dictionary ['Data1'][1])
>>> 20

Upvotes: 1

Andrej Kesely
Andrej Kesely

Reputation: 195468

Solution without pandas:

import csv

dataFile = 'data.csv'

with open(dataFile) as inputData:
    csv_input = csv.reader(inputData)
    i = zip(next(csv_input), zip(*csv_input))
    data, (_, times) = {}, next(i)
    for k, line in i:
        for t, l in zip(times, line):
            data.setdefault(k, {}).setdefault(t, {})
            data[k][int(t)] = l

print(data['Data1'][1])

Prints:

20

Upvotes: 0

merit_2
merit_2

Reputation: 471

Use Pandas.

Like this,

import pandas as pd

I made a file from your provided data and imported,

df = pd.read_csv('rcsv.csv')

print(df.head())

It looks like this,

Time  Data1  Data2  Data3
0     0     10     25    100
1     1     20     30    120
2     2     25     35    125
3     3     30     50    150

You get a specific element like this (second element in second column, zero indexed)

print(df.iloc[1][1])

20

If you have a date and/or time axis then we could approach it differently.

Upvotes: 0

Related Questions