gliptak

Reputation: 3670

Reading a sparse CSV file into pandas

I have a space-separated CSV file in the following format:

2012-11-01 1 2012-12-01 4 2013-02-01 6
2012-12-01 2 2013-01-01 nan
2012-11-01 3 2012-12-01 5 2013-01-01 5 2013-04-01 7

Basically, each line is a series of dates, each followed by a value, but the dates are sparse. Some of the values are nan, and some could be missing altogether. I would like to read this into pandas and line up the values by their corresponding dates.

Running pandas:

import pandas as pd
pd.read_csv('sparse.csv', sep=" ", parse_dates=True)

errors with:

ValueError: Expecting 6 columns, got 8 in row 1

What would be a way to read this file and align the date/values?

(Is there some "pre-processing" I could do maybe?)

Thanks

Upvotes: 1

Views: 1987

Answers (1)

mechmind

Reputation: 1767

A CSV file should have the same number of fields in every row. If the file is just date-number pairs with no relation between the pairs, it isn't really CSV, just a file of pairs. So it should be parsed as a file of pairs:

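# Read the whole file as one flat list of tokens (Python 3)
with open("sparse.csv") as f:
    tokens = f.read().split()  # split on newlines and spaces
it = iter(tokens)
for date in it:
    value = next(it)  # the token after each date is its value
    if value != "nan":
        pass  # process the (date, value) pair here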
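
If each line is meant to be one record and the values should end up lined up by date, something like the following might be a starting point. It is only a sketch under a few assumptions: dates always look like YYYY-MM-DD, a date whose value is missing is followed directly by the next date (or the end of the line), and the helper names (is_date, parse_sparse_line, parse_sparse_text, to_frame) are made up for this example. It relies on pd.DataFrame accepting a list of dicts and filling absent keys with NaN.

```python
import math


def is_date(token):
    """Heuristic (assumption): YYYY-MM-DD shaped tokens are dates."""
    parts = token.split("-")
    return len(parts) == 3 and all(p.isdigit() for p in parts)


def parse_sparse_line(line):
    """Turn one 'date value date value ...' line into {date: value}."""
    row = {}
    tokens = line.split()
    i = 0
    while i < len(tokens):
        date = tokens[i]
        # A value is present only if a next token exists and is not a date.
        if i + 1 < len(tokens) and not is_date(tokens[i + 1]):
            row[date] = float(tokens[i + 1])  # float("nan") gives NaN
            i += 2
        else:
            row[date] = math.nan  # value missing altogether
            i += 1
    return row


def parse_sparse_text(text):
    """Parse every non-empty line into its own {date: value} dict."""
    return [parse_sparse_line(ln) for ln in text.splitlines() if ln.strip()]


def to_frame(rows):
    """Build a DataFrame with one row per input line, one column per date."""
    import pandas as pd  # local import: the parser itself needs only the stdlib

    frame = pd.DataFrame(rows)  # dict keys become columns, gaps become NaN
    frame.columns = pd.to_datetime(frame.columns)
    return frame.sort_index(axis=1)  # put the date columns in order


sample = """2012-11-01 1 2012-12-01 4 2013-02-01 6
2012-12-01 2 2013-01-01 nan
2012-11-01 3 2012-12-01 5 2013-01-01 5 2013-04-01 7"""
rows = parse_sparse_text(sample)
```

Calling to_frame(rows) should then give one row per input line with the dates as columns. For the real file, pass open("sparse.csv").read() to parse_sparse_text instead of the sample string.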

Upvotes: 2
