Reputation: 360
I have a csv-file with severel values for several stocks that looks like this:
date stock_A date stock_B date stock_C
30.10.2017 09:00 3223 30.10.2017 09:00 53234 ... .....
30.10.2017 09:02 2544 30.10.2017 09:01 24337 ... .....
30.10.2017 09:04 925 30.10.2017 09:02 4529 ... .....
30.10.2017 09:05 3210 30.10.2017 09:03 8534 ... .....
As you can see, every second column is a datetime index. However, it is not in the same order/frequency. Is there a way to import this data with pandas such that I get only one index and the data is mapped accordingly?
I already tried this code:
pd.read_csv(file, sep=";", header=0, index_col=['date'], parse_dates=True, dtype=object)
But it only imports the first row as index and the other dates as columns with values. However, I would like to have my DataFrame as follows:
date stock_A stock_B stock_C
30.10.2017 09:00 3223 53234 122
30.10.2017 09:01 0 24337 1215
30.10.2017 09:02 2544 4529 0
30.10.2017 09:03 0 8534 1354
...
Upvotes: 1
Views: 1438
Reputation: 863166
Use list comprehension with concat
and set_index
for DatetimeIndex
for each pair:
df = pd.read_csv(file, sep=";")
a = df.columns[::2]
b = df.columns[1::2]
df=pd.concat([df[[j]].set_index(pd.to_datetime(df[i])) for i, j in zip(a,b)],axis=1).fillna(0)
print (df)
stock_A stock_B
2017-10-30 09:00:00 3223.0 53234.0
2017-10-30 09:01:00 0.0 24337.0
2017-10-30 09:02:00 2544.0 4529.0
2017-10-30 09:03:00 0.0 8534.0
2017-10-30 09:04:00 925.0 0.0
2017-10-30 09:05:00 3210.0 0.0
Last for column from index:
df = df.rename_axis('date').reset_index()
print (df)
date stock_A stock_B
0 2017-10-30 09:00:00 3223.0 53234.0
1 2017-10-30 09:01:00 0.0 24337.0
2 2017-10-30 09:02:00 2544.0 4529.0
3 2017-10-30 09:03:00 0.0 8534.0
4 2017-10-30 09:04:00 925.0 0.0
5 2017-10-30 09:05:00 3210.0 0.0
Upvotes: 1