Reputation: 2046
I have a dictionary that looks like this:
defaultdict(list,
{'Open': ['47.47', '47.46', '47.38', ...],
'Close': ['47.48', '47.45', '47.40', ...],
'Date': ['2016/11/22 07:00:00', '2016/11/22 06:59:00','2016/11/22 06:58:00', ...]})
I want to convert this dictionary to a DataFrame and use the values under the 'Date' key as the DataFrame's index.
I can do this with the following commands:
df = pd.DataFrame(dictionary, columns=['Date', 'Open', 'Close'])
df.index = df.Date
Output:
Date Date Open Close
2016/11/22 07:00:00 2016/11/22 07:00:00 47.47 47.48
2016/11/22 06:59:00 2016/11/22 06:59:00 47.46 47.45
2016/11/22 06:58:00 2016/11/22 06:58:00 47.38 47.40
But then I have two 'Date' columns: one is the index and the other is the original column.
Is there a way to set the index while converting the dictionary to a DataFrame, without the duplicated column, like this?
Date Close Open
2016/11/22 07:00:00 47.48 47.47
2016/11/22 06:59:00 47.45 47.46
2016/11/22 06:58:00 47.40 47.38
Upvotes: 39
Views: 50565
Reputation: 23071
If the original dictionary is not needed afterwards, an alternative is to simply pop the 'Date' key (note that this removes the key from the dictionary):
df = pd.DataFrame(mydict, index=pd.Series(mydict.pop('Date'), name='Date'))
That said, I think set_index is the more convenient and less verbose option, since it can be called directly on the newly created frame:
df = pd.DataFrame(mydict).set_index('Date')
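If the dictionary is still needed later, one way to keep the pop approach without mutating it is to pop from a shallow copy. The following is only a sketch: the sample data is a hypothetical three-row version of the dictionary in the question, and `tmp` and `dates` are illustrative names.

```python
from collections import defaultdict

import pandas as pd

# Hypothetical sample data mirroring the structure shown in the question.
mydict = defaultdict(list, {
    'Open': ['47.47', '47.46', '47.38'],
    'Close': ['47.48', '47.45', '47.40'],
    'Date': ['2016/11/22 07:00:00', '2016/11/22 06:59:00', '2016/11/22 06:58:00'],
})

# Pop from a shallow copy so the original dictionary keeps its 'Date' key.
tmp = dict(mydict)
dates = tmp.pop('Date')
df_pop = pd.DataFrame(tmp, index=pd.Series(dates, name='Date'))

# The set_index variant never touches the dictionary at all.
df_set = pd.DataFrame(mydict).set_index('Date')
```

Both frames should end up with 'Date' as the index name and only the 'Open' and 'Close' columns, while `mydict` still contains all three keys.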
Upvotes: 2
Reputation: 862511
Use set_index:
df = pd.DataFrame(dictionary, columns=['Date', 'Open', 'Close'])
df = df.set_index('Date')
print (df)
Open Close
Date
2016/11/22 07:00:00 47.47 47.48
2016/11/22 06:59:00 47.46 47.45
2016/11/22 06:58:00 47.38 47.40
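The reason this avoids the duplicate is that set_index removes the column it promotes by default (its `drop` parameter defaults to True), whereas assigning `df.index = df.Date` leaves the column in place. A minimal sketch, using a hypothetical three-row sample of the question's data:

```python
import pandas as pd

# Hypothetical sample data mirroring the structure shown in the question.
dictionary = {
    'Open': ['47.47', '47.46', '47.38'],
    'Close': ['47.48', '47.45', '47.40'],
    'Date': ['2016/11/22 07:00:00', '2016/11/22 06:59:00', '2016/11/22 06:58:00'],
}

df = pd.DataFrame(dictionary, columns=['Date', 'Open', 'Close'])

# Default behaviour: 'Date' becomes the index and is removed from the columns.
dropped = df.set_index('Date')

# drop=False reproduces the duplicated 'Date' seen in the question.
kept = df.set_index('Date', drop=False)

print(list(dropped.columns))  # ['Open', 'Close']
print(list(kept.columns))     # ['Date', 'Open', 'Close']
```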
Or use inplace=True:
df = pd.DataFrame(dictionary, columns=['Date', 'Open', 'Close'])
df.set_index('Date', inplace=True)
print (df)
Open Close
Date
2016/11/22 07:00:00 47.47 47.48
2016/11/22 06:59:00 47.46 47.45
2016/11/22 06:58:00 47.38 47.40
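One thing to watch with the inplace variant: the call modifies `df` and returns None, so its result should not be assigned back. A short sketch with a hypothetical three-row sample (the name `result` is only for illustration):

```python
import pandas as pd

# Hypothetical sample data mirroring the structure shown in the question.
dictionary = {
    'Open': ['47.47', '47.46', '47.38'],
    'Close': ['47.48', '47.45', '47.40'],
    'Date': ['2016/11/22 07:00:00', '2016/11/22 06:59:00', '2016/11/22 06:58:00'],
}

df = pd.DataFrame(dictionary, columns=['Date', 'Open', 'Close'])

# With inplace=True the frame is changed directly and nothing is returned.
result = df.set_index('Date', inplace=True)

print(result)            # None
print(list(df.columns))  # ['Open', 'Close']
```

Writing `df = df.set_index('Date', inplace=True)` would therefore replace the frame with None.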
Another possible solution is to filter the 'Date' key out of the dict and pass dictionary['Date'] as the index:
df = pd.DataFrame({k: v for k, v in dictionary.items() if k != 'Date'},
                  index=dictionary['Date'],
                  columns=['Open', 'Close'])
df.index.name = 'Date'
print (df)
Open Close
Date
2016/11/22 07:00:00 47.47 47.48
2016/11/22 06:59:00 47.46 47.45
2016/11/22 06:58:00 47.38 47.40
Upvotes: 46