Reputation: 299
I have a csv file with 400000 rows. The data consists of 3 columns date, stock name, stock price. The csv file has data for over 50 stocks for the past 15 years. I have used data frame for reading the data from csv file. The dataframe is similar to the df1 as given below.
df1 = pd.DataFrame([['d1', 'd2', 'd3', 'd1', 'd2', 'd3', 'd1', 'd2', 'd3'], list('AAABBBCCC'), [1.1,1.2,1.3,2.1,2.2,2.3,3.1,3.2,3.3]]).T
df1.columns = ['dt','stockname','price']
print(df1)
I want stock names as header and dates as index. How to convert or organize the csv to represent a dataframe as similar to the df2 given below.
df2 = pd.DataFrame([['d1','d2','d3'],[1.1,1.2,1.3],[2.1,2.2,2.3],[3.1,3.2,3.3]]).T
df2.columns=['dt','A','B','C']
df2.set_index('dt')
print(df2)
Upvotes: 1
Views: 304
Reputation: 120439
Use pivot
:
>>> df1.pivot('dt', 'stockname', 'price').rename_axis(columns=None).reset_index()
dt A B C
0 d1 1.1 2.1 3.1
1 d2 1.2 2.2 3.2
2 d3 1.3 2.3 3.3
Upvotes: 1