Reputation: 11
I have the following time series dataframe:
dataframe = pd.DataFrame({
'date': pd.to_datetime([
'2020-04-01', '2020-04-02', '2020-04-03',
'2020-04-01', '2020-04-02', '2020-04-03']),
'Ticker': ['A', 'A', 'A', 'AAPL', 'AAPL', 'AAPL'],
'Price': ['8', '10', '12', '100', '200', '50']})
date Ticker Price
0 2020-04-01 A 8
1 2020-04-02 A 10
2 2020-04-03 A 12
3 2020-04-01 AAPL 100
4 2020-04-02 AAPL 200
5 2020-04-03 AAPL 50
The final result should look like:
dataframe_2 = pd.DataFrame({
'date': pd.to_datetime(['2020-04-01', '2020-04-02','2020-04-03']),
'A': [8, 10, 12],
'AAPL': [100, 200, 50]})
date A AAPL
0 2020-04-01 8 100
1 2020-04-02 10 200
2 2020-04-03 12 50
Initially I tried using the groupby function but with not much success.
Upvotes: 1
Views: 146
Reputation: 417
The operation you are trying to do is called pivoting. That is, creating new columns from the categorical values of a column.
You can do either of these (same results):
df = dataframe.set_index("date").pivot(columns="Ticker", values="Price")
df = dataframe.pivot(index="date", columns="Ticker", values="Price")
It is important to set the index; otherwise, the pivot will not know how to combine rows and you will get extra rows with NaN values. For the sample data, without the index, it would not know to treat rows 0 and 3 in your base data as the same date.
Upvotes: 1