Reputation: 11
I have a dataframe that has rows with a index of a date and 3 numbers in columns for each date. e.g.:
Date 02/21 | 3 | 4 | 2
Date 03/21 | 2 | 5 | 7
Date 04/21 | 5 | 4 | 2
I want to turn this dataframe so that each number is in its own row assigned to its indexed date e.g.:
Date 02/21 | 3
Date 02/21 | 4
Date 02/21 | 2
Date 03/21 | 2
Date 03/21 | 4
Date 03/21 | 7
Date 04/21 | 5
Date 04/21 | 4
Date 04/21 | 2
Any idea how I can transform my dataframe into this? The data is in a spreadsheet so I know how to load, but it's just transforming into the layout below that and I don't know how to do.
Upvotes: 1
Views: 96
Reputation: 146
Here is a simpler implementation so that you can read rows from your spreadsheet into a dataframe and dont have to type values manually. I have typed dataframe values manually as a sample but that can be replaced with code to read rows from your spreadsheet
>>> import pandas as pd
>>> df=pd.DataFrame([['Date 02/21 | 3 | 4 | 2'],['Date 03/21 | 2 | 5 | 7'],['Date 04/21 | 5 | 4 | 2']], columns=['a'])
>>> df
a
0 Date 02/21 | 3 | 4 | 2
1 Date 03/21 | 2 | 5 | 7
2 Date 04/21 | 5 | 4 | 2
Split the columns on a specified delimiter(|)
>>> df2=pd.DataFrame(list(df.a.str.split('|')), columns=[1,2,3,4])
>>> df2
1 2 3 4
0 Date 02/21 3 4 2
1 Date 03/21 2 5 7
2 Date 04/21 5 4 2
Unpivot
>>> df3=df2.melt(id_vars=[1])
>>> df3
1 variable value
0 Date 02/21 2 3
1 Date 03/21 2 2
2 Date 04/21 2 5
3 Date 02/21 3 4
4 Date 03/21 3 5
5 Date 04/21 3 4
6 Date 02/21 4 2
7 Date 03/21 4 7
8 Date 04/21 4 2
Concatenate columns with the required delimiter for final output
>>> df4=pd.DataFrame(df3[1]+"|"+df3['value'])
>>> df4
0
0 Date 02/21 | 3
1 Date 03/21 | 2
2 Date 04/21 | 5
3 Date 02/21 | 4
4 Date 03/21 | 5
5 Date 04/21 | 4
6 Date 02/21 | 2
7 Date 03/21 | 7
8 Date 04/21 | 2
Upvotes: 0
Reputation: 1728
You want to 'unpivot' your table. Pandas can do that with the melt() method: https://dfrieds.com/data-analysis/melt-unpivot-python-pandas.html
Upvotes: 0
Reputation: 5032
You can use melt to achieve this -
>>> import pandas as pd
>>> d = ['02/21','03/21','04/21']
>>> c1 = [3,2,5]
>>> c2 = [4,5,4]
>>> c3 = [2,7,2]
>>> df = pd.DataFrame({'Date':d,'Col1':c1,'Col2':c2,'Col3':c3})
>>> df
Date Col1 Col2 Col3
0 02/21 3 4 2
1 03/21 2 5 7
2 04/21 5 4 2
>>> pd.melt(df,id_vars=['Date'],value_vars=['Col1','Col2','Col3']).sort_values(by=['Date','variable'])
Date variable value
0 02/21 Col1 3
3 02/21 Col2 4
6 02/21 Col3 2
1 03/21 Col1 2
4 03/21 Col2 5
7 03/21 Col3 7
2 04/21 Col1 5
5 04/21 Col2 4
8 04/21 Col3 2
>>>
Upvotes: 1