Reputation:
I am trying to sort timestamps
across multiple columns
in a pandas
df
into a single time ordered column
.
So for the df below I'd like to combine these to create one column
import pandas as pd
d = ({
'' : ['Bar','Foo','Fubar'],
'A' : ['8:00','8:29','8:58'],
'B' : ['8:30','8:59','9:28'],
'C' : ['9:00','9:29','10:00'],
})
df = pd.DataFrame(data=d)
Output:
A B C
0 Bar 8:00 8:30 9:00
1 Foo 8:29 8:59 9:29
2 Fubar 8:58 9:28 10:00
Intended Output:
1 2 3
0 Bar 1 8:00
1 Foo 1 8:29
2 Bar 2 8:30
3 Fubar 1 8:58
4 Foo 2 8:59
5 Bar 3 9:00
6 Fubar 2 9:28
7 Foo 3 9:29
8 Fubar 3 10:00
I can sort them via df = df.sort_values(by='1',ascending=True)
but I need to merge them somehow. I have tried;
df = df.sum(axis=1)
I've also tried similar joining methods but the result is always
0 Bar8:008:309:00
1 Foo8:298:599:29
2 Fubar8:589:2810:00
Update:
Using @Wen's code I get the following output
df.columns=['',1,2,3]
df = pd.melt(df, '')
df = df.sort_values(by='value',ascending=True)
variable value
8 Fubar 3 10:00 #All ordered except for the first row?
0 Bar 1 8:00
1 Foo 1 8:29
3 Bar 2 8:30
2 Fubar 1 8:58
4 Foo 2 8:59
6 Bar 3 9:00
5 Fubar 2 9:28
7 Foo 3 9:29
It's all ordered except for the 1st row?
Upvotes: 2
Views: 209
Reputation: 71570
Try this:
newdf = pd.DataFrame(np.repeat(df.T.values,3,axis=1)).T
newdf.columns=df.columns
newdf['new']=list(set(df['A']))+list(set(df['B']))+list(set(df['C']))
newdf['']=newdf[''][::3].tolist()*3
newdf['n']=sorted([1,2,3]*3)
newdf=newdf[['','n','new']]
print(newdf)
Output:
n new
0 Bar 1 8:00
1 Foo 1 8:29
2 Fubar 1 8:58
3 Bar 2 8:59
4 Foo 2 8:30
5 Fubar 2 9:28
6 Bar 3 9:00
7 Foo 3 10:00
8 Fubar 3 9:29
Upvotes: 0
Reputation: 323226
IIUC
df.columns=['',1,2,3]
df.melt('')
Out[99]:
variable value
0 Bar 1 8:00
1 Foo 1 8:29
2 Fubar 1 8:58
3 Bar 2 8:30
4 Foo 2 8:59
5 Fubar 2 9:28
6 Bar 3 9:00
7 Foo 3 9:29
8 Fubar 3 10:00
Upvotes: 1