Reputation: 1518
I have a DataFrame that looks like this:
import pandas as pd
df = pd.DataFrame({'col1': [i+1 for i in range(10)] + [-i-1 for i in range(10)],
                   'col2': ['random string'] * 20})
print(df)
col1 col2
0 1 random string
1 2 random string
2 3 random string
3 4 random string
4 5 random string
5 6 random string
6 7 random string
7 8 random string
8 9 random string
9 10 random string
10 -1 random string
11 -2 random string
12 -3 random string
13 -4 random string
14 -5 random string
15 -6 random string
16 -7 random string
17 -8 random string
18 -9 random string
19 -10 random string
and I want to make it look like this:
col1 col2
0 1 random string
1 -1 random string
2 2 random string
3 -2 random string
4 3 random string
5 -3 random string
6 4 random string
7 -4 random string
8 5 random string
9 -5 random string
10 6 random string
11 -6 random string
12 7 random string
13 -7 random string
14 8 random string
15 -8 random string
16 9 random string
17 -9 random string
18 10 random string
19 -10 random string
My own approach takes quite a few lines and doesn't feel very Pythonic. My code:
df2 = pd.DataFrame(index = df.index,columns = df.columns)
Ypos = df[df['col1'] > 0]
Yneg = df[df['col1'] < 0]
ind_pos = [2*i for i in range(10)]
ind_neg = [2*i+1 for i in range(10)]
df2.loc[ind_pos] = Ypos.rename({k:v for k,v in zip(Ypos.index, ind_pos)})
df2.loc[ind_neg] = Yneg.rename({k:v for k,v in zip(Yneg.index, ind_neg)})
print(df2)
Is there a more Pythonic way to accomplish the same result? Thank you in advance.
EDIT: I'd like a more general method that also handles a DataFrame like this, where the groups cannot be told apart by sign:
col1 col2
0 1 random string
1 2 random string
2 3 random string
3 4 random string
4 5 random string
5 1x random string
6 2x random string
7 3x random string
8 4x random string
9 5x random string
10 1y random string
11 2y random string
12 3y random string
13 4y random string
14 5y random string
Upvotes: 1
Views: 80
Reputation: 59549
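If the size of the subgroups is known (call it n) and your DataFrame is chunked, with each group following the other, we just need some math. The integer part of the new index is a row's position within its group, and the fractional part orders the groups, so sorting by it interleaves them. This assumes the default 0..len(df)-1 index: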
n=5
df.index = df.index%n + (df.index//n)/(len(df)/n)
df = df.sort_index().reset_index(drop=True)
col1 col2
0 1 random_string
1 1x random_string
2 1y random_string
3 2 random_string
4 2x random_string
5 2y random_string
6 3 random_string
7 3x random_string
8 3y random_string
9 4 random_string
10 4x random_string
11 4y random_string
12 5 random_string
13 5x random_string
14 5y random_string
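To make the arithmetic concrete, here is a minimal sketch that builds the same sort key as a separate array instead of overwriting the index. The sample frame, the `groups` and `key` names, and the use of NumPy's `argsort` are illustrative assumptions, not part of the answer above; it also assumes every group has exactly n rows.

```python
import numpy as np
import pandas as pd

n = 5  # rows per group (assumed known, and equal for every group)

# Illustrative frame: '1'..'5', then '1x'..'5x', then '1y'..'5y'
df = pd.DataFrame({
    'col1': [f'{i}{s}' for s in ('', 'x', 'y') for i in range(1, n + 1)],
    'col2': ['random string'] * (3 * n),
})

pos = np.arange(len(df))   # row positions 0..14
groups = len(df) // n      # number of groups (3 here)

# Integer part: position within the group; fractional part: group number / groups
key = pos % n + (pos // n) / groups

# Reorder rows by the key and renumber the index
interleaved = df.iloc[np.argsort(key, kind='stable')].reset_index(drop=True)
print(interleaved['col1'].tolist())
```

For row 5 (the first `x` row) the key is 0 + 1/3, which places it between row 0 (key 0) and row 10 (key 0 + 2/3), ahead of row 1 (key 1).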
Upvotes: 1
Reputation: 323236
Create a helper key with abs, sort on it, then drop the key:
newdf = df.assign(key=df.col1.abs()).sort_values('key').drop(columns='key')
newdf
Out[60]:
col1 col2
0 1 random string
10 -1 random string
1 2 random string
11 -2 random string
2 3 random string
12 -3 random string
3 4 random string
13 -4 random string
4 5 random string
14 -5 random string
5 6 random string
15 -6 random string
6 7 random string
16 -7 random string
17 -8 random string
7 8 random string
18 -9 random string
8 9 random string
9 10 random string
19 -10 random string
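Note that in the output above a few tied pairs come out negative-first (for example -8 before 8), because the default sort algorithm does not guarantee the order of equal keys. A minimal sketch of one way to keep each positive value ahead of its negative counterpart, assuming the input layout from the question, is to request a stable sort and then reset the index:

```python
import pandas as pd

df = pd.DataFrame({'col1': [i + 1 for i in range(10)] + [-i - 1 for i in range(10)],
                   'col2': ['random string'] * 20})

newdf = (
    df.assign(key=df['col1'].abs())           # helper column: |col1|
      .sort_values('key', kind='mergesort')   # mergesort is stable, so ties keep input order
      .drop(columns='key')                    # helper column no longer needed
      .reset_index(drop=True)                 # renumber rows 0..19
)
print(newdf['col1'].tolist())
```

Since the positive rows come first in the original frame, a stable sort leaves each positive value directly before the negative value with the same magnitude.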
Upvotes: 2