I have this dataframe:
import pandas as pd
s = pd.DataFrame({'A': [*'1112222'], 'B': [*'abcdefg'], 'C': [*'ABCDEFG']})
that is like this:
   A  B  C
0  1  a  A
1  1  b  B
2  1  c  C
3  2  d  D
4  2  e  E
5  2  f  F
6  2  g  G
I want to do a groupby like this:
groups = s.groupby("A")
For example, group 2 is:
g2 = groups.get_group("2")
that looks like this:
   A  B  C
3  2  d  D
4  2  e  E
5  2  f  F
6  2  g  G
Anyway, I want to do some operation in each group.
Let me show how my final result should be:
   A  B  C        D
1  1  b  B  a=b;A=B
2  1  c  C  a=c;A=C
4  2  e  E  d=e;D=E
5  2  f  F  d=f;D=F
6  2  g  G  d=g;D=G
That is, I drop the first row of each group, but combine its values with each of the remaining rows of the group to create column D.
In short: I want to group by A, drop the first row of each group, and add a column D built from the first row of the group and the current row.
What I have tried:
To solve this, I wrote a function:
def func(g):
    first_row_of_group = g.iloc[0]
    g = g.iloc[1:]
    g["C"] = g.apply(lambda row: ";".join([f'{a}={b}' for a, b in zip(row, first_row_of_group)]))
    return g
Then I apply it to the groups:
groups.apply(lambda g: func(g))
Upvotes: 1
You can apply a custom function to each group that combines the values of the first row with each of the remaining rows and then drops the first row:
def remove_first(x):
    # values of the first row of the group
    first = x.iloc[0]
    # drop the first row; copy so the assignment below does not write to a slice
    x = x.iloc[1:].copy()
    x['D'] = first['B'] + '=' + x['B'] + ';' + first['C'] + '=' + x['C']
    # an equivalent operation
    # x['D'] = first.iloc[1] + '=' + x.iloc[:,1] + ';' + first.iloc[2] + '=' + x.iloc[:,2]
    return x
# droplevel(0) removes the group key that apply adds to the index
s = s.groupby('A').apply(remove_first).droplevel(0)
Output:
   A  B  C        D
1  1  b  B  a=b;A=B
2  1  c  C  a=c;A=C
4  2  e  E  d=e;D=E
5  2  f  F  d=f;D=F
6  2  g  G  d=g;D=G
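The same output can probably also be produced without apply. The following is only a sketch, not part of the original answer: it assumes the frame from the question, and the names cols, first, pairs, keep and result are made up for illustration. It uses transform('first') to broadcast the first row of each group and cumcount() to find the rows to keep. Note that 'first' picks the first non-null value per group, so this assumes B and C contain no missing values.

```python
import pandas as pd

s = pd.DataFrame({'A': [*'1112222'], 'B': [*'abcdefg'], 'C': [*'ABCDEFG']})

# columns to combine with the first row of each group (illustrative name)
cols = ['B', 'C']

# first value of each group, repeated for every row of that group
first = s.groupby('A')[cols].transform('first')

# one "first=current" Series per column, joined with ';'
pairs = [first[c] + '=' + s[c] for c in cols]
d = pairs[0]
for p in pairs[1:]:
    d = d + ';' + p

# cumcount() is 0 for the first row of each group, so keep the rest
keep = s.groupby('A').cumcount() > 0
result = s.assign(D=d)[keep]
print(result)
```

Because nothing is applied per group, the original index is kept and no droplevel is needed; listing more column names in cols extends D in the same way.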
Note: The dataframe shown in your question is constructed from
s = pd.DataFrame({'A': [*'1112222'], 'B': [*'abcdefg'], 'C': [*'ABCDEFG']})
but you give a different one as raw input.
Upvotes: 1