Reputation: 7458
I am trying to create a boolean column using GroupBy.transform on a df like this:
id type
1 1.00000
1 1.00000
2 2.00000
2 3.00000
3 2.00000
The code is:
df['has_two'] = df.groupby('id')['type'].transform(lambda x: x == 2)
but instead of boolean values, has_two has float values, e.g. 0.0. I am wondering why that is.
UPDATE
I created a test case,
df = pd.DataFrame({'id':['1', '1', '2', '2', '3'], 'type':[1.0, 1.0, 2.0, 1.0, 2.0]})
df['has_2'] = df.groupby('id')['type'].transform(lambda x: x == 2)
this gave me,
id type has_2
0 1 1.0 0.0
1 1 1.0 0.0
2 2 2.0 1.0
3 2 1.0 0.0
4 3 2.0 1.0
If I use df['has_2'] = df['type'] == 2, as suggested by jezrael, it is fine:
id type has_2
0 1 1.0 False
1 1 1.0 False
2 2 2.0 True
3 2 1.0 False
4 3 2.0 True
I am using pandas==0.20.3 on Python 3.5.2. I am wondering what's going on; do I need to update pandas or Python 3?
UPDATE
Updating pandas to 0.22.0 fixed this issue.
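For anyone who cannot upgrade right away, here is a minimal sketch of a version-independent workaround, using the test case above. It checks the dtype that transform returns and casts the result explicitly. The variable name result is only for illustration, and I am assuming the float output is always 0.0 or 1.0, as in the output shown above.

```python
import pandas as pd

df = pd.DataFrame({'id': ['1', '1', '2', '2', '3'],
                   'type': [1.0, 1.0, 2.0, 1.0, 2.0]})

# Apply the same comparison per group as in the question.
result = df.groupby('id')['type'].transform(lambda x: x == 2)

# Inspect what came back: the question reports floats on pandas 0.20.3
# and booleans after upgrading to 0.22.0.
print(pd.__version__, result.dtype)

# Casting is harmless if the result is already boolean, and maps
# 0.0 -> False and 1.0 -> True if it came back as float.
df['has_2'] = result.astype(bool)
print(df)
```

The cast is a no-op when transform already returns booleans, so the same line should be safe before and after an upgrade.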
Upvotes: 3
Views: 1549
Reputation: 200
Use this line, which casts the result of transform back to boolean:
df['has_two'] = df.groupby('id')['type'].transform(lambda x: x == 2).astype(bool)
Worked for me :)
Upvotes: 0
Reputation: 863291
For me it works fine; I get a boolean column:
df['has_two'] = df.groupby('id')['type'].transform(lambda x: x == 2)
print (df)
id type has_two
0 1 1.0 False
1 1 1.0 False
2 2 2.0 True
3 2 3.0 False
4 3 2.0 True
But it may be simpler to compare the column directly, without groupby:
df['has_two'] = df['type'] == 2
print (df)
id type has_two
0 1 1.0 False
1 1 1.0 False
2 2 2.0 True
3 2 3.0 False
4 3 2.0 True
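To illustrate why the groupby is not needed here, a small sketch on the test case from the question. The comparison is element-wise, so grouping does not change the result. The second part is an assumption on my side, not something the question states: if has_two was meant to flag every row of an id that contains a 2 anywhere, a grouped reduction such as transform('any') would be one way to get it. The names row_is_two and group_has_two are just for illustration.

```python
import pandas as pd

df = pd.DataFrame({'id': ['1', '1', '2', '2', '3'],
                   'type': [1.0, 1.0, 2.0, 1.0, 2.0]})

# Element-wise comparison: no grouping involved, result is boolean.
row_is_two = df['type'] == 2

# Assumed alternative intent (not stated in the question):
# True for every row whose id group contains at least one 2.
group_has_two = row_is_two.groupby(df['id']).transform('any')

print(df.assign(row_is_two=row_is_two, group_has_two=group_has_two))
```

Only the second variant actually depends on the groups; for the row-level check the plain comparison is enough.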
Upvotes: 2