Reputation: 10021
Given a small dataset as follows:
id a b
0 1 lol lolec
1 2 rambo ram
2 3 ki pio
3 4 iloc loc
4 5 strip rstrip
5 6 lambda lambda
I would like to create a new column c based on the following criterion: if a is equal to b or is a substring of b, or vice versa, set c to 1; otherwise set it to 0.
How could I do that in Pandas or Python?
The expected result:
id a b c
0 1 lol lolec 1
1 2 rambo ram 1
2 3 ki pio 0
3 4 iloc loc 1
4 5 strip rstrip 1
5 6 lambda lambda 1
To check whether a is in b or b is in a, we can use:
df.apply(lambda x: x.a in x.b, axis=1)
df.apply(lambda x: x.b in x.a, axis=1)
Upvotes: 0
Views: 69
Reputation: 214957
Use zip and a list comprehension:
df['c'] = [int(a in b or b in a) for a, b in zip(df.a, df.b)]
df
id a b c
0 1 lol lolec 1
1 2 rambo ram 1
2 3 ki pio 0
3 4 iloc loc 1
4 5 strip rstrip 1
5 6 lambda lambda 1
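For anyone who wants to reproduce this, here is a minimal self-contained sketch. The DataFrame construction is my own reconstruction of the sample data from the question; it assumes both columns hold plain strings with no missing values.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed: all values are strings, no NaN)
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6],
    "a": ["lol", "rambo", "ki", "iloc", "strip", "lambda"],
    "b": ["lolec", "ram", "pio", "loc", "rstrip", "lambda"],
})

# Pair up the two columns row by row; `a in b` is Python's substring test,
# and it is also True when the two strings are equal
df["c"] = [int(a in b or b in a) for a, b in zip(df["a"], df["b"])]

print(df)
```

The substring test is case-sensitive, so lowercase both columns first if "Ram" should match "rambo".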
Or use apply and combine both conditions with or:
df['c'] = df.apply(lambda r: int(r.a in r.b or r.b in r.a), axis=1)
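As a rough sketch of the apply variant with the check pulled out into a named function: mutual_substring is a hypothetical helper name, not part of pandas, and the sample data is rebuilt from the question on the assumption that both columns contain only strings.

```python
import pandas as pd

def mutual_substring(row):
    """Return 1 if either string contains the other (or they are equal), else 0.

    Hypothetical helper for illustration; assumes row["a"] and row["b"] are strings.
    """
    return int(row["a"] in row["b"] or row["b"] in row["a"])

# Sample data rebuilt from the question
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6],
    "a": ["lol", "rambo", "ki", "iloc", "strip", "lambda"],
    "b": ["lolec", "ram", "pio", "loc", "rstrip", "lambda"],
})

# axis=1 passes each row to the function as a Series
df["c"] = df.apply(mutual_substring, axis=1)

print(df)
```

Row-wise apply builds a Series for every row, so on large frames the zip version is usually the faster of the two.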
Upvotes: 6