Reputation: 3260
Consider the following:
import pandas as pd

word = 'analphabetic'
df = pd.DataFrame({'substring': list('abcdefgh') + ['ab', 'phobic']})
I want to add a column named after the value of word, where each row shows True/False depending on whether the substring in that row is contained in word. Can I do this with a built-in pandas method?
Desired output:
   substring  analphabetic
0          a          True
1          b          True
2          c          True
3          d         False
4          e          True
5          f         False
6          g         False
7          h          True
8         ab          True
9     phobic         False
The other way around can be done with something like df.substring.str.contains(word). I guess you could do something like:
df[word] = [i in word for i in df.substring]
But by that logic, the built-in function str.contains() could just as well be replaced by:
string = 'a'
df = pd.DataFrame({'words': ['these', 'are', 'some', 'random', 'words']})
df[string] = [string in i for i in df.words]
So I would expect there to be a built-in method for my case as well.
Upvotes: 1
Views: 133
Reputation: 31
Yes, you can use str.contains to find a substring in a pandas DataFrame column. You can also use Python's in operator, which tests membership and returns a Boolean (either True or False).
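As a rough illustration of how the two checks point in opposite directions, here is a minimal sketch using the frame from the question. The column names contains_word and in_word are made up for this example.

```python
import pandas as pd

word = 'analphabetic'
df = pd.DataFrame({'substring': list('abcdefgh') + ['ab', 'phobic']})

# str.contains asks: does each row's string contain `word`?
# regex=False makes pandas treat `word` as a literal string.
df['contains_word'] = df['substring'].str.contains(word, regex=False)

# The `in` operator asks the reverse: is each row's string inside `word`?
df['in_word'] = [s in word for s in df['substring']]

print(df)
```

Since none of the short strings contains the whole word, contains_word comes out all False here, while in_word gives the result the question asks for.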
Upvotes: 0
Reputation: 25333
A possible solution (which should work for substrings longer than a single letter):
df['analphabetic'] = df['substring'].map(lambda x: x in word)
Output:
   substring  analphabetic
0          a          True
1          b          True
2          c          True
3          d         False
4          e          True
5          f         False
6          g         False
7          h          True
Using list comprehension:
df['analphabetic'] = [x in word for x in df.substring]
Using apply:
df['analphabetic'] = df['substring'].apply(lambda x: x in word)
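To check that the three variants agree, a self-contained sketch along these lines may help. It rebuilds the frame from the question; the names via_map, via_list and via_apply are only for illustration.

```python
import pandas as pd

word = 'analphabetic'
df = pd.DataFrame({'substring': list('abcdefgh') + ['ab', 'phobic']})

# Three ways of computing the same membership test.
via_map = df['substring'].map(lambda x: x in word).tolist()
via_list = [x in word for x in df['substring']]
via_apply = df['substring'].apply(lambda x: x in word).tolist()

# All three should produce identical Booleans.
assert via_map == via_list == via_apply

# Use the value of `word` itself as the column name, as the question asks.
df[word] = via_list
print(df)
```

One caveat: x in word raises a TypeError when x is not a string (for example NaN), so a column with missing values would need them filled or filtered first.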
Upvotes: 1