Reputation:
This is a simple question, but I don't know why my comparison does not work correctly.
df:
A,B
1,marta
2,adam1
3,kama
4,mike
I want to print 'exist' if a specific name exists in df.
For example, I want to check whether marta exists in df['B'].
Code:
string = 'www\\marta2'
if df['B'].str.contains(string, regex=False).all() == True:
    print('exist')
else:
    print('not exist')
When I use .bool() instead of all(), I get this error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
I get False for every row. Why? Should I compare this kind of string in some different way?
EDIT:
I need to use an if statement because my real code assigns variables instead of calling print; otherwise I would do this differently.
With string='marta' it works well, but not with the longer string.
EDIT:
new code:
string = 'www\\marta2'
if df['B'].str.rfind(string).any():
    print('exist')
else:
    print('not exist')
But this seems to match everything, so it prints 'exist' even if only one letter is in the column.
Upvotes: 0
Views: 8420
Reputation:
Answer to my own question:
To get a single answer on whether the string exists in the column, a good way is df['B'].str.contains() combined with any(). str.contains checks whether the whole search string occurs inside each value, and 'www\marta2' is longer than any name in the column, which is why my first code does not work.
A second way is rfind, but in my case it is always true: rfind returns -1 when nothing is found, and -1 counts as true in any().
The answer is to trim the string I am comparing so that it gives the expected result:
string = 'www\\marta2'
# keep the part after the last backslash, then its first five characters ('marta')
new_string = string.split('\\')[-1][0:5]
if df['B'].str.contains(new_string, regex=False).any():
    print('exist')
else:
    print('not exist')
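To make the reasoning above easier to check, here is a small sketch that runs the three variants side by side. It assumes the df from the question and pandas imported as pd; the variable names (positions, mask and so on) are only illustrative.

```python
import pandas as pd

# DataFrame as given in the question
df = pd.DataFrame({"A": [1, 2, 3, 4], "B": ["marta", "adam1", "kama", "mike"]})
string = "www\\marta2"

# rfind gives -1 for every row because the long string is not inside any name.
# -1 is non-zero, so any() treats it as True.
positions = df["B"].str.rfind(string)
rfind_any = bool(positions.any())

# contains asks "is `string` inside the cell?", which never holds here
contains_any = bool(df["B"].str.contains(string, regex=False).any())

# After trimming, the search string is 'marta' and one row matches
new_string = string.split("\\")[-1][0:5]
mask = df["B"].str.contains(new_string, regex=False)
trimmed_any = bool(mask.any())   # at least one row matches
trimmed_all = bool(mask.all())   # not every row matches

print(rfind_any, contains_any, trimmed_any, trimmed_all)
```

Note that the [0:5] slice only works because the wanted name happens to be five characters long; for other names the trimming rule would have to change.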
Upvotes: 0
Reputation: 896
Maybe this will help you:
>>> for b in df["B"].values:
...     if string.rfind(b) != -1:
...         print("exists")
...         break
...
The for loop iterates over df["B"].values, which returns the values of column B as an array. Once you have the array, you can loop through it and get the output.
In the if condition, I check each value of column B against the string. rfind() returns the position of a matching substring, or -1 if there is none, so it detects partial matches.
That does the magic.
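The same reversed check (is a column value inside the long string?) can probably be written without an explicit loop. This is only a sketch, assuming the df and string from the question; found_in_string, result and matches are made-up names, and dropna() is added on the assumption that the column might contain missing values.

```python
import pandas as pd

# DataFrame and search string as given in the question
df = pd.DataFrame({"A": [1, 2, 3, 4], "B": ["marta", "adam1", "kama", "mike"]})
string = "www\\marta2"

# True for each row whose B value occurs somewhere inside `string`
names = df["B"].dropna()
found_in_string = names.map(lambda b: b in string)

# A plain if statement, so variables can be assigned in either branch
if found_in_string.any():
    result = "exist"
else:
    result = "not exist"
print(result)

# Which names were found inside the string
matches = names[found_in_string].tolist()
print(matches)
```

Unlike the slicing approach, this does not need to know the length of the name in advance.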
Upvotes: 0
Reputation: 71
If you want to check whether the string exists anywhere in the df, use any() instead of all().
If you want to check whether the string exists in each row, you can create a new column and skip the if statement:
df.loc[df['B'].str.contains(string,regex=False), 'C'] = 'exist'
df.loc[~(df['B'].str.contains(string,regex=False)), 'C'] = 'not exist'
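The two assignments can likely be combined into one step with numpy.where. A minimal sketch, assuming the df from the question and, for illustration only, the search string 'marta' so that one row actually matches:

```python
import numpy as np
import pandas as pd

# DataFrame as given in the question
df = pd.DataFrame({"A": [1, 2, 3, 4], "B": ["marta", "adam1", "kama", "mike"]})
string = "marta"  # assumed search string for this example

# Compute the boolean mask once and reuse it
mask = df["B"].str.contains(string, regex=False)

# 'exist' where the mask is True, 'not exist' elsewhere
df["C"] = np.where(mask, "exist", "not exist")
print(df)
```

Computing the mask once also avoids running str.contains twice over the column.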
EDIT: I tried this, and it works as long as the string is exactly what you're looking for.
string = 'www\\marta2'
if df['B'].str.contains(string, regex=False).any():
    print('exist')
else:
    print('not exist')
Upvotes: 1