Reputation: 1521
I have a string stored in a dataframe column:
import pandas as pd
df = pd.DataFrame({"ID": 1, "content": "froyay-xcd = (E)-cut-2-froyay-xcd"}, index=[0])
print(df)
idx = df[df['content'].str.contains("froyay-xcd = (E)-cut-2-froyay-xcd")]
print(idx)
I'm trying to find the index of the row that contains a search string, but the following warning occurs:
UserWarning: This pattern has match groups. To actually get the groups, use str.extract.
return func(self, *args, **kwargs)
I'm not sure why an empty dataframe is returned when the search string actually is present in the dataframe columns.
Any suggestions will be highly appreciated. I expect the output to return the row stored in the dataframe.
Upvotes: 2
Views: 48
Reputation: 7693
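You can put a \ before ( and ) so they are matched literally, then select the matching rows and read their labels with .index. Calling .index directly on the result of str.contains would return the labels of every row, not only the matches.
df[df.content.str.contains(r"froyay-xcd = \(E\)-cut-2-froyay-xcd")].index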
Int64Index([0], dtype='int64')
If the string contains more regex special characters, it is better to use regex=False, as @jezrael said.
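To see why the unescaped pattern finds nothing, here is a minimal sketch using only the standard re module (the variable names are illustrative, not from the question). It assumes the same search string as above and shows that (E) is read as a group matching a bare E, so the literal parentheses in the text never match:

```python
import re

text = "froyay-xcd = (E)-cut-2-froyay-xcd"

# Unescaped: "(E)" is a capture group that matches just "E", so the pattern
# effectively looks for "froyay-xcd = E-cut-2-froyay-xcd", which is absent.
unescaped_match = re.search(text, text)

# Escaped: re.escape puts a backslash before "(" and ")" (and any other
# special characters), so they are matched literally.
escaped_match = re.search(re.escape(text), text)

print(unescaped_match)            # None
print(escaped_match is not None)  # True
```

The same group is what triggers the "This pattern has match groups" warning in str.contains.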
Upvotes: 1
Reputation: 862651
You can add the regex=False parameter so the search string is treated as a literal string instead of a regex; here ( and ) are special regex characters:
idx = df[df['content'].str.contains("froyay-xcd = (E)-cut-2-froyay-xcd", regex=False)]
print(idx)
   ID                            content
0   1  froyay-xcd = (E)-cut-2-froyay-xcd
Or you can escape the regex special characters with re.escape:
import re
idx = df[df['content'].str.contains(re.escape("froyay-xcd = (E)-cut-2-froyay-xcd"))]
print(idx)
   ID                            content
0   1  froyay-xcd = (E)-cut-2-froyay-xcd
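Since the question asks for the index of the matching row rather than the row itself, a small sketch of that last step may help. It assumes the regex=False approach and adds a second, hypothetical row ("something else") only to show that non-matching rows are excluded:

```python
import pandas as pd

# The second row is a made-up example so the filter has something to exclude.
df = pd.DataFrame(
    {
        "ID": [1, 2],
        "content": ["froyay-xcd = (E)-cut-2-froyay-xcd", "something else"],
    }
)

needle = "froyay-xcd = (E)-cut-2-froyay-xcd"

# Boolean mask: True where the column contains the literal search string.
mask = df["content"].str.contains(needle, regex=False)

matching_rows = df[mask]                  # the rows themselves
matching_index = df.index[mask].tolist()  # only their index labels

print(matching_index)  # [0]
```

Keeping the mask in its own variable lets you reuse it for both the rows and the index labels.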
Upvotes: 1