Reputation: 2693
I'm attempting to select rows from a dataframe using the pandas str.contains()
function with a regular expression that contains a variable as shown below.
df = pd.DataFrame(["A test Case","Another Testing Case"], columns=list("A"))
variable = "test"
df[df["A"].str.contains(r'\b' + variable + '\b', regex=True, case=False)] #Returns nothing
While the above returns nothing, the following returns the appropriate row as expected:
df[df["A"].str.contains(r'\btest\b', regex=True, case=False)] #Returns values as expected
Any help would be appreciated.
Upvotes: 15
Views: 19600
Reputation: 146
The following command works for me:
df.query('text.str.contains(@variable)')
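Adapted to the question's DataFrame, whose only column is `A` rather than `text`, the same idea might look like the sketch below. The `pattern` name, the inline `(?i)` flag for case-insensitivity, and `engine="python"` are assumptions added for this illustration, not part of the original command.

```python
import pandas as pd

df = pd.DataFrame(["A test Case", "Another Testing Case"], columns=list("A"))
variable = "test"

# Build the whole-word, case-insensitive pattern outside the query string,
# keeping both \b escapes inside one raw f-string.
pattern = rf"(?i)\b{variable}\b"

# engine="python" is an assumption here: it has pandas evaluate the
# .str accessor itself instead of handing the expression to numexpr.
result = df.query("A.str.contains(@pattern)", engine="python")
print(result)
```

Passing the bare `variable` instead of `pattern` would match any substring, so "Another Testing Case" would not be excluded on word boundaries.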
Upvotes: 0
Reputation: 56
I had the exact same problem when passing a 'variable' to str.contains(variable).
Try using str.contains(variable, regex=False)
It worked for me perfectly.
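One caveat worth checking: `regex=False` does a plain substring match, so it does not enforce the word boundaries the question asks for. A minimal sketch on the question's data (the names `substring_mask` and `word_mask` are only for illustration):

```python
import pandas as pd

df = pd.DataFrame(["A test Case", "Another Testing Case"], columns=list("A"))
variable = "test"

# Plain substring match: "test" is also found inside "Testing".
substring_mask = df["A"].str.contains(variable, regex=False, case=False)

# Whole-word match: both \b escapes live inside one raw f-string.
word_mask = df["A"].str.contains(rf"\b{variable}\b", regex=True, case=False)

print(substring_mask.tolist())
print(word_mask.tolist())
```

So this works if any occurrence of the text is acceptable, but it selects both rows here rather than only the first.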
Upvotes: -1
Reputation: 402483
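Both word-boundary escapes must be inside raw strings. In your code the trailing '\b' is an ordinary string literal, so Python converts it to a backspace character (\x08) before the regex engine ever sees it. Why not use some sort of string formatting instead, so the whole pattern sits in one raw literal? String concatenation is generally discouraged.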
df[df["A"].str.contains(fr'\b{variable}\b', regex=True, case=False)]
# Or,
# df[df["A"].str.contains(r'\b{}\b'.format(variable), regex=True, case=False)]
             A
0  A test Case
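To see why the original concatenation failed, it may help to compare the two pattern strings directly with the standard `re` module. This is a small sketch; the `re.escape` call is an extra suggestion (not in the answer above) that assumes the variable should be matched literally even if it contains regex metacharacters.

```python
import re

variable = "test"

# The question's version: the trailing '\b' is NOT raw, so it is a backspace.
broken = r'\b' + variable + '\b'
# The fixed version: both \b escapes are inside one raw f-string.
fixed = rf'\b{variable}\b'

print(repr(broken))
print(repr(fixed))

text = "A test Case"
matches_broken = re.search(broken, text, flags=re.IGNORECASE) is not None
matches_fixed = re.search(fixed, text, flags=re.IGNORECASE) is not None
print(matches_broken, matches_fixed)

# Assumption: the variable is meant as literal text, so escape it first.
safe = rf'\b{re.escape(variable)}\b'
```

The broken pattern only matches "test" followed by an actual backspace character, which is why the filter returned no rows.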
Upvotes: 23