Reputation: 433
Why does str.startswith() not handle regular expressions? Given this column:
col1
0 country
1 Country
For example, df.col1.str.startswith('(C|c)ountry') returns False for every value:
col1
0 False
1 False
Upvotes: 13
Views: 18163
Reputation: 1
Series.str.startswith can also receive a tuple like this:
df.col1.str.startswith(("Country","country"))
Every element of the tuple is tried as a prefix, so you can read the tuple as an OR operator within Series.str.startswith.
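To illustrate the OR reading, here is a minimal sketch using only the built-in str.startswith, which the pandas method is modelled on. The sample strings are made up for illustration. Whether Series.str.startswith itself accepts a tuple depends on the installed pandas version (as far as I know, support arrived in 1.5), so check your version before relying on it.

```python
# Plain-Python sketch; the sample values are hypothetical.
values = ["country", "Country", "county"]

# A regex-looking pattern is compared character for character,
# so none of the values start with it.
literal = [v.startswith("(C|c)ountry") for v in values]

# A tuple matches when the string starts with any one of its elements.
any_prefix = [v.startswith(("Country", "country")) for v in values]

print(literal)     # [False, False, False]
print(any_prefix)  # [True, True, False]
```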
Upvotes: 0
Reputation: 114440
Series.str.startswith does not accept regex because it is intended to behave similarly to str.startswith in vanilla Python, which does not accept regex. The alternative is to use a regex match (as explained in the docs):
df.col1.str.contains('^[Cc]ountry')
The character class [Cc] is probably a better way to match C or c than (C|c), unless of course you need to capture which letter is used. In this case you can do ([Cc]).
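A small sketch of why the ^ anchor matters here, assuming a DataFrame shaped like the one in the question. The rows "my country" and "county" are extra rows added only for illustration.

```python
import pandas as pd

# Data modelled on the question; the last two rows are hypothetical additions.
df = pd.DataFrame({"col1": ["country", "Country", "my country", "county"]})

# startswith compares literal text, so the regex-looking pattern matches nothing.
literal = df.col1.str.startswith("(C|c)ountry")

# contains does a regex search by default; "^" pins the match to the start.
anchored = df.col1.str.contains("^[Cc]ountry")

# Without "^" the pattern may match anywhere, so "my country" is also True.
unanchored = df.col1.str.contains("[Cc]ountry")

print(literal.tolist())     # [False, False, False, False]
print(anchored.tolist())    # [True, True, False, False]
print(unanchored.tolist())  # [True, True, True, False]
```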
Upvotes: 27
Reputation: 13913
Series.str.startswith does not accept regexes. Use Series.str.match instead (old pandas versions also needed as_indexer=True to get a boolean result; that argument has since been removed):
df.col1.str.match(r'(C|c)ountry')
Output:
0 True
1 True
Name: col1, dtype: bool
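Note that str.match only tests the beginning of each string, so no ^ is needed. A quick sketch with a few extra made-up values, plus str.fullmatch (available in pandas 1.1 and later) for the case where the whole string has to match:

```python
import pandas as pd

# "Countryside" and "my country" are hypothetical values added for illustration.
s = pd.Series(["country", "Country", "Countryside", "my country"], name="col1")

# match applies the pattern at the start of each string (like re.match).
starts = s.str.match(r"[Cc]ountry")

# fullmatch requires the entire string to match the pattern.
whole = s.str.fullmatch(r"[Cc]ountry")

print(starts.tolist())  # [True, True, True, False]
print(whole.tolist())   # [True, True, False, False]
```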
Upvotes: 8