Reputation: 433

str.startswith using Regex

Why doesn't str.startswith() work with a regex? Given this column:

   col1
0  country
1  Country

For example: df.col1.str.startswith('(C|c)ountry')

It returns False for every value:

   col1
0  False
1  False

Upvotes: 13

Answers (3)

Patrick Reichert

Reputation: 1

Series.str.startswith can also take a tuple of prefixes (at least in recent pandas versions), like this:

df.col1.str.startswith(("Country","country"))

Each element of the tuple is tried as a literal prefix, so the tuple acts like an OR: a value matches if it starts with any of them.
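A minimal sketch of the tuple form, assuming a pandas version whose Series.str.startswith accepts a tuple (older releases may not). The sample values, including the non-matching "County", are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; "County" is included as a near miss.
df = pd.DataFrame({"col1": ["country", "Country", "County"]})

# Each tuple element is a literal prefix; a row matches if it starts with any of them.
mask = df.col1.str.startswith(("Country", "country"))

print(mask.tolist())  # [True, True, False]

# The boolean mask can be used directly to filter rows.
print(df[mask])
```

This only covers a fixed set of literal prefixes; anything that needs a real pattern still calls for a regex-aware method.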

Upvotes: 0

Mad Physicist

Reputation: 114440

Series.str.startswith does not accept regex because it is intended to behave like str.startswith in vanilla Python, which does not accept regex either. The alternative is a regex search anchored to the start of the string with ^ (as explained in the docs):

df.col1.str.contains('^[Cc]ountry')

The character class [Cc] is probably a better way to match C or c than (C|c), unless of course you need to capture which letter was used. In that case, you can use ([Cc]).
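To see why the ^ anchor matters, here is a small sketch comparing the literal startswith call from the question with anchored and unanchored str.contains. The third row ("my country") is a made-up value that contains the word without starting with it.

```python
import pandas as pd

# Hypothetical sample data; the third row contains the word but does not start with it.
df = pd.DataFrame({"col1": ["country", "Country", "my country"]})

# startswith treats its argument as a literal string, so nothing matches.
literal = df.col1.str.startswith("(C|c)ountry")

# contains interprets the pattern as a regex by default; ^ anchors it at the start.
anchored = df.col1.str.contains(r"^[Cc]ountry")

# Without the anchor, contains matches anywhere in the string.
unanchored = df.col1.str.contains(r"[Cc]ountry")

print(literal.tolist())     # [False, False, False]
print(anchored.tolist())    # [True, True, False]
print(unanchored.tolist())  # [True, True, True]
```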

Upvotes: 27

Alicia Garcia-Raboso

Reputation: 13913

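Series.str.startswith does not accept regexes. Use Series.str.match instead, which only matches at the beginning of each string. (Older pandas versions needed as_indexer=True here to get a boolean result; that argument has since been deprecated and removed, so current versions should omit it.)

df.col1.str.match(r'(C|c)ountry')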

Output:

0    True
1    True
Name: col1, dtype: bool
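As a rough illustration of the anchoring behaviour, the sketch below runs str.match on a few made-up values and contrasts it with str.fullmatch, which is assumed to be available (it exists in reasonably recent pandas releases) and requires the whole string to match.

```python
import pandas as pd

# Hypothetical sample values chosen to show where each method matches.
s = pd.Series(["country", "Country", "Countryside", "my country"], name="col1")

# match is anchored at the start only, so "Countryside" still matches.
starts = s.str.match(r"[Cc]ountry")

# fullmatch requires the entire string to match the pattern.
whole = s.str.fullmatch(r"[Cc]ountry")

print(starts.tolist())  # [True, True, True, False]
print(whole.tolist())   # [True, True, False, False]
```

So str.match is the closer analogue of a regex-aware startswith, while str.fullmatch is the stricter choice when trailing text should not be allowed.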

Upvotes: 8
