Jay

Reputation: 150

Difference between pandas.Series.str.match and pandas.Series.str.contains

What's the difference between pandas.Series.str.contains and pandas.Series.str.match? Why do they give different results in the case below?

import pandas as pd

s1 = pd.Series(['house and parrot'])
s1.str.contains(r"\bparrot\b", case=False)

I got True, but when I do

s1.str.match(r"\bparrot\b", case=False)

I got False. Why is that the case?

Upvotes: 8

Views: 13615

Answers (1)

Mack123456

Reputation: 386

The documentation for str.contains() states:

Test if pattern or regex is contained within a string of a Series or Index.

The documentation for str.match() states:

Determine if each string matches a regular expression.

The difference between these two methods is that str.contains() uses re.search, while str.match() uses re.match.

As per the documentation of re.match():

If zero or more characters at the beginning of string match the regular expression pattern, return a corresponding match object. Return None if the string does not match the pattern; note that this is different from a zero-length match.
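The quoted behaviour can be checked with the re module alone. This is a minimal sketch using the string and pattern from the question; the variable names are only illustrative.

```python
import re

text = "house and parrot"
pattern = r"\bparrot\b"

# re.search scans the whole string for the first location that matches.
found_anywhere = re.search(pattern, text, flags=re.IGNORECASE)

# re.match only tries to match at position 0 of the string.
found_at_start = re.match(pattern, text, flags=re.IGNORECASE)

# "house" sits at position 0, so re.match succeeds for it.
house_at_start = re.match(r"\bhouse\b", text, flags=re.IGNORECASE)

print(found_anywhere is not None)  # True
print(found_at_start is not None)  # False
print(house_at_start is not None)  # True
```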

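So re.match only succeeds if the pattern matches starting at the first character of the string. parrot does not appear at the beginning of 'house and parrot', so your expression returns False. house does appear at the beginning, so s1.str.match(r"\bhouse\b", case=False) would return True.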
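The same comparison in pandas, as a rough sketch built on the Series from the question. The last line is one possible workaround if you want to keep using str.match: prefixing the pattern with .* lets it consume everything before parrot. The variable names are made up for illustration.

```python
import pandas as pd

s1 = pd.Series(["house and parrot"])

# str.contains searches anywhere in each string.
contains_parrot = s1.str.contains(r"\bparrot\b", case=False)

# str.match anchors the pattern at the start of each string.
match_parrot = s1.str.match(r"\bparrot\b", case=False)
match_house = s1.str.match(r"\bhouse\b", case=False)

# Allowing any leading characters makes str.match succeed as well.
match_prefixed = s1.str.match(r".*\bparrot\b", case=False)

print(contains_parrot.iloc[0])  # True
print(match_parrot.iloc[0])     # False
print(match_house.iloc[0])      # True
print(match_prefixed.iloc[0])   # True
```

If the goal is simply to test whether the word occurs anywhere in the string, str.contains is the more direct choice.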

Upvotes: 12
