Reputation: 150
What's the difference between pandas.Series.str.contains and pandas.Series.str.match? Why do they give different results in the case below?
import pandas as pd
s1 = pd.Series(['house and parrot'])
s1.str.contains(r"\bparrot\b", case=False)
I got True, but when I do
s1.str.match(r"\bparrot\b", case=False)
I got False. Why is that?
Upvotes: 8
Views: 13615
Reputation: 386
The documentation for str.contains() states:
Test if pattern or regex is contained within a string of a Series or Index.
The documentation for str.match() states:
Determine if each string matches a regular expression.
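The difference between these two methods is that str.contains() uses re.search, which looks for the pattern anywhere in the string, while str.match() uses re.match, which only tries to match at the beginning of the string.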
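To see the underlying behaviour without pandas, here is a minimal sketch using only the standard re module on the string from the question. The variable names (text, pattern, found_anywhere, found_at_start) are just illustrative.

```python
import re

text = "house and parrot"
# Same pattern as in the question; re.IGNORECASE mirrors case=False.
pattern = re.compile(r"\bparrot\b", re.IGNORECASE)

# search() scans the whole string for the first position that matches.
found_anywhere = pattern.search(text)

# match() only attempts a match starting at position 0.
found_at_start = pattern.match(text)

print(found_anywhere is not None)  # True: "parrot" occurs later in the string
print(found_at_start is not None)  # False: the string does not start with "parrot"
```

The pattern is identical in both calls; only the place where the regex engine is allowed to start matching differs.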
According to the documentation of re.match():
If zero or more characters at the beginning of string match the regular expression pattern, return a corresponding match object. Return None if the string does not match the pattern; note that this is different from a zero-length match.
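In your example, parrot does not occur at the beginning of the string, so re.match finds nothing there and str.match returns False. A pattern such as \bhouse\b, by contrast, does match at the beginning of the string, so str.match would return True for it. str.contains scans the whole string, which is why it finds parrot and returns True.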
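The same comparison on the Series from the question could look like the sketch below. It assumes pandas is installed; the result variable names are made up for illustration, and the last pattern (with a leading .*) is just one possible way to let str.match reach a word that is not at the start.

```python
import pandas as pd

s1 = pd.Series(["house and parrot"])

# contains: the pattern may occur anywhere in the string.
contains_parrot = s1.str.contains(r"\bparrot\b", case=False)

# match: the pattern must match starting at the first character.
match_parrot = s1.str.match(r"\bparrot\b", case=False)
match_house = s1.str.match(r"\bhouse\b", case=False)

# Allowing arbitrary leading characters lets match reach "parrot".
match_with_prefix = s1.str.match(r".*\bparrot\b", case=False)

print(contains_parrot.tolist())    # [True]
print(match_parrot.tolist())       # [False]
print(match_house.tolist())        # [True]
print(match_with_prefix.tolist())  # [True]
```

If the goal is simply "does this word appear somewhere in the string", str.contains is the more direct choice.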
Upvotes: 12