Reputation: 87
I have a data frame named df and I want to count the occurrences of '|' in name1 and of '/' in name2.
id name1 name2
1 a|b a/b
2 a|b|c a/b/c
3 a a
4 a|b|c|d a/b/c/d
This is the code
[In] 1: import pandas as pd
data = {'id' : pd.Series([1, 2, 3, 4]),
'name1': pd.Series(['a|b', 'a|b|c', 'a', 'a|b|c|d']),
'name2': pd.Series(['a/b', 'a/b/c', 'a', 'a/b/c/d'])}
df = pd.DataFrame(data)
[In] 2: df['name1'].str.count('|')
[Out] 2: 4
6
2
8
[In] 3: df['name2'].str.count('/')
[Out] 3: 1
2
0
3
The problem is that [In] 3 gives the correct output, but [In] 2 gives an incorrect one.
Note: I want to count '|' on its own because the original data contains only '|', not '/'.
Upvotes: 2
Views: 60
Reputation: 863541
The problem is that '|' is a regex special character (alternation), so it needs to be escaped with a backslash:
a = df['name1'].str.count(r'\|')  # raw string, so '\|' reaches the regex engine unchanged
print (a)
0 1
1 2
2 0
3 3
Name: name1, dtype: int64
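To see where the 4, 6, 2, 8 in the question come from, here is a small sketch using only the standard re module. It assumes str.count behaves like counting re.findall matches per string. An unescaped '|' is an alternation between two empty patterns, so it matches the empty string at every position, which gives len(s) + 1.

```python
import re

samples = ['a|b', 'a|b|c', 'a', 'a|b|c|d']

# Unescaped '|' means "empty pattern OR empty pattern": it matches the
# empty string at every position, so the count is len(s) + 1.
unescaped = [len(re.findall('|', s)) for s in samples]

# Escaped '\|' matches only the literal pipe character.
escaped = [len(re.findall(r'\|', s)) for s in samples]

print(unescaped)  # [4, 6, 2, 8]
print(escaped)    # [1, 2, 0, 3]
```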
The documentation for Series.str.count confirms that the pattern is treated as a regex:
Count occurrences of pattern in each string of the Series/Index.
This function is used to count the number of times a particular regex pattern is repeated in each of the string elements of the Series.
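Because the pattern is always a regex, two other ways to count a literal character are re.escape and a character class. This is a sketch that rebuilds only the name1 column from the question, not the only way to do it:

```python
import re
import pandas as pd

df = pd.DataFrame({'name1': ['a|b', 'a|b|c', 'a', 'a|b|c|d']})

# re.escape builds the escaped pattern for you; handy when the
# separator comes from a variable rather than a literal.
by_escape = df['name1'].str.count(re.escape('|'))

# Inside a character class, '|' has no special meaning.
by_class = df['name1'].str.count('[|]')

print(by_escape.tolist())  # [1, 2, 0, 3]
print(by_class.tolist())   # [1, 2, 0, 3]
```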
Upvotes: 1