Reputation: 35646
Let's say I have a pandas DataFrame like the following.
In [31]: frame = pd.DataFrame({'a' : ['A/B/C/D', 'A/B/C', 'A/E','D/E/F']})
In [32]: frame
Out[32]:
a
0 A/B/C/D
1 A/B/C
2 A/E
3 D/E/F
And I have a list of strings like the following.
In [33]: mylist =['A/B/C/D', 'A/B/C', 'A/B']
Here, two of the three patterns in mylist are present in my DataFrame, so I need output saying 2/3*100 = 67%.
In [34]: pattern = '|'.join(mylist)
In [35]: frame.a.str.contains(pattern).count()
This is not working. How can I get my expected output?
Upvotes: 0
Views: 59
Reputation: 21873
You can do it this way:
In [1]: len(frame[frame.a.isin(mylist)])/float(len(mylist)) * 100
Out[1]: 66.66666666666666
Or with your method:
In [2]: pattern = '|'.join(mylist)
In [3]: count = frame.a.str.contains(pattern).sum() # sum() adds up the True values; count() counts every non-null row
In [4]: count/float(len(mylist))*100
Out[4]: 66.66666666666666
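Note that both approaches above count matching rows of the DataFrame, which happens to equal the number of matched patterns for this data. If what you want is the share of entries in mylist that are found, a minimal sketch could look like the following. The variable names (exact_hits, substring_hits, and so on) are only illustrative, and whether "found" should mean an exact match or a substring match is an assumption you would need to settle for your own data.

```python
import pandas as pd

frame = pd.DataFrame({'a': ['A/B/C/D', 'A/B/C', 'A/E', 'D/E/F']})
mylist = ['A/B/C/D', 'A/B/C', 'A/B']

# Exact matches: which entries of mylist appear as a full value in column 'a'?
exact_hits = pd.Series(mylist).isin(frame['a'])
exact_pct = exact_hits.mean() * 100  # mean of booleans = fraction that are True

# Substring matches: which entries of mylist occur inside at least one value?
# regex=False treats each pattern as a literal string, not a regular expression.
substring_hits = [frame['a'].str.contains(p, regex=False).any() for p in mylist]
substring_pct = sum(substring_hits) / len(mylist) * 100

print(round(exact_pct, 2), substring_pct)
```

With exact matching, 'A/B' is not found, so the result is about 67%. With substring matching, 'A/B' is found inside 'A/B/C/D', so all three patterns count as present.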
Upvotes: 1