Reputation: 105
I am trying to find a specific word in a text file and count how many times it occurs. This is the code I used:
import re

pattern = re.compile(r"\bshall\b")
pattern1 = re.compile(r"\bmay not\b")
pattern2 = re.compile(r"\bmust\b")

# Raw string so the backslashes in the Windows path are taken literally
with open(r'C:\Python27\projects\Alabama\New folder\4.txt', 'r') as myfile:
    for line in myfile:
        m = re.findall(pattern, line)
        #m1 = re.findall(pattern1, line)
        #m2 = re.findall(pattern2,line)
        k = len(m)  # number of matches on this line only
        #k1 = len(m1)
        #k2 = len(m2)
        #sumk = sum(len(k) for k in myfile)
        print k
When I print k, I get a vertical list of numbers: [0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 3, 0, 2........]. I can tell that these are the number of occurrences of the string "shall" in each line of the text. My question is: how do I add up these numbers to get the total number of occurrences of "shall" in the whole text file?
Upvotes: 0
Views: 99
Reputation: 30813
If you intend to sum a list, you could use sum, but you need to define k outside the loop so that it isn't replaced with a new value on every iteration:
k = []  # define k as an empty list here, before the loop
for line in myfile:
    m = re.findall(pattern, line)
    k.append(len(m))  # append this line's count to the list
val = sum(k)  # get the sum here, after the loop
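
If the per-line counts are not needed afterwards, the intermediate list can probably be skipped by passing a generator expression to sum. The sketch below is only an illustration: `count_word` is a hypothetical helper name, and `lines` can be any iterable of strings (an open file object should work the same way as the sample list used here).

```python
import re

def count_word(lines, word):
    """Return the total number of whole-word matches of `word` across all lines."""
    # re.escape guards against regex metacharacters in the search word
    pattern = re.compile(r"\b" + re.escape(word) + r"\b")
    # One count per line, summed without building a list first
    return sum(len(pattern.findall(line)) for line in lines)

# Hypothetical sample data standing in for the lines of the text file
sample = ["You shall not pass.", "Nothing here.", "They shall, and we shall."]
print(count_word(sample, "shall"))  # prints 3
```

With a real file, the call would look like `count_word(myfile, "shall")` inside the `with open(...)` block.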
Upvotes: 1
Reputation: 4112
One way is to use a running total:
total = 0
for line in myfile:
    m = re.findall(pattern, line)
    total += len(m)
print total
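
The same running-total idea could be extended to the other two patterns from the question ("may not" and "must") by keeping one total per pattern. This is a rough sketch, not a definitive solution: the `PATTERNS` dictionary and the `count_all` function are names made up for this example, and the sample lines stand in for the real file.

```python
import re

# Hypothetical lookup table: label -> compiled pattern (patterns taken from the question)
PATTERNS = {
    "shall": re.compile(r"\bshall\b"),
    "may not": re.compile(r"\bmay not\b"),
    "must": re.compile(r"\bmust\b"),
}

def count_all(lines):
    """Return a dict mapping each label to its total match count over all lines."""
    totals = dict.fromkeys(PATTERNS, 0)  # start every running total at zero
    for line in lines:
        for name, pattern in PATTERNS.items():
            totals[name] += len(pattern.findall(line))
    return totals

# Made-up sample lines; an open file object could be passed instead
sample_lines = ["You shall not; you must.", "You may not, and must not."]
totals = count_all(sample_lines)
print(totals["shall"], totals["may not"], totals["must"])
```

Because the file is processed line by line, a phrase such as "may not" that is split across two lines would not be counted by this approach.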
Upvotes: 1