Reputation: 217
I have a big text file and am looking for the best way to implement the following:
Define a set of strings of the form "x y", where x and y are integers that can each take on a number of values.
Scan the file, locating and counting each instance of "x y". Return the result as a list that looks roughly like ("x y": count).
I'm a beginner in programming and Python, and the only thing I can think of is something like
    f = open('file', 'r')
    for x in xrange:
        for y in yrange:
            xystring = "%i %i" % (x, y)
            count = 0
            f.seek(0)  # rewind: the file iterator is exhausted after one full pass
            for line in f:
                count += line.count(xystring)
            print("%s %d" % (xystring, count))
Now my obvious problems are that this looks inelegant even to me, and that it will scale badly: it rereads the whole file once per string, and I will ultimately need this method to count all instances of, say, 7^7 different strings. I will also need to scan multiple files while keeping track of the counts for each string. I am looking for the most efficient and Pythonic way of getting this done.
Thanks!
Upvotes: 2
Views: 3208
Reputation: 215039
Something like (untested):
    import re
    from collections import Counter

    pairs = Counter()
    with open(...) as fp:
        for line in fp:
            # tally every "digits, whitespace, digits" match found on this line
            pairs.update(re.findall(r'\d+\s+\d+', line))
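Two caveats with the snippet above: re.findall returns non-overlapping matches, so a line like "1 2 3" yields "1 2" but not "2 3", and it counts every integer pair, not only the ones in your set. Below is a rough sketch of one way to handle both and to accumulate counts across several files. The names count_pairs, count_pairs_in_files and allowed are made up for illustration, and it assumes that x and y are non-negative whole numbers separated by exactly one space and that a pair never spans a line break.

```python
import re
from collections import Counter

# (?<!\d) stops a match from starting in the middle of a number.
# The second number sits in a lookahead, so it is not consumed and
# can serve as the first number of the next pair: "1 2 3" -> "1 2", "2 3".
PAIR_RE = re.compile(r'(?<!\d)(\d+) (?=(\d+))')


def count_pairs(lines, allowed=None):
    """Count "x y" pairs in an iterable of lines.

    If `allowed` is given (a set of "x y" strings), only those pairs
    are counted; otherwise every integer pair is counted.
    """
    counts = Counter()
    for line in lines:
        for x, y in PAIR_RE.findall(line):
            key = "%s %s" % (x, y)
            if allowed is None or key in allowed:
                counts[key] += 1
    return counts


def count_pairs_in_files(paths, allowed=None):
    """Read each file once and add its counts to a running total."""
    total = Counter()
    for path in paths:
        with open(path) as fp:
            total.update(count_pairs(fp, allowed))
    return total
```

To restrict the count to your own ranges, you could build the set once, e.g. allowed = {"%d %d" % (x, y) for x in range(1, 8) for y in range(1, 8)}, and pass it in. Each file is then read only once no matter how many strings are in the set, and total.most_common() gives the ("x y", count) list sorted by count.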
Upvotes: 3