Python: counting occurrences of a set of strings in a file

Question

I have a big text file and am looking for the best way to implement the following:

Define a set of strings where each string looks like "x y", and where x and y are integers that can each take on a number of values.
Look through the file, and locate and count each instance of "x y". Return the result as a list that looks roughly like ("x y": count).

I'm a beginner in programming and Python, and the only thing I can think of is something like

f = open('file', 'r')
for x in xrange:
    for y in yrange:
        xystring = "%i %i" % (x, y)
        count = 0
        f.seek(0)  # rewind, otherwise the file is exhausted after the first pair
        for line in f:
            count += line.count(xystring)
        print xystring, count

Now my obvious problems are that this looks inelegant even to me, and that it will scale badly: I will ultimately need this method to count all instances of, say, 7^7 different strings. I will also need to scan multiple files while keeping track of the counts for each string. I am looking for the most efficient and Pythonic way of getting this done.

Thanks!

georg · Accepted Answer

Something like (untested):

import re
from collections import Counter

pairs = Counter()

with open(...) as fp:
    for line in fp:
        # tally every "integer, whitespace, integer" match on this line
        pairs.update(re.findall(r'\d+\s+\d+', line))
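
To cover the rest of the question (a fixed set of pairs, several files, one running tally), the same idea could be wrapped up roughly as below. This is only a sketch: the function name count_pairs, its parameters, and the compiled pattern PAIR_RE are made up for illustration and are not from the question. It reads each file once and keeps only the pairs that were asked for, so the cost no longer grows with the number of pairs.

```python
import re
from collections import Counter
from itertools import product

# Assumed pattern: two integers separated by whitespace, captured separately.
PAIR_RE = re.compile(r'(\d+)\s+(\d+)')

def count_pairs(paths, x_values, y_values):
    """Count each wanted "x y" pair across all files in paths (hypothetical helper)."""
    # Build the set of wanted strings once; set membership tests are cheap.
    wanted = {"%d %d" % (x, y) for x, y in product(x_values, y_values)}
    counts = Counter()
    for path in paths:
        with open(path) as fp:
            for line in fp:
                for a, b in PAIR_RE.findall(line):
                    # Normalise to a single space so "1   2" counts as "1 2".
                    key = "%d %d" % (int(a), int(b))
                    if key in wanted:
                        counts[key] += 1
    return counts
```

Something like count_pairs(['a.txt', 'b.txt'], range(1, 8), range(1, 8)) would then return one Counter for both files (the file names are placeholders). Note that re.findall returns non-overlapping matches, so in "1 2 3" only "1 2" is counted, not "2 3"; line.count would count both, and would also match inside longer numbers such as "11 23".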
