How do I use groupby on the output of a mapper?

Question

This is a continuation of my previous question:

How to print only if a character is an alphabet?

I now have a mapper that is working perfectly, and it's giving me this output when I use a text file with the string `It's a beautiful life`.

Now I am trying to send this output into a script to get an output like this:

a [(1, 0, 0), (1, 1, 1)]
b [(1, 0, 0)]
e [(1, 0, 0), (1, 0, 1)]
f [(1, 0, 0), (1, 0, 0)]
i [(1, 0, 0), (1, 0, 0), (1, 1, 0)]
l [(1, 0, 0), (1, 0, 0)]
s [(1, 0, 0)]
t [(1, 0, 0), (1, 0, 0)]
u [(1, 0, 0), (1, 0, 0)]

so that a tuple is appended to a letter's list each time that letter appears in the mapper output.

I have some code from a different but similar problem that I am trying to adapt so it works with my mapper:

from itertools import groupby
from operator import itemgetter
import sys

def read_mapper_output(file):
    for line in file:
        yield line.strip().split(' ')

# Call the function to read the input, which is (word, 1)
data = read_mapper_output(sys.stdin)

# Each word becomes the key and is used to group the rest of the values.
# The first argument is the data to be grouped.
# The second argument is what it should be grouped by. In this case it is the word.
for key, keygroup in groupby(data, itemgetter(0)):
    values = ' '.join(sorted(v for k, v in keygroup))
    print("%s %s" % (key, values))

I am having trouble changing the last block of code to work with my mapper. I know that I will have to print out a list of tuples for every instance of a letter occurring in the mapper.

How do I use groupby on the output of a mapper?

Answers (1)
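
One possible approach, offered as a sketch and not a definitive solution: the question does not show the mapper's actual output, so the code below assumes each mapper line looks like `a 1 0 0` (a letter followed by three space-separated integers). The helper names `parse_mapper_line` and `group_by_letter` are made up for illustration. The key point is that `itertools.groupby` only merges adjacent items with equal keys, so the records are sorted by letter before grouping.

```python
from itertools import groupby
from operator import itemgetter


def parse_mapper_line(line):
    """Split one mapper line into (letter, (int, int, int)).

    Assumed (hypothetical) line format: 'a 1 0 0'.
    """
    letter, *counts = line.split()
    return letter, tuple(int(c) for c in counts)


def group_by_letter(lines):
    """Return a list of (letter, [tuple, ...]) pairs, sorted by letter."""
    # Skip blank lines and parse the rest into (letter, tuple) records.
    records = [parse_mapper_line(line) for line in lines if line.strip()]
    # groupby only groups consecutive equal keys, so sort by letter first.
    # list.sort is stable, so tuples keep their original order per letter.
    records.sort(key=itemgetter(0))
    return [
        (letter, [counts for _, counts in group])
        for letter, group in groupby(records, key=itemgetter(0))
    ]


# Illustrative sample input only; not the real mapper output.
sample = ["a 1 0 0", "b 1 0 0", "a 1 1 1"]
for letter, tuples in group_by_letter(sample):
    print(letter, tuples)
```

To run this on real mapper output piped in from another script, `sys.stdin` can be passed to `group_by_letter` in place of `sample`, since it only needs an iterable of lines.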
