Dnaiel

Reputation: 7832

Function to compute averages and retrieve information over defined indexes in a tuple of strings

Let us say I have a tuple of strings as follows:

tos = ('12|edr4r\tedward\t21\n',
       '1|edr4r\tedward\t21\n',
       '3|edr4r\tedward\t21\n',
       '8|edr4r\tedward\t21\n',
       '10|edr4r\tedward\t21\n',
       '2|edr4r\tedward\t21\n')

Where the format for each element in the tuple is:

'integer_number|id\tname\tage\n'

and each element in the tuple contains the same information, in this case,

'edr4r\tedward\t21\n'

I also have a map list that says which elements of tos should be grouped together when averaging their integer_numbers:

map_lst = [0,0,1,2,1,0]

i.e., one average will be over tos[0], tos[1] and tos[5] (since 0 appears in positions 0, 1 and 5 of map_lst), the other average will be over tos[2] and tos[4], and finally one over tos[3].
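
To make the grouping concrete, here is a minimal sketch (the names `groups`, `idx` and `label` are only for illustration) that collects the positions of tos belonging to each value in map_lst:

```python
from collections import defaultdict

map_lst = [0, 0, 1, 2, 1, 0]

# Collect, for each group label, the positions in tos that belong to it.
groups = defaultdict(list)
for idx, label in enumerate(map_lst):
    groups[label].append(idx)

print(dict(groups))  # {0: [0, 1, 5], 1: [2, 4], 2: [3]}
```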

I'd like to compute the averages of the numbers before '|' and store them in an avgs_list whose elements contain each average along with (only) some of the information from the elements of tos:

avgs_list = ['edr4r\tedward\t(12+1+2)/3\n', 
             'edr4r\tedward\t(3+10)/2\n', 
             'edr4r\tedward\t8\n']

Is there any Pythonic way to do this? I am looking for a solution that is as generic as possible, without hardcoding the number of indexes, etc.

I could loop over the list, store the values, and then compute the averages, but I thought there might be a more Pythonic way to do it, using the map function or something else...

Upvotes: 0

Views: 101

Answers (3)

Trevor Merrifield

Reputation: 4701

How is this?

def average(tos, map_lst):
    """
    given
        tos: a sequence of N|user\tname\tAGE\n
        map_lst: a list with positions corresponding to those in tos, and values
                 indicating which group each tos element will be averaged with.
    return the groups of averages as a list of user\tname\tAVG\n
    """

    # get the leading nums
    nums = [s.partition('|')[0] for s in tos]

    # group them into lists that will be averaged together (based on the map)
    avg_groups = [[] for i in set(map_lst)]
    for i,n in zip(map_lst, nums):
        avg_groups[i].append(n)

    # generate the averages
    def fmt(tup):
        mid = tos[0].partition('|')[2].rpartition('\t')[0] # user\tname
        if len(tup) > 1:
            avg = '({0})/{1}'.format('+'.join(tup), len(tup))
        else:
            avg = str(tup[0])
        return "{0}\t{1}\n".format(mid, avg)

    return [fmt(l) for l in avg_groups]

Test:

tos = ('12|edr4r\tedward\t21\n','1|edr4r\tedward\t21\n','3|edr4r\tedward\t21\n','8|edr4r\tedward\t21\n','10|edr4r\tedward\t21\n','2|edr4r\tedward\t21\n')
map_lst = [0,0,1,2,1,0]
print(average(tos,map_lst))
>> ['edr4r\tedward\t(12+1+2)/3\n', 'edr4r\tedward\t(3+10)/2\n', 'edr4r\tedward\t8\n']

Upvotes: 1

dmvianna

Reputation: 15730

You could use pandas:

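import re

import pandas as pd

# split each string on '|' and tabs: [number, id, name, age]
data = [re.split(r'\t|\|', x) for x in tos]
data = pd.DataFrame(data)
data[3] = data[3].str.rstrip('\n')  # drop the trailing newline from age
data[0] = data[0].astype(int)       # leading numbers as integers
data[4] = map_lst                   # group label for each row

data.groupby([1,2,3,4])[0].mean()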



Out[1]:

1      2       3   4
edr4r  edward  21  0    5.0
                   1    6.5
                   2    8.0
Name: 0, dtype: float64

Upvotes: 1

jonrsharpe

Reputation: 122126

To actually calculate the averages of the leading integer, you could use something like:

averages = []
for n in range(max(map_lst) + 1): # however many averages needed
    averages.append(sum(int(v.split("|")[0]) # get int from v
                        for i, v in enumerate(tos) # index and value
                        if map_lst[i] == n) # whether to use this v
                    / float(map_lst.count(n))) # divide by number of ints in group n

For your data, this gives

averages == [5.0, 6.5, 8.0]

I am a little confused by your output format, which seems to include the calculation to carry out but not the answer. I think you should focus less on using strings in your code; parse them at the start, create them at the end, but use other data structures in-between.
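
As a rough sketch of that parse / compute / format approach: the function name `grouped_averages` is made up for illustration, and taking the id and name from the first element is an assumption based on your statement that every element carries the same information. It also emits the computed average rather than the unevaluated expression shown in your example output.

```python
from collections import defaultdict


def grouped_averages(tos, map_lst):
    """Return one 'id<TAB>name<TAB>average' line per group label in map_lst."""
    # Parse once: the leading integers, and the shared 'id<TAB>name' prefix.
    numbers = [int(s.split("|", 1)[0]) for s in tos]
    # Assumption: every element carries the same id and name, so use the first.
    prefix = tos[0].split("|", 1)[1].rsplit("\t", 1)[0]

    # Compute: collect the numbers that belong to each group label.
    groups = defaultdict(list)
    for label, number in zip(map_lst, numbers):
        groups[label].append(number)

    # Format only at the end, in order of group label.
    return ["{0}\t{1}\n".format(prefix, sum(values) / float(len(values)))
            for _, values in sorted(groups.items())]


tos = ('12|edr4r\tedward\t21\n',
       '1|edr4r\tedward\t21\n',
       '3|edr4r\tedward\t21\n',
       '8|edr4r\tedward\t21\n',
       '10|edr4r\tedward\t21\n',
       '2|edr4r\tedward\t21\n')
map_lst = [0, 0, 1, 2, 1, 0]

print(grouped_averages(tos, map_lst))
```

Unlike the range(max(map_lst) + 1) loop above, this does not require the labels in map_lst to be consecutive integers starting at 0; any sortable labels should work.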

Upvotes: 1
