aggregation of data combining dict of list

Question

I have a file with the following contents (a user ID, a colon, then tab-separated search terms):

1234:yahoo google microsoft apple yahoo

2345:apple google google

4567:yahoo apple apple

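I would like to get output in the following form, where UserCnt is the number of users who searched for a term and searchCnt is the total number of times it was searched:

searchTerm: UserCnt, searchCnt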

yahoo: 2, 3

apple: 3, 4

and so on...

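from collections import defaultdict

fname = "/tmp/sample.txt"
with open(fname) as f:
    content = f.readlines()

# Split each "user_id:terms" line into [user_id, terms]
value = [i.strip().split(':') for i in content]
# Map each user ID to its list of search terms (renamed to avoid shadowing the built-in dict)
searches = {k: v.split('\t') for k, v in value}

# Running total of searches per term
d = defaultdict(int)
for k, v in searches.items():
    for name in v:
        d[name] += 1
    print(k, d)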

But how do I get the user count and the search count for each search term?

Srini · Accepted Answer

Yes, you can do this with a defaultdict (a regular dict would work too, but I find a defaultdict more flexible).

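In [35]: from collections import defaultdict

In [36]: a = defaultdict(defaultdict)  # inner defaultdict has no default_factory, so it acts like a plain dict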

In [40]: l = ["1234:yahoo\tgoogle\tmicrosoft\tapple\tyahoo", "2345:apple\tgoogle\tgoogle", "4567:yahoo\tapple\tapple"]

In [48]: for li in l:
    ...:     # Each entry looks like "user_id:term<TAB>term<TAB>..."
    ...:     search_id, terms = li.split(":", 1)
    ...:     terms = terms.split("\t")
    ...:     # Every occurrence counts toward the term's search count
    ...:     for term in terms:
    ...:         if "search_cnt" in a[term]:
    ...:             a[term]["search_cnt"] += 1
    ...:         else:
    ...:             a[term]["search_cnt"] = 1
    ...:     # Each distinct term counts once per user toward the user count
    ...:     for term in set(terms):
    ...:         if "user_cnt" in a[term]:
    ...:             a[term]["user_cnt"] += 1
    ...:         else:
    ...:             a[term]["user_cnt"] = 1

In [49]: a
Out[49]:
defaultdict(collections.defaultdict,
            {'apple': defaultdict(None, {'search_cnt': 4, 'user_cnt': 3}),
             'google': defaultdict(None, {'search_cnt': 3, 'user_cnt': 2}),
             'microsoft': defaultdict(None, {'search_cnt': 1, 'user_cnt': 1}),
             'yahoo': defaultdict(None, {'search_cnt': 3, 'user_cnt': 2})})

The defaultdict above contains the counts you need.
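To print those counts in the "searchTerm: UserCnt, searchCnt" layout from the question, something like the following might do. It is only a sketch: the plain dict literal stands in for `a` and is copied from the output above, and the helper name `format_counts` is made up for illustration.

```python
# Counts copied from the output above; a plain dict standing in for `a`.
counts = {
    "apple": {"search_cnt": 4, "user_cnt": 3},
    "google": {"search_cnt": 3, "user_cnt": 2},
    "microsoft": {"search_cnt": 1, "user_cnt": 1},
    "yahoo": {"search_cnt": 3, "user_cnt": 2},
}


def format_counts(counts):
    """Return one 'term: user_cnt, search_cnt' string per term, sorted by term."""
    return [
        "{}: {}, {}".format(term, c["user_cnt"], c["search_cnt"])
        for term, c in sorted(counts.items())
    ]


for line in format_counts(counts):
    print(line)
```

Sorting by term is just a choice to make the output stable; the question does not ask for any particular order.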

The second loop iterates over set(terms) rather than terms so that a user who searched for the same term several times adds only one to that term's user count :)
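The same set-based idea can also be written with two `collections.Counter` objects instead of a nested defaultdict. This is an alternative sketch, not the approach used in the session above; the function name `count_terms` is hypothetical, and splitting on any whitespace (rather than tabs only) is an assumption made so the sample works whether the terms are separated by tabs or spaces.

```python
from collections import Counter


def count_terms(lines):
    """Return (user_cnt, search_cnt) Counters for lines shaped like 'user_id:terms'."""
    search_cnt = Counter()
    user_cnt = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        _user_id, _, terms = line.partition(":")
        terms = terms.split()  # assumption: terms are separated by tabs or spaces
        search_cnt.update(terms)     # every occurrence counts as a search
        user_cnt.update(set(terms))  # each distinct term counts once per user
    return user_cnt, search_cnt


sample = [
    "1234:yahoo\tgoogle\tmicrosoft\tapple\tyahoo",
    "2345:apple\tgoogle\tgoogle",
    "4567:yahoo\tapple\tapple",
]
user_cnt, search_cnt = count_terms(sample)
print(user_cnt["yahoo"], search_cnt["yahoo"])
```

To run it against the file from the question, `count_terms(open("/tmp/sample.txt"))` should work, since iterating over a file object yields its lines.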
