Pandas Col to dict with key, value pair where value is frequency of string occurrence

Question

My df['data'] column looks like this:

[[G, 37, T], [-, 447, A]]
[[G, 78, A], [-, 447, A], [A, 1023, -]]
[[-, 447, A], [C, 3049, T]]
[[-, 447, A]]

What I need is a dictionary like this:

{'-447A':4,'G37T':1,'G78A':1,'C3049T':1,'A1023-':1}

I tried df.to_list(), but that gives me a list of lists of lists. I then tried joining each item to get a list of strings and converting that to a dict, but all I got were these errors:

TypeError: 'float' object is not iterable
TypeError: sequence item 0: expected str instance, list found

So I think there must be a better/faster option than for loops.

akuiper · Accepted Answer

I don't think there's a more optimized way to do this in pandas. Just convert the column to a list, then transform and count the items using collections.Counter:

from collections import Counter
Counter((''.join(map(str, slst)) for lst in df.data.to_list() for slst in lst))
# Counter({'-447A': 4, 'G37T': 1, 'G78A': 1, 'A1023-': 1, 'C3049T': 1})
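
The "'float' object is not iterable" error in the question suggests that some rows may hold NaN instead of a list; that is a guess, since the question does not show such a row. Under that assumption, here is a minimal end-to-end sketch that skips non-list rows and returns a plain dict. The sample frame and the names `rows`, `counts`, and `result` are made up for illustration:

```python
from collections import Counter

import pandas as pd

# Hypothetical sample data mirroring the question, plus one missing row (NaN)
# to imitate the assumed cause of the "'float' object is not iterable" error.
rows = [
    [["G", 37, "T"], ["-", 447, "A"]],
    [["G", 78, "A"], ["-", 447, "A"], ["A", 1023, "-"]],
    [["-", 447, "A"], ["C", 3049, "T"]],
    [["-", 447, "A"]],
    float("nan"),
]
df = pd.DataFrame({"data": pd.Series(rows, dtype=object)})

counts = Counter(
    "".join(map(str, triple))      # [G, 37, T] -> "G37T"; str() handles the int
    for row in df["data"].tolist()
    if isinstance(row, list)       # skip NaN or any other non-list cell
    for triple in row
)

# Counter is a dict subclass; dict() is only needed if a plain dict is required.
result = dict(counts)
print(result)
```

The map(str, ...) call is what avoids the "expected str instance" error, because str.join only accepts strings and the middle element of each triple is an integer here.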
