Reputation: 403
Sorry the title is hard to understand; I wasn't sure how to phrase this one. Say I have a Series that looks like this:
s = pd.Series(index=['a', 'b', 'c'], data=[['x', 'y', 'z'], ['y', 'z'], ['x', 'z']])
I would want something like this:
{'x':['a','c'], 'y':['a','b'], 'z':['a','b','c']}
That is, for each element that appears in the lists, I want to see which index labels have that element in their list. Any ideas how I could do this as efficiently as possible? Thanks!
Upvotes: 3
Views: 652
Reputation: 153570
Another solution, using defaultdict for speed:
from collections import defaultdict
d = defaultdict(list)
q = s.explode()
for k, v in q.items():
    d[v].append(k)
dict(d)
Output:
{'x': ['a', 'c'], 'y': ['a', 'b'], 'z': ['a', 'b', 'c']}
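A variation on the same idea, offered as a sketch rather than as part of the answer above: the inner lists can be walked directly with a nested loop, skipping explode() altogether. The function name invert_series_of_lists is made up for illustration. One difference worth knowing: explode() turns an empty list into a NaN row, which the loop above would record under a NaN key, whereas the nested loop simply skips empty lists.

```python
from collections import defaultdict

import pandas as pd


def invert_series_of_lists(s):
    """Map each list element to the index labels whose list contains it.

    Hypothetical helper name; not from the original answer.
    """
    inverted = defaultdict(list)
    # Iterate over (index label, list) pairs directly; no explode() needed.
    for label, values in s.items():
        for value in values:
            inverted[value].append(label)
    return dict(inverted)


s = pd.Series(index=['a', 'b', 'c'], data=[['x', 'y', 'z'], ['y', 'z'], ['x', 'z']])
result = invert_series_of_lists(s)
print(result)
```

Whether this is faster than the explode() version on large data is not something I have measured, so treat it as an alternative to benchmark, not a guaranteed improvement.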
Timings:
%timeit s.explode().reset_index().groupby(0)['index'].agg(list).to_dict()
3.94 ms ± 119 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%%timeit on the defaultdict(list) method
300 µs ± 33.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
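To reproduce a comparison like this outside IPython, something along these lines should work with the standard timeit module. The function names with_groupby and with_defaultdict are made up here, and I am assuming the explode() call is counted inside the defaultdict timing; the original measurement may have been set up differently. Absolute numbers will vary by machine and pandas version.

```python
import timeit
from collections import defaultdict

import pandas as pd

s = pd.Series(index=['a', 'b', 'c'], data=[['x', 'y', 'z'], ['y', 'z'], ['x', 'z']])


def with_groupby(s):
    # The explode + groupby chain from the other answer.
    return s.explode().reset_index().groupby(0)['index'].agg(list).to_dict()


def with_defaultdict(s):
    # The defaultdict loop; explode() is included in the timed work (assumption).
    d = defaultdict(list)
    for k, v in s.explode().items():
        d[v].append(k)
    return dict(d)


runs = 1000
for func in (with_groupby, with_defaultdict):
    seconds = timeit.timeit(lambda: func(s), number=runs)
    print(f"{func.__name__}: {seconds / runs * 1e6:.1f} µs per call")
```

With only three rows the result mostly reflects fixed overhead, so it is worth rerunning on a Series closer to the real data size before drawing conclusions.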
Upvotes: 1
Reputation: 7222
Here's a second solution as well:
x = s.explode()
pd.DataFrame({'X':x.index, 'Y':x.values}).groupby('Y')['X'].apply(list).to_dict()
# {'x': ['a', 'c'], 'y': ['a', 'b'], 'z': ['a', 'b', 'c']}
Upvotes: 0
Reputation: 323396
Let us use explode:
s.explode().reset_index().groupby(0)['index'].agg(list).to_dict()
{'x': ['a', 'c'], 'y': ['a', 'b'], 'z': ['a', 'b', 'c']}
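For anyone puzzled by the 0 and 'index' in that chain, here is the same one-liner split into steps, as a sketch using the Series from the question. The intermediate variable names are mine. The column names come from reset_index() on an unnamed Series, which puts the old index in a column called 'index' and the values in a column labelled 0.

```python
import pandas as pd

s = pd.Series(index=['a', 'b', 'c'], data=[['x', 'y', 'z'], ['y', 'z'], ['x', 'z']])

# One row per list element; the index label repeats for each element.
exploded = s.explode()

# The unnamed Series becomes a two-column frame: 'index' (labels) and 0 (elements).
frame = exploded.reset_index()

# Group by element and collect the labels whose list contained it.
grouped = frame.groupby(0)['index'].agg(list)

result = grouped.to_dict()
print(result)
```

If the Series has a name, the value column takes that name instead of 0, so the groupby key would need to change accordingly.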
Upvotes: 4