Frank

Reputation: 91

Aggregate sets in pandas

I have a table made like this:

col1    col2
a       {...}
a       {...}
b       {...}
c       {...}
c       {...}
c       {...}

Each value in col2 is a set. I need to group by col1 so that col2 becomes the union of the sets in each group.

My best attempt so far was this:

def set_union(*sets):
    return reduce(lambda a, b: a.union(b), sets)

mytable.groupby('col1', as_index=False)['col2'].agg(set_union)

Which yields:

ValueError: Must produce aggregated value

Does anyone have any solution?

Upvotes: 0

Views: 184

Answers (1)

piRSquared

Reputation: 294278

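Remove the splat (*) from your function signature. agg passes each group's values to the function as a single argument, so with *sets the function receives a one-element tuple holding the whole group, and reduce hands that group back unchanged instead of producing one set.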

from functools import reduce

def set_union(sets):
    return reduce(lambda a, b: a.union(b), sets)

mytable.groupby('col1', as_index=False).agg(set_union)

  col1       col2
0    a     {1, 2}
1    b        {3}
2    c  {4, 5, 6}
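
To see why the splat matters, here is a small sketch that needs no pandas. The list named group is a hypothetical stand-in for the col2 values of one group, and set_union_splat / set_union_plain are illustrative names for the two signatures, not anything from the original post.

```python
from functools import reduce

def set_union_splat(*sets):
    # With the splat, a single argument arrives wrapped in a tuple: sets == (group,)
    return reduce(lambda a, b: a.union(b), sets)

def set_union_plain(sets):
    # Without the splat, reduce iterates over the sets themselves
    return reduce(lambda a, b: a.union(b), sets)

# Hypothetical stand-in for the col2 values of one group
group = [{1}, {2}]

splat_result = set_union_splat(group)   # reduce over a 1-tuple returns its only element
plain_result = set_union_plain(group)   # reduce actually unions the sets

print(splat_result is group)
print(plain_result)
```

In the splat version the function gives back the whole group rather than a single value, which is what pandas rejects as a non-aggregated result.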

I like this version better. It skips reduce and lets set.union take all the sets at once:

def set_union(sets):
    return set().union(*sets)

mytable.groupby('col1', as_index=False).agg(set_union)
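
For anyone who wants to try this end to end, here is a minimal sketch. The sample data is made up to match the shape of the table in the question, and it selects col2 explicitly before aggregating, which is an assumption about how you would want to call it rather than the exact call above.

```python
import pandas as pd

# Hypothetical sample data shaped like the table in the question
mytable = pd.DataFrame({
    "col1": ["a", "a", "b", "c", "c", "c"],
    "col2": [{1}, {2}, {3}, {4}, {5}, {6}],
})

def set_union(sets):
    # sets is the Series of sets for one group; unpack it into set.union
    return set().union(*sets)

# Group by col1, union the sets in col2, then turn col1 back into a column
result = mytable.groupby("col1")["col2"].agg(set_union).reset_index()
print(result)
```

Starting from an empty set() also means a group never mutates or returns one of the original sets by reference.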

Upvotes: 3
