Patrick Hingston
Patrick Hingston

Reputation: 292

Counting duplicate values in dataframe and consolidating them

I have a bill of material (BOM) that contains parts and some of their attributes

  REFDES  Tol    size value
0     R1   1%  CR0402   10K
1     R2   1%  CR0402    1K
2     C1  20%  CC0603  10uF
3     C2  20%  CC0603  10uF
4     C3  10%  CC0603  10uF
5     R3   1%  CR0402   10K

The BOM contains duplicate parts. I want to consolidate the duplicate parts, combine their REFDES, and keep track of quantity. The ideal output would look like this:

    QTY REFDES  Tol    size value
0    2  R1,R3   1%  CR0402   10K
1    1     R2   1%  CR0402    1K
2    2  C1,C2  20%  CC0603  10uF
3    1     C3  10%  CC0603  10uF

So far, I've been able identify duplicate parts and count their quantity, but I am not sure an elegant way to track the REFDES attribute

Here is what I've done so far

df = df.groupby(['size', 'value', 'Tol']).size().reset_index(name='QTY')

     size value  Tol  QTY
0  CC0603  10uF  10%    1
1  CC0603  10uF  20%    2
2  CR0402   10K   1%    2
3  CR0402    1K   1%    1

Upvotes: 0

Views: 442

Answers (1)

BENY
BENY

Reputation: 323306

Using agg + set

df.groupby(['size', 'value', 'Tol']).agg([lambda x : ','.join(x),'count']).reset_index()
Out[441]: 
     size value  Tol   REFDES      
                     <lambda> count
0  CC0603  10uF  10%       C3     1
1  CC0603  10uF  20%    C1,C2     2
2  CR0402   10K   1%    R1,R3     2
3  CR0402    1K   1%       R2     1

Upvotes: 1

Related Questions