Sana
Sana

Reputation: 563

Removing duplicates from Excel rows by adding values of some columns

I have an Excel file containing 500 rows containing product details and their quantities in different programs. All these 500 rows are duplicated products with different quantities. I would like to remove duplicates and add up quantities so instead of 5 rows having productID D1 I want to have one row only (quantities of 4 other rows is added to the remaining row) [I am looking to create bottom table from top one]

enter image description here I found similar problem on stackoverflow and people suggested uploading the data in the database and have sum(quantity1), sum(quantity2),...but I have 150 columns so I would not be able to write a query for that. (Removing duplicate rows by adding column value)

I am thinking of writing a python script but I am not sure how I can handle the duplicates.

Many Thanks.

Upvotes: 0

Views: 1680

Answers (4)

Copy first 3 columns to another sheet or somewhere else. Use "Remove Duplicates". In order to find sum of quantites use a formula similar to this:

=SUMIFS($D$2:$D$500, $A$2:$A$500, $A2, $B$2:$B$500, $B2, $C$2:$C$500, $C2)

Then drag the formula to find other quantities.

Upvotes: 3

keineahnung2345
keineahnung2345

Reputation: 2701

import pandas as pd
import numpy as np

df = pd.DataFrame({
        "ProductsID": ["A1", "A1", "B1", "B1", "B1"], 
        "Category": ["MM", "MM", "NN", "NN", "NN"],
        "Price": [50, 50, 42, 42, 42],
        "Quantity1": [1, 8, 5, np.nan, 4],
        "Quantity2": [6, 3, np.nan, 2, 3]})

grouped = df.groupby(by=["ProductsID", "Category"])
agged = grouped.agg({"Price": "max", 
                "Quantity1": "sum", 
                "Quantity2":"sum"})

result = agged.reset_index()

The result:

  ProductsID Category  Quantity1  Quantity2  Price
0         A1       MM        9.0        9.0     50
1         B1       NN        9.0        5.0     42

Upvotes: 0

Harassed Dad
Harassed Dad

Reputation: 4704

I can't see your image but assuming it's fairly standard layout: If you can sort the data by productid you can use the built in functionality of Excel for this. Use Data, Subtotal - specify that on each change in ProductID you wish to Sum quantity. Then collapse the resultant table so only the totals are shown. Use Home, Find and Select,Goto Special, visible cells only. Then copy and past to a second sheet You now have all the products totals.

Upvotes: 0

Related Questions