Reputation: 88
I want to update specific columns of one .csv file from the columns of another .csv file. However, when I run the script below, after lots of trials, the output .csv either loses its old fields or ends up with the old fields duplicated. I also don't want fields that were already added from the input file to be added again when the script runs a second time.
import csv

# fieldnames is the list of column headers in 'Shipping CSV.csv' (defined earlier)
with open('Shop Export File.csv', 'r', encoding='utf-8-sig') as shop_file:
    with open('Shipping CSV.csv', 'a', encoding='utf-8-sig', newline='') as shipping_file:
        shop_csv = csv.DictReader(shop_file)
        shipping_csv = csv.DictWriter(shipping_file, fieldnames=fieldnames)
        for r in shop_csv:
            data_input = {
                'Deliver To Name': r['Delivery Customer'],
                'Deliver To Business Name': r['Delivery Company Name'],
                'Deliver To State': r['Delivery State'].strip('AU-'),
                'Deliver To Suburb': r['Delivery City'],
                'Deliver To Postcode': r['Delivery Zip Code'].strip('"'),
                'Deliver To Phone Number': r['Shipping Phone'].strip('"')
            }
            shipping_csv.writerow(data_input)
The 'Deliver To ...' names are the fieldnames in the output .csv, and the names on the right are the fieldnames from the input .csv.
Put another way: I only want specific columns of the output csv file completely overwritten by the corresponding columns of the input csv file, with no change to the other columns.
Upvotes: 0
Views: 449
Reputation: 88
I was able to find the solution by renaming the column headers from file1 to the same names as the headers of file2, so the rows could be appended using the csv.DictReader and csv.DictWriter methods.
Finally, I was able to prevent duplication of pre-existing column data by checking whether the name in each row is already in the list of existing names; if it is, the whole row is skipped and the script moves on to the next name and its row.
import csv

try:
    # Collect the names already present in the output file so re-running the
    # script does not create duplicate rows.
    with open('Shipping CSV.csv', 'r', encoding='utf-8-sig') as shipping_read:
        shipping_old_data = csv.reader(shipping_read)
        fieldnames = next(shipping_old_data)  # header row of the output file
        shipping_names = []
        for row in shipping_old_data:
            shipping_names.append(row[12])  # column holding 'Deliver To Name'

    with open('Shop Export File.csv', 'r', encoding='utf-8-sig') as shop_file:
        with open('Shipping CSV.csv', 'a', encoding='utf-8-sig', newline='') as shipping_file:
            shop_rows_as_dicts = csv.DictReader(shop_file)
            rows = []
            for row in shop_rows_as_dicts:
                phone_number = row['Shipping Phone'].strip('"')
                zip_code = row['Delivery Zip Code'].strip('"')
                rows.append({
                    'Deliver To Name': row['Delivery Customer'],
                    'Deliver To Business Name': row['Delivery Company Name'],
                    'Deliver To State': row['Delivery State'].strip('AU-'),
                    'Deliver To Suburb': row['Delivery City'],
                    'Deliver To Postcode': f"'{zip_code}",
                    'Deliver To Phone Number': f"'{phone_number}",
                    'Deliver To Address Line 1': row['Delivery Street Name&Number'],
                    'Item Description': row['Item\'s Variant'].replace('Product', '')})

            shipping_csv = csv.DictWriter(shipping_file, fieldnames=fieldnames)
            counter = 0
            for row in rows:
                if row['Deliver To Name'] in shipping_names:
                    continue  # row already exists in the output file, skip it
                counter += 1
                shipping_csv.writerow(row)
                print(f'[+] Successfully transferred row {counter}')
            input('\n[+] Process Finished.\nPress enter to exit.')
except FileNotFoundError:
    input('File names might have changed. Press Enter to Exit.')
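A small refinement of the duplicate check (a sketch, not part of the original solution): reading the existing names with csv.DictReader and storing them in a set avoids relying on the column sitting at index 12 and makes the membership test faster. It assumes the output file keeps a 'Deliver To Name' header:

import csv

# Sketch only: assumes 'Shipping CSV.csv' has a 'Deliver To Name' column header.
def existing_names(path='Shipping CSV.csv'):
    with open(path, 'r', encoding='utf-8-sig', newline='') as f:
        reader = csv.DictReader(f)
        # A set gives O(1) membership checks when filtering new rows.
        return {row['Deliver To Name'] for row in reader}

shipping_names = existing_names()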
Thanks to the answer above by scotscotmcc, it helped me get to this solution as well!
Upvotes: 0
Reputation: 3113
If I'm understanding right, another way to phrase what you ultimately want is the two files put together, but with duplicates removed, right? From what I'm reading, it also sounds like you don't have cases where a single field is being changed - is that right?
If that is the case, you may be better off using the pandas library to bring in both files as dataframes, then using append to, well, append the two dataframes, and then drop_duplicates.
Pandas can also read the csv files and write/append your final file.
import pandas as pd

# this has saved the data from the two different csv files as two dataframes
df1 = pd.read_csv('file1.csv')
df2 = pd.read_csv('file2.csv')

# rename the columns in df1 so you can append and they will line up
df1.rename(columns={
    'Deliver To Name': 'Delivery Customer',
    'Deliver To Business Name': 'Delivery Company Name',
    'Deliver To State': 'Delivery State',
    'Deliver To Suburb': 'Delivery City',
    'Deliver To Postcode': 'Delivery Zip Code',
    'Deliver To Phone Number': 'Shipping Phone'
}, inplace=True)

df1['Delivery State'] = df1['Delivery State'].str.strip('AU-')
df1['Delivery Zip Code'] = df1['Delivery Zip Code'].str.strip('"')
df1['Shipping Phone'] = df1['Shipping Phone'].str.strip('"')

# everything in df2 with df1 now appended. any columns that were different
# between the two will remain, but will have NaN or None for the values from the other df
df2 = df2.append(df1)

# looks for every field to be a duplicate. you can specify which columns to look at
df2 = df2.drop_duplicates()

df2.to_csv('file2.csv', index=False)  # index=False keeps pandas from writing the row index as an extra column
This solution still does not handle the case where a record already exists in your second file and you want to update it; it just takes the new info from the first file and adds it to the second (with the cleanup). See the sketch below for one way to approach updates.
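If you do want existing records updated, here is a minimal sketch (not part of the original answer) that continues from the df1/df2 built above and assumes 'Delivery Customer' uniquely identifies a record. It uses pd.concat because DataFrame.append was removed in pandas 2.0; keeping the last occurrence of each key means an updated row from file1 replaces the older row from file2:

import pandas as pd

# Sketch only: continues from the df1/df2 above (after the rename and cleanup),
# and assumes 'Delivery Customer' uniquely identifies a record.
# Newer rows (df1) go last; keeping the last occurrence of each key means an
# existing record is replaced by its updated version from file1.
merged = pd.concat([df2, df1], ignore_index=True)
merged = merged.drop_duplicates(subset='Delivery Customer', keep='last')
merged.to_csv('file2.csv', index=False)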
Upvotes: 1