Arseassin
Arseassin

Reputation: 37

Using openpyxl to compare and modify columns from different files

I'm writing a program that does the following:

1.) Read a CSV file containing processed formatted data from another script

2.) Compare data from the CSV to an XLSX file that is used to track trends over time, sorted by sheets; if data categories from the CSV do not already exist in the XLSX, it will add that category to the bottom of the corresponding sheet; it should then enter the corresponding numerical values from the CSV to the end of the row in the XLSX

Heres what I have so far:

import csv
import openpyxl

logfile = 'logbook.xlsx'
wb = openpyxl.load_workbook(logfile)

with open ('working.csv', 'r') as csvfile:
    infile = csv.reader(csvfile, delimiter=',')
    for col in infile:
        if col[1] == 'typeCol':
            list = []
            list.append(col[3])
            ws = wb['typeCol']

This will read the desired column and add the unique entry types to a list.

What I want to do now is compare this list to a specific column in the logbook.xlsx, however, I cannot seem to figure out how to get openpyxl to iterate through a specific column on a specific sheet for comparison.

-UPDATE-

In response to the reply "There's not enough information given to determine a solution. What columns are in the CSV and in the excel sheet? Which column is the key field to do the match?"

The key field is the type - they will be an exact match in both the CSV and excel.

The CSV and excel columns will both contain a combination of letters and numbers and possibly symbols, so really anything. Example:

CSV:
Col1 Col2 Col3
typ1 asdf 1300
typ2 b14f 150
typ3 a8-j 11

XLSX:
Col1 Col2 Col3 Col4 Col5 Col6     SEP1 col populated by corresponding #'s
date ---- JUN1 JUL1 AUG1 SEP1     from CSV above
typ1 asdf 10   955  756  
typ2 b14f 0    6191 3435 
typ3 z1z9 919  0    1499 
-GENERATE TYP3 FROM CSV HERE AS NEW ROW-

Any assistance is greatly appreciated!

Upvotes: 1

Views: 959

Answers (1)

Mike67
Mike67

Reputation: 11342

This code should get the result you're looking for. It opens the csv and excel file and updates the excel file according to the key columns in the csv. If no matching key columns are found, it appends a row.

# Create CSV for test
cdata = '''
Col1 Col2 Col3
typ1 asdf 1300
typ2 b14f 150
typ3 a8-j 11
'''.strip()

with open ('working.csv', 'w') as csvfile:
    csvfile.write(cdata)


###################### Main Script ##########################

import csv
import openpyxl

logfile = 'logbook.xlsx'
wb = openpyxl.load_workbook(logfile)
ws = wb.worksheets[0]

curcol = 6  # SEP1

with open ('working.csv', 'r') as csvfile:
    infile = csv.reader(csvfile, delimiter=' ')
    for i, row in enumerate(infile):
       if i==0: continue  # skip headers
       for xr in range(3, ws.max_row+1):  # check excel sheet
          if ws.cell(xr,1).value == row[0] and ws.cell(xr,2).value == row[1]: # key columns match
              ws.cell(xr,curcol).value = float(row[2])  # copy csv value
              break  # found entry
       else: # did not find entry, must add row
          ws.cell(xr+1, 1).value = row[0]
          ws.cell(xr+1, 2).value = row[1]
          ws.cell(xr+1, curcol).value = float(row[2])
          
wb.save('logbookNew.xlsx')  

Before

BeforeUpdate

After

AfterUpdate

Upvotes: 1

Related Questions