Softcover Walk
Softcover Walk

Reputation: 51

Python compare two csv

How to compare columns and extract values in two csv files similar to Excel VLOOKUP?

a.csv
name,type
test1,A
test2,B
test3,A
test4,E
test5,C
test6,D


b.csv
type,value
A,1.0
B,0.5
C,0.75
D,0.25

Expected output after comparison of "type column", create a new csv file with these values

newfile.csv
name,type,value
test1,A,1.0
test2,B,0.5
test3,A,1.0
test4,E,N/A
test5,C,0.75
test6,D,0.25

So far, codes as below

A = 'a.csv'
B = 'b.csv'

df_B = pd.read_csv(B)

with open(A, 'r') as reference:
  with open('newfile.csv', 'w') as results:    
    reader = csv.reader(reference)
    writer = csv.writer(results)

    writer.writerow(next(reader, []) + ['value'])

    for row in reader:
      checkRecords = df_B.loc[df_B['type'] == row[1]]
      #checkRecords_A = df_B[df_B.type == row[1]].iloc[0] # IndexError: index 0 is out of bounds for axis 0 with size 0

      if checkRecords.empty:
        value = 'N/A'
      else:
        value = checkRecords.value
        print(value)
        # This value have name and dtype which is not expected

      writer.writerow(row + [value])
  results.close()

Upvotes: 0

Views: 315

Answers (1)

webb
webb

Reputation: 585

Using pandas, you can merge two DataFrames where one contains relevant information which will be used in the other DataFrame. Here's an example:

import pandas as pd

csv1 = pd.DataFrame({"name":["test1","test2","test3","test4","test5"],"type":["A","B","C","A","D"]})

csv2 = pd.DataFrame({"type":["A","B","C"],"value":[1,2,3]})

pd.merge(csv1, csv2, on="type", how='outer')

And the output would be:

name    type    value
test1   A   1.0
test4   A   1.0
test2   B   2.0
test3   C   3.0
test5   D   NaN

Upvotes: 3

Related Questions