Reputation: 95
I have three CSV files. The first (csv1) can be considered a positive dataset in which the first column (column 1) consists of certain IDs, and the same goes for the second column. The data in csv1 are paired, meaning the two entries in each row form a pair. Ex:
colA colB
A.1 B.1
C.1 D.1
Here, A.1 and B.1 can be considered a pair, and the same goes for C.1 and D.1. The second file (csv2) contains data only for the entries in column A of file 1. Ex:
Col X1 X2 X3 X4
A.1 0.1 0.2 0.3 0.4
C.1 0.2 0.3 0.4 0.5
Similarly, the third file (csv3) contains the data for the entries in column B of file 1. Ex:
Col X1 X2 X3 X4
B.1 0.1 0.2 0.3 0.4
D.1 0.2 0.3 0.4 0.5
I am writing code that first imports all three files, then iterates over the length of column A of file 1 and assigns the values in the current row of column A and column B to x and y respectively. After assigning x and y, I want to check whether these values are in file 2 (x value) and file 3 (y value). If they are, I want to extract the corresponding rows, concatenate them, and save them in a separate CSV.
So, if "x" is assigned the value A.1 (by assigning, I mean assigning the string A.1) and "y" is assigned the value B.1, then I want my code to first check whether A.1 is in file2 and B.1 is in file3. If they are, I want to extract the corresponding row values for A.1 (0.1,0.2,0.3,0.4...) and B.1 (0.2,0.3,0.4,0.5...) and concatenate their values:
col x1 x2 x3 x4 x5 x6
A.1_B.1 0.1 0.2 0.3 0.4 0.2 0.3
This is what I have written, but I am getting a KeyError, even though the ID is there when I check my CSV file. Any help would be much appreciated.
import pandas as pd

file1 = pd.read_csv("/home/file1.csv")
file2 = pd.read_csv("/home/file2.csv")
file3 = pd.read_csv("/home/file3.csv")
for i in range(len(file1['ID'])):
    x = ID_A[i]
    y = ID_B[i]
    if x in CT_ID_A:
        if y in CT_ID_B:
            d1 = file2.loc[x]
            d2 = file3.loc[y]
            d3 = pd.concat([d1,d2],axis=1)
Here, ID_A and ID_B consist of the corresponding IDs from the columns of file1, and CT_ID_A and CT_ID_B consist of the IDs from file2 and file3, that is:
ID_A = ['A.1','C.1']
ID_B = ['B.1','D.1']
CT_ID_A = ['A.1','C.1']
CT_ID_B = ['B.1','D.1']
Upvotes: 0
Views: 96
Reputation: 341
If your KeyError is for ID, then it is possible that your CSV file's header does not have a column named ID.
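A second possible cause, going only by the code shown in the question: read_csv gives the DataFrame a default 0..n-1 row index, so file2.loc['A.1'] raises a KeyError even when A.1 appears in the first column. Below is a minimal sketch of one way around both issues. It assumes the ID column in file2 and file3 is named Col and the pair columns in file1 are named colA and colB, as in the examples in the question. The in-memory CSV strings are hypothetical stand-ins for the real files, and IDs are assumed to be unique within each file.

```python
import io
import pandas as pd

# Hypothetical in-memory stand-ins for the three CSV files in the question.
csv1 = io.StringIO("colA,colB\nA.1,B.1\nC.1,D.1\n")
csv2 = io.StringIO("Col,X1,X2,X3,X4\nA.1,0.1,0.2,0.3,0.4\nC.1,0.2,0.3,0.4,0.5\n")
csv3 = io.StringIO("Col,X1,X2,X3,X4\nB.1,0.1,0.2,0.3,0.4\nD.1,0.2,0.3,0.4,0.5\n")

file1 = pd.read_csv(csv1)
# index_col makes the ID column the row index, so .loc[<ID>] looks up by ID
# instead of by the default integer position labels.
file2 = pd.read_csv(csv2, index_col="Col")
file3 = pd.read_csv(csv3, index_col="Col")

# Check the real header names before indexing with file1['...'].
print(list(file1.columns))

rows = {}
for x, y in zip(file1["colA"], file1["colB"]):
    # Keep the pair only if both IDs are present in their respective files.
    if x in file2.index and y in file3.index:
        rows[f"{x}_{y}"] = list(file2.loc[x]) + list(file3.loc[y])

# One output row per pair: file2 values followed by file3 values.
n_cols = len(file2.columns) + len(file3.columns)
result = pd.DataFrame.from_dict(
    rows, orient="index", columns=[f"x{i}" for i in range(1, n_cols + 1)]
)
result.index.name = "col"

# With no path, to_csv returns the CSV text; pass a file path to write to disk.
csv_text = result.to_csv()
print(csv_text)
```

If the headers in your real files differ, replace Col, colA and colB with the names printed by file1.columns (and file2.columns, file3.columns).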
Upvotes: 1