Reputation: 5
I have two nearly identical files of customer information, and I would like to validate one file against the other. Is there a good way in R to pinpoint where 'Status' differs between the two files? I tried merging the two files and renaming Status in file 2 to Status2, but I ran into issues and started wondering whether there is a better way to go about this. My data looks like this:
file 1                    file 2
CustomerID  Status        CustomerID  Status
1709        low           1709        low
1803        high          1803        low
1951        medium        1951        medium
Upvotes: 0
Views: 51
Reputation: 5
Thank you for the help everyone. There was a slight difference in the names of the status variables that seems to have caused one not to import on the merge. When I made the names identical, everything worked smoothly.
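For anyone landing here later, the merge approach described above can be sketched roughly as follows. This is only a minimal sketch: the data frame names file1 and file2, the suffixes "1" and "2", and the resulting column names Status1 and Status2 are assumptions for illustration, and the data is the three-row sample from the question.

```r
# Sample data copied from the question; file1/file2 are assumed data frame names.
file1 <- data.frame(CustomerID = c(1709, 1803, 1951),
                    Status = c("low", "high", "medium"),
                    stringsAsFactors = FALSE)
file2 <- data.frame(CustomerID = c(1709, 1803, 1951),
                    Status = c("low", "low", "medium"),
                    stringsAsFactors = FALSE)

# Merge on CustomerID only; the clashing Status columns receive the suffixes,
# giving Status1 (from file1) and Status2 (from file2).
merged <- merge(file1, file2, by = "CustomerID", suffixes = c("1", "2"))

# Keep only the rows where the two status values disagree.
mismatches <- merged[merged$Status1 != merged$Status2, ]
print(mismatches)
```

Passing by = "CustomerID" explicitly matters here: if by is left out, merge() joins on every column name the two data frames share, so a small difference in the status column names changes what gets matched.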
Upvotes: 0
Reputation: 636
Assuming that you named your files file1 and file2, and that both are of equal length with their rows in the same CustomerID order, you can do:
unequal <- which(file1$Status != file2$Status)
This will return the row index numbers. If you want to have the CustomerID, you can do:
unequalCustomerID <- file1$CustomerID[unequal]
Or of course in one statement:
file1$CustomerID[which(file1$Status != file2$Status)]
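If the rows are not guaranteed to be in the same order, one possible variation is to line the rows up by CustomerID with match() before comparing. The following is a sketch under that assumption; the reordered file2 and the helper variable idx are made up for illustration.

```r
file1 <- data.frame(CustomerID = c(1709, 1803, 1951),
                    Status = c("low", "high", "medium"),
                    stringsAsFactors = FALSE)
# Hypothetical file2 listing the same customers in a different order.
file2 <- data.frame(CustomerID = c(1951, 1709, 1803),
                    Status = c("medium", "low", "low"),
                    stringsAsFactors = FALSE)

# For each row of file1, find the position of the same CustomerID in file2.
idx <- match(file1$CustomerID, file2$CustomerID)

# Compare each status in file1 with the status of the matching customer in file2.
unequal <- which(file1$Status != file2$Status[idx])
unequalCustomerID <- file1$CustomerID[unequal]
print(unequalCustomerID)
```

Note that a CustomerID missing from file2 yields NA in idx, and which() silently drops NA comparisons, so customers present in only one file would need a separate check.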
Upvotes: 1
Reputation: 467
If you read your files into data frames, say df1 and df2, you can try something like df1['status'] != df2['status']:
df1 = data.frame(id = c(1,2,3), status = c("low","low","high"))
df2 = data.frame(id = c(1,2,3), status = c("high","low","high"))
df1['status'] != df2['status']
     status
[1,]   TRUE
[2,]  FALSE
[3,]  FALSE
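To go from the TRUE/FALSE result to the actual differing rows, something along these lines might work. It is a sketch only: the names differs, report, status1 and status2 are invented for illustration, and it compares the columns as vectors with $ rather than as one-column data frames.

```r
df1 <- data.frame(id = c(1, 2, 3), status = c("low", "low", "high"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(id = c(1, 2, 3), status = c("high", "low", "high"),
                  stringsAsFactors = FALSE)

# Comparing the columns as vectors gives a plain logical vector.
differs <- df1$status != df2$status

# Show both status values side by side for the rows that differ.
report <- data.frame(id = df1$id[differs],
                     status1 = df1$status[differs],
                     status2 = df2$status[differs],
                     stringsAsFactors = FALSE)
print(report)
```

As with the comparison above, this relies on both data frames having the same number of rows in the same order.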
Upvotes: 0