Reputation: 5

Comparing the same value across two datasets in R

I have two nearly identical files of customer information, and I would like to validate one against the other. Is there a good way in R to pinpoint where 'status' differs between the two files? I tried merging the two files and renaming status in file 2 to status2, but ran into issues and started wondering whether there is a better way to go about this. My data looks like this:

file 1                       file 2
CustomerID  Status           CustomerID  Status
1709        low              1709        low
1803        high             1803        low
1951        medium           1951        medium

Upvotes: 0

Answers (3)

J.Gorman

Reputation: 5

Thank you for the help everyone. There was a slight difference in the names of the status variables that seems to have caused one not to import on the merge. When I made the names identical, everything worked smoothly.
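A minimal sketch of what such a merge could look like, assuming the two files are already read into data frames called file1 and file2 with columns CustomerID and Status (the data below just mirrors the example in the question, and the suffixes are an arbitrary choice):

```r
# Example data mirroring the question; file1 and file2 are assumed names
file1 <- data.frame(CustomerID = c(1709, 1803, 1951),
                    Status = c("low", "high", "medium"),
                    stringsAsFactors = FALSE)
file2 <- data.frame(CustomerID = c(1709, 1803, 1951),
                    Status = c("low", "low", "medium"),
                    stringsAsFactors = FALSE)

# Merge on CustomerID; the suffixes keep the two Status columns apart
# (they become Status.1 and Status.2)
merged <- merge(file1, file2, by = "CustomerID", suffixes = c(".1", ".2"))

# Keep only the customers whose status differs between the two files
mismatch <- merged[merged$Status.1 != merged$Status.2, ]
mismatch
```

Because the rows are matched on CustomerID, this should not depend on the two files being sorted the same way.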

Upvotes: 0

StatMan

Reputation: 636

Assuming that you named your files file1 and file2, and that both have the same number of rows in the same customer order, you can do:

unequal <- which(file1$Status != file2$Status)

This will return the row index numbers. If you want to have the CustomerID, you can do:

unequalCustomerID <- file1$CustomerID[unequal]

Or of course in one statement:

file1$CustomerID[which(file1$Status != file2$Status)]
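The element-wise comparison only makes sense when row i of file1 and row i of file2 describe the same customer. If that is not guaranteed, one option is to align the rows by CustomerID with match() before comparing. A rough sketch with made-up data (the shuffled row order in file2 is purely illustrative):

```r
file1 <- data.frame(CustomerID = c(1709, 1803, 1951),
                    Status = c("low", "high", "medium"),
                    stringsAsFactors = FALSE)
# Same customers as file1, but in a different row order
file2 <- data.frame(CustomerID = c(1951, 1709, 1803),
                    Status = c("medium", "low", "low"),
                    stringsAsFactors = FALSE)

# For each row of file1, the row position of the same CustomerID in file2
idx <- match(file1$CustomerID, file2$CustomerID)

# Compare file1's status with the aligned status from file2
unequal <- which(file1$Status != file2$Status[idx])
unequalCustomerID <- file1$CustomerID[unequal]
unequalCustomerID
```

Customers that are missing from file2 get NA in idx, and which() silently drops them, so they would need a separate check (for example with is.na(idx)).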

Upvotes: 1

tokiloutok

Reputation: 467

If you read your files into data frames, say df1 and df2, you can try something like df1['status'] != df2['status']:

df1 = data.frame(id = c(1,2,3), status = c("low","low","high"))
df2 = data.frame(id = c(1,2,3), status = c("high","low","high"))

df1['status'] != df2['status']

     status
[1,]   TRUE
[2,]  FALSE
[3,]  FALSE

Upvotes: 0
