Reputation: 718
I have two columns in a dataset after merging two seperate datasets. I would like to merge these columns into one column, BNR.x.
For the cases listed below my prefered outcomes would be:
1. Nothing. BNR.x has data, that's fine.
2. Nothing. Data in both colums is the same, that's fine.
3. Data from BNR.y is copied to BNR.x
4. Nothing. Same as 2.
5. Data in colums is different. Preferably i'd get an extra column with a 1 of FALSE as warning in this row.
6. No Data. Preferably i'd get a warning here aswell to notify me that i dont have any data for this item.
+----+-------+-------+
| ID | BNR.x | BNR.y |
+----+-------+-------+
| 1 | 123 | NA |
| 2 | 234 | 234 |
| 3 | NA | 345 |
| 4 | 456 | 456 |
| 5 | 678 | 677 |
| 6 | NA | NA |
+----+-------+-------+
Is there a method or package that will do this for me?
Upvotes: 3
Views: 2643
Reputation: 49033
If your data are in a data frame called d
, you can do :
## Copy BNR.y if BNR.x is missing
d$BNR.x[is.na(d$BNR.x)] <- d$BNR.y[is.na(d$BNR.x)]
## List the indices of BNR.x that are still missing
which(is.na(d$BNR.x))
## List the indices where BNR.x is different from BNR.y
which(d$BNR.x != d$BNR.y)
Upvotes: 1
Reputation: 508
From:
df
V1 V2 V3
1 1 123 NA
...
df[which(is.na(df$V2)),]$V2 <- df[which(is.na(df$V2)),]$V3
df$warn <- 0
df[which(is.na(df$V2)),]$warn <- 1
df[which(df$V2 != df$V3 & !is.na(df$V3)),]$warn <- 1
Ok, overuse of which and transform is nicer, but I have to start somewhere :)
ps. am I wrong or
d$BNR.x[is.na(d$BNR.x)] <- d$BNR.y
won't work because it will place "wrongly aligned" BNR$y values in correspondence to BNR$x NAs?
Upvotes: 0
Reputation: 81683
Here's a proposal. dat
is the name of the data frame:
idx <- is.na(dat$BNR.x) # create logical index for NAs in BNR.x
dat$BNR.x[idx] <- dat$BNR.y[idx] # replace NAs with values from BNR.y
# Add a logical column:
dat <- transform(dat, warn = is.na(BNR.x) | (BNR.x != BNR.y & !is.na(BNR.y)))
The result:
ID BNR.x BNR.y warn
1 1 123 NA FALSE
2 2 234 234 FALSE
3 3 345 345 FALSE
4 4 456 456 FALSE
5 5 678 677 TRUE
6 6 NA NA TRUE
Upvotes: 2