Reputation: 323
Hopefully this is an easy one; I just can't seem to piece together an answer. I have a data frame, and in each row I need to change some values to NA. The value to replace is not the same for every row: for each row, it is whatever value appears in a specified column of that row.
mydata = as.data.frame(rbind(c("AA","CC","BB","DC","CC"),c("CC","CC","BB","DC","BB"),c("BB","BB","BB","DC","DC")))
> mydata
V1 V2 V3 V4 V5
1 AA CC BB DC CC
2 CC CC BB DC BB
3 BB BB BB DC DC
#for each row, replace values that match the value in column 5 with NA
apply(mydata[,1:4], 1, function(x){
x[x %in% x$V5] = NA
})
Desired output
> mydata
V1 V2 V3 V4 V5
1 AA NA BB DC CC
2 CC CC NA DC BB
3 BB BB BB NA DC
Thanks!
----UPDATE----
Using the code below from arvi1000 works great for comparing values in a row to a single column of values. Is there a way to do something like this but comparing the values to 2 or more columns?
Current code
mydata[,1:4][mydata[,1:4]==mydata[,5]] <- NA
Let's say I also have a column 6. By row, I want to change values that do not equal values in columns 5 or 6 to NA.
mydata = as.data.frame(rbind(c("AA","CC","BB","DC","CC","AA"),c("CC","CC","BB","DC","BB","CC"),c("BB","BB","BB","DC","DC","BB")),stringsAsFactors=F)
> mydata
V1 V2 V3 V4 V5 V6
1 AA CC BB DC CC AA
2 CC CC BB DC BB CC
3 BB BB BB DC DC BB
Desired output
> mydata
V1 V2 V3 V4 V5 V6
1 AA CC NA NA CC AA
2 CC CC BB NA BB CC
3 BB BB BB DC DC BB
I tried to do this, but received an error
mydata[,1:4][mydata[,1:4]==mydata[,5]|mydata[,6]] <- NA
Error in mydata[, 1:4] == mydata[, 5] | mydata[, 6] :
operations are possible only for numeric, logical or complex types
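The error arises because `==` binds more tightly than `|`, so the expression is read as `(mydata[,1:4]==mydata[,5]) | mydata[,6]`, and `|` cannot be applied to the character column `mydata[,6]`. A minimal sketch of one way around it, assuming the goal is the desired output shown above: compare against each reference column separately, combine the two logical matrices, and negate. The name `keep` is introduced here only for illustration.

```r
# Rebuild the six-column example from the question as character columns.
mydata <- as.data.frame(
  rbind(c("AA", "CC", "BB", "DC", "CC", "AA"),
        c("CC", "CC", "BB", "DC", "BB", "CC"),
        c("BB", "BB", "BB", "DC", "DC", "BB")),
  stringsAsFactors = FALSE
)

# TRUE where a cell in columns 1-4 equals column 5 or column 6 of its own row.
keep <- mydata[, 1:4] == mydata[, 5] | mydata[, 1:4] == mydata[, 6]

# Set every cell that matched neither reference column to NA.
mydata[, 1:4][!keep] <- NA
```

Each comparison is written out in full because `|` needs a logical value on both sides; with more reference columns, the same pattern extends by adding one `==` term per column.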
Upvotes: 1
Reputation: 37879
Another way would be using apply:
mydata = as.data.frame(rbind(c("AA","CC","BB","DC","CC"),c("CC","CC","BB","DC","BB"),c("BB","BB","BB","DC","DC")))
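mydata <- data.frame(t(apply(mydata, 1, function(x) {
  last <- length(x)                # position of the reference column
  for (i in seq_len(last - 1)) {
    # set any earlier value that equals this row's reference value to NA
    if (x[i] == x[last]) {
      x[i] <- NA
    }
  }
  return(x)
})))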
output:
> mydata
V1 V2 V3 V4 V5
1 AA <NA> BB DC CC
2 CC CC <NA> DC BB
3 BB BB BB <NA> DC
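The same row-wise idea can be sketched without the inner loop. Inside `apply`, each row arrives as a plain character vector, so it is indexed by position rather than with `$`, and the function has to return the modified vector. The helper name `blank_matches` and the result name `result` are made up for this illustration.

```r
mydata <- as.data.frame(
  rbind(c("AA", "CC", "BB", "DC", "CC"),
        c("CC", "CC", "BB", "DC", "BB"),
        c("BB", "BB", "BB", "DC", "DC")),
  stringsAsFactors = FALSE
)

# Hypothetical helper: set to NA every value that equals the last element.
blank_matches <- function(x) {
  n <- length(x)
  hit <- x[-n] == x[n]       # compare all but the last value to the last one
  x[c(hit, FALSE)] <- NA     # never touch the reference value itself
  x                          # return the whole row, not just the assignment
}

# apply() returns one column per input row, so transpose before converting back.
result <- as.data.frame(t(apply(mydata, 1, blank_matches)),
                        stringsAsFactors = FALSE)
```

This should give the same table as the loop version, with the reference column `V5` left unchanged.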
Upvotes: 1
Reputation: 9582
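Add stringsAsFactors=FALSE to as.data.frame so the columns are character vectors rather than factors. This is key because 'CC'!='CC' when they are different levels of different factors. (Since R 4.0.0, stringsAsFactors=FALSE is the default, but setting it explicitly does no harm.)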
mydata = as.data.frame(rbind(c("AA","CC","BB","DC","CC"),c("CC","CC","BB","DC","BB"),c("BB","BB","BB","DC","DC")),
                       stringsAsFactors=FALSE)
Then:
mydata[,1:4][mydata[,1:4]==mydata[,5]] <- NA
Voila:
V1 V2 V3 V4 V5
1 AA <NA> BB DC CC
2 CC CC <NA> DC BB
3 BB BB BB <NA> DC
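The one-liner lines up by row because R recycles the length-3 vector `mydata[,5]` down each column of the comparison, so the cell in row i is always compared with `mydata[i,5]`. A small sketch that checks this against an explicit version; `ref`, `by_recycling` and `explicit` are names introduced only for this check.

```r
mydata <- as.data.frame(
  rbind(c("AA", "CC", "BB", "DC", "CC"),
        c("CC", "CC", "BB", "DC", "BB"),
        c("BB", "BB", "BB", "DC", "DC")),
  stringsAsFactors = FALSE
)

# Comparison as written in the answer: column 5 is recycled column by column.
by_recycling <- mydata[, 1:4] == mydata[, 5]

# Explicit equivalent: a 3 x 4 matrix whose every column is a copy of column 5.
ref <- matrix(mydata[, 5], nrow = nrow(mydata), ncol = 4)
explicit <- as.matrix(mydata[, 1:4]) == ref

# Both approaches flag the same cells.
stopifnot(all(by_recycling == explicit))
```

The reference vector has one entry per row, so it fills each column exactly once and no partial recycling occurs.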
Upvotes: 1