Reputation: 3502
I randomly selected 30 values from variable a in the df data.frame.
set.seed(123)
date <- as.Date(seq(as.Date("2003-01-01"), as.Date("2003-05-31"), by = 1), format="%Y-%m-%d")
a <- runif(151, 0.005, 2.3)
df <- data.frame(date, a)
#select 30 random samples
rans <-sample(length(df$a), 30)
I tried this, and it replaced the values in df$a at the positions given by rans with NAs.
df[,2][rans] <- NA
But I want to replace all values in df$a that are NOT at the positions given by rans with NAs, so I tried the following, but it didn't work:
df[,2][!rans] <- NA #didn't work
df[,2][!rans %in% df] <- NA #replaced all values in df$a with NAs
Any suggestions how to do that?
Upvotes: 2
Views: 1796
Reputation: 887511
It may be better to avoid a negative index and use setdiff instead. With setdiff, we get the indices of the rows that are not found in 'rans', and then assign NA to the 2nd column values in those rows.
df[setdiff(seq_len(nrow(df)), rans),2] <- NA
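One possible reason to prefer setdiff over a negative index (an assumption about the motivation; the answer does not state it) is the edge case where the index vector is empty. The minimal sketch below uses a hypothetical vector x and a hypothetical empty index to show the difference.

```r
x <- c(10, 20, 30, 40)
empty <- integer(0)   # hypothetical case: no positions were sampled

# A negated empty index is still empty, so it selects nothing
# and no value is replaced.
neg <- x
neg[-empty] <- NA

# setdiff() still returns every position, so every value is replaced.
sdf <- x
sdf[setdiff(seq_along(x), empty)] <- NA

print(neg)
print(sdf)
```

With a non-empty index such as 'rans', both forms select the same rows.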
Or, instead of setdiff, we can use %in% to get a logical vector that is TRUE for the row indices found in 'rans', and then negate it (!) so that TRUE becomes FALSE and FALSE becomes TRUE. We then assign NA to the 2nd column values in the rows where the negated vector is TRUE.
df[!(seq_len(nrow(df)) %in% rans), 2] <- NA
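As a quick check, the sketch below rebuilds the question's data and applies both base R approaches to separate copies (df1 and df2 are names introduced here for illustration), then compares the results.

```r
set.seed(123)
df <- data.frame(
  date = seq(as.Date("2003-01-01"), as.Date("2003-05-31"), by = 1),
  a = runif(151, 0.005, 2.3)
)
rans <- sample(nrow(df), 30)

# Approach 1: row indices that are not in 'rans'
df1 <- df
df1[setdiff(seq_len(nrow(df1)), rans), 2] <- NA

# Approach 2: negated logical vector from %in%
df2 <- df
df2[!(seq_len(nrow(df2)) %in% rans), 2] <- NA

identical(df1, df2)                  # do both approaches agree?
sum(!is.na(df1$a))                   # how many values were kept
identical(df1$a[rans], df$a[rans])   # sampled positions left untouched?
```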
If we use data.table, we convert the 'data.frame' to a 'data.table' (setDT(df)) and assign NA to 'a' for the rows whose index is not in 'rans' (the same condition as above).
library(data.table)
setDT(df)[!(1:.N %in% rans), a:= NA]
Why didn't the OP's code work?
First option
df[,2][!rans] <- NA
didn't work because
!rans
#[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
#[23] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
gives all FALSE values.
The negation operator (!) converts every 0 in the vector/column to TRUE and every other value to FALSE. As 'rans' does not contain any 0, all of its elements were converted to FALSE. Assigning through a logical index that is all FALSE does not replace any value in the 2nd column with NA.
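A small illustration of that behaviour, using hypothetical vectors idx and x rather than the question's data:

```r
idx <- c(0, 3, 7)
neg_idx <- !idx        # TRUE only where the value is 0
print(neg_idx)

x <- c(1.5, 2.5, 3.5)
y <- x
y[!c(3, 7, 2)] <- NA   # the index is all FALSE, so nothing is replaced
print(y)
```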
Second option
df[,2][!rans %in% df] <- NA
'df' is a data.frame, so %in% compares each element of 'rans' against the columns of 'df' taken as whole elements, not against the individual values inside them. Nothing matches, so the result is all FALSE again.
rans %in% df
#[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
#[23] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
Negating the above makes every element TRUE. This 30-element logical index is recycled along the 2nd column, so it selects every value, and assigning NA to those elements gives a column that is entirely NA.
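Two details are at play here: a logical index shorter than the vector is recycled, and ! binds less tightly than %in%. The sketch below shows both with hypothetical toy vectors (x and short are names introduced for illustration).

```r
x <- c(1.5, 2.5, 3.5, 4.5, 5.5, 6.5)
short <- c(TRUE, TRUE)   # shorter than x, like 30 TRUEs against 151 rows
x[short] <- NA           # the index is recycled along the whole of x
all_na <- all(is.na(x))
print(all_na)

# `!` has lower precedence than %in%, so both lines are the same expression
same <- identical(!c(1, 2) %in% c(2, 3), !(c(1, 2) %in% c(2, 3)))
print(same)
```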
Upvotes: 1
Reputation: 698
You can try
df[-rans,2] <- NA
Negative indices exclude those positions, so every row except the ones in rans is set to NA.
Upvotes: 1