Reputation: 309
# Create a data frame
> df <- data.frame(a = rnorm(7), b = rnorm(7), c = rnorm(7), threshold = rnorm(7))
> df <- round(abs(df), 2)
>
> df
a b c threshold
1 1.17 0.27 1.26 0.19
2 1.41 1.57 1.23 0.97
3 0.16 0.11 0.35 1.34
4 0.03 0.04 0.10 1.50
5 0.23 1.10 2.68 0.45
6 0.99 1.36 0.17 0.30
7 0.28 0.68 1.22 0.56
>
>
# Replace values in columns a, b, and c with NA if > value in threshold
> df[1:3][df[1:3] > df[4]] <- "NA"
Error in Ops.data.frame(df[1:3], df[4]) :
‘>’ only defined for equally-sized data frames
There could be some obvious solutions that I am incapable of producing. The intent is to replace values in columns "a", "b", and "c" with NA if the value is larger than that in "threshold". And I need to do that row-by-row.
If I had done it right, the df would look like this:
a b c threshold
1 NA NA NA 0.19
2 NA NA NA 0.97
3 0.16 0.11 0.35 1.34
4 0.03 0.04 0.10 1.50
5 0.23 NA NA 0.45
6 NA NA 0.17 0.30
7 0.28 NA NA 0.56
I had also tried the apply() approach but to no avail. Can you help, please??
Upvotes: 2
Views: 370
Reputation: 26343
The problem with your code was the usage of df[4]
instead of df[, 4]
. The difference is that df[4]
returns a data.frame
with one column and df[, 4]
returns a vector.
That's why
df[1:3] > df[4]
returns
error in Ops.data.frame(df[1:3], df[4]) : ‘>’ only defined for equally-sized data frames
While this works as expected
df[1:3][df[1:3] > df[, 4]] <- NA
df
# a b c threshold
#1 0.63 0.74 NA 0.78
#2 NA NA 0.04 0.07
#3 0.84 0.31 0.02 1.99
#4 NA NA NA 0.62
#5 NA NA NA 0.06
#6 NA NA NA 0.16
#7 0.49 NA 0.92 1.47
data
set.seed(1)
df <- data.frame(a = rnorm(7), b = rnorm(7), c = rnorm(7), threshold = rnorm(7))
df <- round(abs(df), 2)
Upvotes: 1
Reputation: 2359
You can use apply function across dataframe
df[,c(1:3)]<- apply(df[,c(1:3),drop=F], 2, function(x){ ifelse(x>df[,4],NA,x)})
Upvotes: 2
Reputation: 3183
You should use dplyr
for most of such use cases.
One way below:
> set.seed(10)
> df <- data.frame(a = rnorm(7), b = rnorm(7), c = rnorm(7), threshold = rnorm(7))
> df <- round(abs(df), 2)
> df
a b c threshold
1 0.02 0.36 0.74 2.19
2 0.18 1.63 0.09 0.67
3 1.37 0.26 0.95 2.12
4 0.60 1.10 0.20 1.27
5 0.29 0.76 0.93 0.37
6 0.39 0.24 0.48 0.69
7 1.21 0.99 0.60 0.87
>
> df %>%
+ mutate_at(vars(a:c), ~ifelse(.x > df$threshold, NA, .x))
a b c threshold
1 0.02 0.36 0.74 2.19
2 0.18 NA 0.09 0.67
3 1.37 0.26 0.95 2.12
4 0.60 1.10 0.20 1.27
5 0.29 NA NA 0.37
6 0.39 0.24 0.48 0.69
7 NA NA 0.60 0.87
Upvotes: 3
Reputation: 2253
You can use a for-loop like this:
for(i in 1:(ncol(df)-1)){
df[, i] <- ifelse(df[, i] > df[, 4], NA, df[, i])
}
Upvotes: 0