Reputation: 190
I want to apply an IF statement to multiple columns (essentially an entire data frame) and am taking the approach of creating a function to do so. The aim is to replace the data in the columns with a number representing the group that number falls into.
The data sample looks as such:
> Mat
        A    B    C    D    E
E1   8.45 6.65 7.35 5.18 3.11
E2  12.59 4.18 4.08 0.95 1.75
E3  15.93 3.05 1.81 2.77 4.42
E4  15.93 3.05 1.81 2.77 4.42
E5  11.57 4.48 4.70 2.01 1.08
E6   8.17 7.05 7.70 5.38 3.45
E7  11.57 4.48 4.70 2.01 1.08
E8   9.49 5.41 6.51 5.78 3.20
E9  11.71 4.40 4.58 1.87 1.11
E10  9.52 5.49 6.63 6.07 3.49
The function I tried to create uses an IF statement to look at each value in a column and, depending on the value, replace it with a group number from 1 to 6 (for numbers between 1 and 10) or an NA for numbers greater than 10. The IF statement itself worked when I wrote it out manually for ONE column. The function I wrote (called Grouping) is:
# write user function to apply the loop
Grouping = function(data) {
  for(i in 1:length(x)) {
    if(x[i] < 1) {
      x[i] = 1
    } else if (x[i] < 3) {
      x[i] = 3
    } else if (x[i] < 4) {
      x[i] = 4
    } else if (x[i] < 5) {
      x[i] = 5
    } else if (x[i] < 10) {
      x[i] = 6
    } else
      x[i] = "NA"
  }
}
When I attempted to use apply with the function, the error was:
> apply(Mat, 1, Grouping)
Error in FUN(newX[, i], ...) : object 'x' not found
Clearly the problem is in how I constructed the user function, but I'm not sure where I've gone wrong, as I'm quite new to writing functions.
Any help is appreciated!
Thanks!
Upvotes: 2
Views: 2751
Reputation: 57686
You really should use ifelse when working on a vector, rather than a loop.
grouping <- function(x)
{
    ifelse(x < 1, 1,
    ifelse(x < 3, 3,
    ifelse(x < 4, 4,
    ifelse(x < 5, 5,
    ifelse(x < 10, 6,
           NA)))))
}
data[] <- lapply(data, grouping)
Or better yet, use cut to turn a numeric vector into bands:
grouping <- function(x)
{
    x <- cut(x, c(-Inf, 1, 3, 4, 5, 10), labels=c(1, 3, 4, 5, 6), right=FALSE)
    as.numeric(as.character(x))
}
data[] <- lapply(data, grouping)
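As a quick check, the sketch below applies both versions to the first three rows of the sample data from the question and compares the results. The names grouping_ifelse, grouping_cut and Mat_df are chosen here for illustration and do not appear in the original posts; the sketch also assumes the data is held in a data frame rather than a matrix.

```r
# The two versions above, renamed so both can exist side by side.
grouping_ifelse <- function(x) {
  ifelse(x < 1, 1,
  ifelse(x < 3, 3,
  ifelse(x < 4, 4,
  ifelse(x < 5, 5,
  ifelse(x < 10, 6, NA)))))
}

grouping_cut <- function(x) {
  # right = FALSE closes each band on the left: [-Inf, 1), [1, 3), ..., [5, 10).
  # Values of 10 or more fall outside every band and become NA.
  x <- cut(x, c(-Inf, 1, 3, 4, 5, 10), labels = c(1, 3, 4, 5, 6), right = FALSE)
  as.numeric(as.character(x))
}

# First three rows of the question's sample data, as a data frame.
Mat_df <- data.frame(
  A = c(8.45, 12.59, 15.93),
  B = c(6.65, 4.18, 3.05),
  C = c(7.35, 4.08, 1.81),
  D = c(5.18, 0.95, 2.77),
  E = c(3.11, 1.75, 4.42),
  row.names = c("E1", "E2", "E3")
)

# Assigning into res[] keeps the data frame structure (row and column names).
res_ifelse <- Mat_df
res_ifelse[] <- lapply(Mat_df, grouping_ifelse)

res_cut <- Mat_df
res_cut[] <- lapply(Mat_df, grouping_cut)

print(res_cut)
print(all.equal(res_ifelse, res_cut))
```

On these rows, column A becomes 6, NA, NA, since 12.59 and 15.93 are both above the last break at 10.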
Upvotes: 2
Reputation: 3711
Here is one way: rename the function's argument from data to x and drop the loop, so that the function handles a single value at a time:
Grouping = function(x) {
  if(x < 1) {
    x = 1
  } else if (x < 3) {
    x = 3
  } else if (x < 4) {
    x = 4
  } else if (x < 5) {
    x = 5
  } else if (x < 10) {
    x = 6
  } else
    x = "NA"
}
Dummy data:
> set.seed(1)
> mat<-matrix(rnorm(100,5,5), nrow=10)
> mat
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 1.8677309 12.558906 9.594887 11.793398 4.177382 6.9905294 17.008089 7.3775476 2.156656 2.287400
[2,] 5.9182166 6.949216 8.910682 4.486061 3.733192 1.9398680 4.803800 1.4502678 4.324107 11.039339
[3,] 0.8218569 1.893797 5.372825 6.938358 8.484817 6.7055985 8.448697 8.0536318 10.890435 10.802013
[4,] 12.9764040 -6.073499 -4.946758 4.730975 7.783316 -0.6468155 5.140011 0.3295118 -2.617834 8.501068
[5,] 6.6475389 10.624655 8.099129 -1.885298 1.556222 12.1651185 1.283634 -1.2681670 7.969731 12.934167
[6,] 0.8976581 4.775332 4.719356 2.925027 1.462524 14.9019995 5.943961 6.4572312 6.664752 7.792432
[7,] 7.4371453 4.919049 4.221022 3.028550 6.822910 3.1638926 -4.024793 2.7835406 10.315499 -1.382961
[8,] 8.6916235 9.719181 -2.353762 4.703433 8.842665 -0.2206731 12.327774 5.0055268 3.479080 2.133673
[9,] 7.8789068 9.106106 2.609250 10.500127 4.438269 7.8485981 5.766267 5.3717066 6.850094 -1.123063
[10,] 3.4730581 7.969507 7.089708 8.815879 9.405539 4.3247270 15.863058 2.0523953 6.335494 2.632997
Apply the function to each element and rebuild the matrix:
> matrix(lapply(mat, Grouping), nrow = 10)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 3 "NA" 6 "NA" 5 6 "NA" 6 3 3
[2,] 6 6 6 5 4 3 5 3 5 "NA"
[3,] 1 3 6 6 6 6 6 6 "NA" "NA"
[4,] "NA" 1 1 5 6 1 6 1 1 6
[5,] 6 "NA" 6 1 3 "NA" 3 1 6 "NA"
[6,] 1 5 5 3 3 "NA" 6 6 6 6
[7,] 6 5 5 4 6 4 1 3 "NA" 1
[8,] 6 6 1 5 6 1 "NA" 6 4 3
[9,] 6 6 3 "NA" 5 6 6 6 6 1
[10,] 4 6 6 6 6 5 "NA" 3 6 3
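For comparison, here is a minimal sketch of the loop-based function from the question with three changes: the argument is named x to match the body, a real NA replaces the string "NA" so the result stays numeric, and the modified vector is returned at the end. The name grouping_loop and the small matrix m are illustrative and not from the original posts, and the sketch assumes the input has no missing values (an NA in a scalar if condition would raise an error).

```r
grouping_loop <- function(x) {
  for (i in seq_along(x)) {
    if (x[i] < 1) {
      x[i] <- 1
    } else if (x[i] < 3) {
      x[i] <- 3
    } else if (x[i] < 4) {
      x[i] <- 4
    } else if (x[i] < 5) {
      x[i] <- 5
    } else if (x[i] < 10) {
      x[i] <- 6
    } else {
      x[i] <- NA  # real missing value, not the string "NA"
    }
  }
  x  # return the modified vector; without this the function returns NULL
}

# A few values from the question's sample data (filled column by column).
m <- matrix(c(8.45, 12.59, 6.65, 4.18, 0.95, 2.77), nrow = 2,
            dimnames = list(c("E1", "E2"), c("A", "B", "D")))

# MARGIN = 2 passes each column to the function, so the shape is preserved.
grouped <- apply(m, 2, grouping_loop)
print(grouped)
```

With MARGIN = 1, as in the question's call, apply passes each row instead and returns the result transposed, so MARGIN = 2 is the more convenient choice here.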
Upvotes: 1