Luuk

Reputation: 75

If/else function on a data frame to create conditional matrix

Here is the data set for reproducibility:

    a=c(90.41,37.37,18.98)
    b=c(103.39,39.44,51.68)
    c=c(83.51,36.41,47.46)
    d=c(94.60,38.57,50.22)
    e=c(95.04,38.81,50.49)
    xx=t(data.frame(a,b,c,d,e))
    df=data.frame(xx)

And here is the if/else function I am trying to run on the data frame:

    classify=function(df){
      if (df>=110) {
        class="5"
      } else if (df<110 & df>=103){
        class="4"
      } else if (df<103 & df>=95){
        class="3"
      } else if (df<95 & df>=76){
        class="2"
      } else if (df<76){
        class="1"
      } else {
        class="none"
      }
    }

However, what I want the if/else function to produce is a new data frame that looks like this:

     df
     X1 X2 X3
    a  2  1  1
    b  4  1  1
    c  2  1  1
    d  2  1  1
    e  3  1  1

I am unsure how to do this, so any help would be much appreciated. I suspect something is wrong in the if/else function itself, but I am quite inexperienced and do not yet know how to track down errors in a script. Thank you!

Upvotes: 4

Views: 238

Answers (3)

utubun

Reputation: 4505

Quite a similar approach to your example, using case_when from dplyr:

library(dplyr)

classify <- function(x){
  case_when(
    x >= 110 ~ "5",
    x >= 103 & x < 110 ~ "4",
    x >= 95 & x < 103 ~ "3",
    x >= 76 & x < 95 ~ "2",
    x < 76 ~ "1",
    TRUE ~ "none"
  )
}

a = c(90.41, 37.37, 18.98)
b = c(103.39, 39.44, 51.68)
c = c(83.51, 36.41, 47.46)
d = c(94.60, 38.57, 50.22)
e = c(95.04, 38.81, 50.49)

df <- data.frame(matrix(c(a, b, c, d, e), ncol = 3, byrow = TRUE))

mutate_all(df, classify)

#  X1 X2 X3
#1  2  1  1
#2  4  1  1
#3  2  1  1
#4  2  1  1
#5  3  1  1

If the data also contains infinite, NaN, or missing values, for example:

df
#      X1    X2    X3
#1   -Inf 37.37 18.98
#2 103.39   NaN 51.68
#3  83.51 36.41 47.46
#4  94.60   Inf 50.22
#5  95.04 38.81    NA

The results look like this:

mutate_all(df, classify)
#  X1   X2   X3
#1  1    1    1
#2  4 none    1
#3  2    1    1
#4  2    5    1
#5  3    1 none
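
As for why the function in the question fails: `if` expects a single TRUE/FALSE, whereas `df >= 110` produces one logical value per cell (recent versions of R treat this as an error). For comparison, here is a minimal base R sketch of the same vectorised idea without dplyr, assuming nested `ifelse()` is an acceptable stand-in for `case_when`. The name `classify_vec` is made up for this illustration.

```r
# Hypothetical helper: a vectorised version of the question's if/else chain.
# Each ifelse() is evaluated element-wise, so a whole column can be passed in.
classify_vec <- function(x) {
  ifelse(is.na(x), "none",           # NA and NaN fall through to "none"
  ifelse(x >= 110, "5",
  ifelse(x >= 103, "4",
  ifelse(x >= 95,  "3",
  ifelse(x >= 76,  "2", "1")))))
}

# Same values as the question's data, laid out column by column
df <- data.frame(X1 = c(90.41, 103.39, 83.51, 94.60, 95.04),
                 X2 = c(37.37, 39.44, 36.41, 38.57, 38.81),
                 X3 = c(18.98, 51.68, 47.46, 50.22, 50.49))

# Assigning into res[] replaces every column but keeps the data frame shape
res <- df
res[] <- lapply(df, classify_vec)
res
```

Because each condition is only reached when the ones before it were FALSE, the upper bounds (`x < 110` and so on) do not need to be repeated.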

Upvotes: 2

Lennyy

Reputation: 6132

sapply(df, function(x) as.numeric(as.character(cut(x, c(-Inf, 76, 95, 103, 110, Inf), labels = 1:5, right = FALSE))))

     X1 X2 X3
[1,]  2  1  1
[2,]  4  1  1
[3,]  2  1  1
[4,]  2  1  1
[5,]  3  1  1

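Use cut to set the intervals (its 2nd argument) and the labels (its 3rd argument). By default cut closes each interval on the right, so right = FALSE is needed for boundary values such as 76 or 95 to match the >= conditions in the question. cut returns a factor, so convert it back to numeric if you prefer that. Since you want to run the function over the full data frame, use sapply or lapply; sapply returns a matrix, as in the output above.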
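
To make the boundary behaviour concrete, here is a small sketch comparing the two settings of `right` on values that sit exactly on the break points. The vector `boundary_values` and the result names are illustrative only and do not come from the question.

```r
breaks <- c(-Inf, 76, 95, 103, 110, Inf)

# Values chosen to sit just below or exactly on the break points
boundary_values <- c(75.99, 76, 95, 103, 110)

# Default: intervals are (a, b], so 76 still lands in class 1
default_right <- as.integer(cut(boundary_values, breaks, labels = 1:5))

# right = FALSE: intervals are [a, b), so 76 lands in class 2
left_closed <- as.integer(cut(boundary_values, breaks, labels = 1:5,
                              right = FALSE))

default_right
# 1 1 2 3 4
left_closed
# 1 2 3 4 5
```

With the data in the question no value falls exactly on a break, so both settings give the same table; the difference only shows up on the boundaries.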

Upvotes: 5

Rui Barradas

Reputation: 76673

You can do this with findInterval. All you have to do is to pass it a non-decreasing vector of break points.

classify <- function(DF, breaks = c(-Inf, 76, 95, 103, 110, Inf)){
  f <- function(x, breaks) findInterval(x, breaks)
  DF[] <- lapply(DF, f, breaks)
  DF
}

classify(df)
#  X1 X2 X3
#a  2  1  1
#b  4  1  1
#c  2  1  1
#d  2  1  1
#e  3  1  1
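
As a quick check of how the breaks line up with the conditions in the question, here is a short sketch on values chosen to sit on or next to the break points. By default findInterval treats each interval as closed on the left, which matches the `>=` comparisons; the vector `boundary_values` is made up for this illustration.

```r
breaks <- c(-Inf, 76, 95, 103, 110, Inf)

# Values just below and exactly on the break points
boundary_values <- c(75.99, 76, 94.99, 95, 103, 110)

# For each value, findInterval returns the index of the interval
# [breaks[i], breaks[i + 1]) that contains it
classes <- findInterval(boundary_values, breaks)
classes
# 1 2 2 3 4 5
```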

Upvotes: 2
