Meesha

Reputation: 821

Using ifelse in a data frame

The data frame that I am using is

> df <- data.frame(Name=c("Joy","Jane","Jack","Jad"),M1=c(10,40,55,90))
> df
  Name M1
1  Joy 10
2 Jane 40
3 Jack 55
4  Jad 90

> df$Final <- ifelse(df$M1<=50,60,max(75,df$M1))
> df
  Name M1 Final
1  Joy 10    60
2 Jane 40    60
3 Jack 55    90
4  Jad 90    90

If the M1 value is less than or equal to 50, I need 60 as the final value; if M1 is greater than 50, I need max(75, M1). In Jack's case, M1 is 55, so I should get max(75, 55), which is 75. Instead I get 90, so I think max is being taken over the entire M1 column. How can I avoid this?

Desired output

  Name M1 Final
1  Joy 10    60
2 Jane 40    60
3 Jack 55    75
4  Jad 90    90

Upvotes: 3

Views: 1269

Answers (5)

lmo

Reputation: 38520

You can also use pmax instead of max:

ifelse(df$M1 <= 50, 60, pmax(75, df$M1))

From the help file, pmax takes

one or more vectors (or matrices) as arguments and return(s) a single vector giving the ‘parallel’ maxima ... of the vectors. The first element of the result is the maximum ... of the first elements of all the arguments, the second element of the result is the maximum ... of the second elements of all the arguments and so on.

Thus the third argument to ifelse, the "else" value, is the pairwise maximum of 75 (recycled as many times as needed) and the values of df$M1.
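A minimal sketch of the difference, using the data frame from the question (the names scalar_max and parallel_max are only for illustration): max() collapses all of its arguments into a single number, whereas pmax() returns one value per element.

```r
# Data from the question
df <- data.frame(Name = c("Joy", "Jane", "Jack", "Jad"),
                 M1 = c(10, 40, 55, 90))

# max() pools 75 with every value of M1 and returns a single number
scalar_max <- max(75, df$M1)      # 90

# pmax() compares 75 with each element of M1 separately
parallel_max <- pmax(75, df$M1)   # 75 75 75 90

# The "else" branch now holds one candidate value per row
df$Final <- ifelse(df$M1 <= 50, 60, parallel_max)
df$Final                          # 60 60 75 90
```

Because ifelse picks element i of the "else" vector for row i, a length-one value such as scalar_max is simply recycled to every row, which is why Jack ended up with 90.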

Upvotes: 9

HubertL

Reputation: 19544

If df$M1 contains only positive (non-zero) integers, using a look-up table might be more efficient:

lookup <-  c(rep(60, 50),rep(75, 25), 76:max(df$M1,76))
lookup[df$M1]

If it also contains negative or zero integers:

lookup <-  c(rep(60, 50-min(df$M1)+1),rep(75, 25), 76:max(df$M1,76))
lookup[df$M1-min(df$M1)+1]
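As a rough sanity check of the look-up idea, the sketch below compares the first table against the ifelse/pmax formula on a hypothetical input (every integer from 1 to 100; the names x, from_lookup and from_ifelse are not from the answer).

```r
# Hypothetical test input: every integer from 1 to 100
x <- 1:100

# Table as in the answer: positions 1-50 hold 60, positions 51-75 hold 75,
# and positions from 76 upwards hold their own index
lookup <- c(rep(60, 50), rep(75, 25), 76:max(x, 76))
from_lookup <- lookup[x]

# Reference result from the vectorised formula
from_ifelse <- ifelse(x <= 50, 60, pmax(75, x))

# Check that both approaches agree on every value
all(from_lookup == from_ifelse)
```

The approach relies on the values being whole numbers, since they are used directly as positions in the table; R truncates a fractional index, so 55.9 would be looked up as 55.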

Upvotes: 0

Frank

Reputation: 66819

You're essentially describing a rule like...

  • up to 50, replace with 60
  • up to 75, replace with 75
  • up to x, replace with y
  • ...

If we put the rule into a data.frame, it is more explicit and probably allows for more efficient derivation of the results (instead of computing many inequalities). Here are two ways:

findInterval

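m = data.frame(up_to = c(50, 75), replace_with = c(60, 75))

# left.open = TRUE makes the intervals closed on the right, so that
# M1 == 50 falls under "up to 50" rather than "up to 75"
r = m$replace_with[ findInterval(df$M1, m$up_to, left.open = TRUE) + 1L ]
# values above the last threshold have no rule (r is NA) and keep their M1
df$Final = replace(df$M1, !is.na(r), na.omit(r))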
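How findInterval treats a value that sits exactly on a threshold depends on its left.open argument. A small sketch with hypothetical values (x, idx_default, idx_left_open and final are illustrative names, chosen to hit both boundaries):

```r
up_to        <- c(50, 75)
replace_with <- c(60, 75)
x <- c(10, 50, 55, 75, 90)   # hypothetical values, including both boundaries

# Default: intervals are closed on the left, so 50 lands in the second slot
idx_default <- findInterval(x, up_to)                      # 0 1 1 2 2

# left.open = TRUE: intervals are closed on the right, so 50 stays in "up to 50"
idx_left_open <- findInterval(x, up_to, left.open = TRUE)  # 0 0 1 1 2

# Map slots to replacement values; values above the last threshold give NA
r <- replace_with[idx_left_open + 1L]                      # 60 60 75 75 NA

# Keep the original value wherever no rule applies
final <- ifelse(is.na(r), x, r)                            # 60 60 75 75 90
```

With the data in the question no value equals 50 or 75, so both settings give the same result there; the difference only shows up on the boundaries.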

data.table rolling joins

library(data.table)    
setDT(df)

m = data.table(up_to = c(50, 75), replace_with = c(60, 75))

df[, Final := M1]
r = m[df, on=c(up_to = "M1"), roll=-Inf][!is.na(replace_with), Final := replace_with]$Final
df[, Final := r]

Upvotes: 3

C_Z_

Reputation: 7816

You can use dplyr with rowwise(), so that max() is evaluated one row at a time:

library(dplyr)

df %>%
  rowwise() %>%
  mutate(Final = ifelse(M1<=50,60,max(75,M1)))

Upvotes: -1

HubertL

Reputation: 19544

What about a nested ifelse:

ifelse(df$M1<=50,60,ifelse(df$M1>75,df$M1,75))

Upvotes: 3
