phill

Reputation: 95

Ifelse statement with 4 conditions

With the following sample data, I'm trying to create a new column "NOTA_NUM" (taking the value 0, 1, 2, 3 or 4) in my data frame (df), based on which of five values ("A", "B", "C", "D", "E") appears in an existing column (column1).

I have already tried:

df$NOTA_NUM <- ifelse(rowSums(df[ , "column1"]=="A"), 0,
        ifelse(rowSums(df[ , "column1"]=="B"), 1,
               ifelse(rowSums(df[ ,"column1"]=="C"), 2,
                      ifelse(rowSums(df[ , "column1"]=="D",3,4))

but that didn't work the way I would like.

I want "NOTA_NUM" to look like:

column1   NOTA_NUM
A             0
C             2
B             1
D             3
E             4

Upvotes: 0

Views: 1045

Answers (4)

Will Oldham

Reputation: 1054

I like dplyr::case_when for these situations:

library(dplyr)

df <- data.frame(column1 = c("A", "C", "B", "D", "E")) %>% 
  mutate(NOTA_NUM = case_when(column1 == "A" ~ 0, 
                              column1 == "B" ~ 1, 
                              column1 == "C" ~ 2, 
                              column1 == "D" ~ 3, 
                              TRUE ~ 4))

Upvotes: 1

G. Grothendieck

Reputation: 270438

Here are some approaches. No packages are used.

1) match. Using DF, shown reproducibly in the Note at the end, match each element in column1 to LETTERS[1:4], using 5 where there is no match, then subtract 1.

transform(DF, NOTA_NUM = match(column1, LETTERS[1:4], nomatch = 5) - 1)

giving:

  column1 NOTA_NUM
1       A        0
2       C        2
3       B        1
4       D        3
5       E        4

2) switch. Another possibility is to use switch:

transform(DF, NOTA_NUM = sapply(column1, switch, A = 0, B = 1, C = 2, D = 3, 4))

3) arithmetic. This uses an arithmetic expression that evaluates to the required values:

transform(DF, NOTA_NUM = (0-4) * (column1 == "A") + 
                         (1-4) * (column1 == "B") + 
                         (2-4) * (column1 == "C") + 
                         (3-4) * (column1 == "D") + 
                         4)

Note

DF <- data.frame(column1 =  c("A", "C", "B", "D", "E"), stringsAsFactors = FALSE)
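
As a rough check on the three approaches above, the following base R sketch shows the intermediate result of match and confirms that all three give the same values for the sample DF. The names idx, by_match, by_switch and by_arith are illustrative only and are not part of the original answer.

```r
DF <- data.frame(column1 = c("A", "C", "B", "D", "E"), stringsAsFactors = FALSE)

# match() returns the position of each value in LETTERS[1:4];
# values not found (here "E") get the nomatch value 5
idx <- match(DF$column1, LETTERS[1:4], nomatch = 5)
idx  # 1 3 2 4 5

by_match <- idx - 1

# switch() falls through to the unnamed last argument (4) when nothing matches;
# unname() drops the names that sapply() attaches for character input
by_switch <- unname(sapply(DF$column1, switch, A = 0, B = 1, C = 2, D = 3, 4))

# comparisons are coerced to 0/1, so at most one correction term is non-zero per row
by_arith <- (0 - 4) * (DF$column1 == "A") +
            (1 - 4) * (DF$column1 == "B") +
            (2 - 4) * (DF$column1 == "C") +
            (3 - 4) * (DF$column1 == "D") + 4

# all three approaches agree on the sample data
stopifnot(all(by_match == by_switch), all(by_match == by_arith))
```

Note that switch expects a character string here, so this sketch assumes column1 is a character column, as in the Note.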

Upvotes: 4

neilfws

Reputation: 33822

Not sure that I'd recommend as.numeric(factor(...)) as a general solution, but it works for your case:

library(dplyr)

set.seed(1001) # for reproducible sample
# column1 must be a factor; stringsAsFactors = TRUE was the default before R 4.0.0
data.frame(column1 = sample(LETTERS[1:5], 50, replace = TRUE),
           stringsAsFactors = TRUE) %>% 
  mutate(NOTA_NUM = as.numeric(column1)-1)
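
One reason for caution: the factor codes depend on which values happen to be present, because the default levels are the sorted unique values. A minimal base R sketch of the difference, assuming the intended mapping is A=0 through E=4 (x and codes are hypothetical names):

```r
x <- c("A", "C", "E")  # "B" and "D" happen to be absent

# default levels are the sorted unique values: A, C, E -> codes 1, 2, 3
as.integer(factor(x)) - 1L                       # 0 1 2

# fixing the levels keeps the mapping stable: A=0, B=1, C=2, D=3, E=4
codes <- as.integer(factor(x, levels = LETTERS[1:5])) - 1L
codes                                            # 0 2 4
```

Setting levels explicitly also turns any unexpected value into NA instead of silently shifting the codes.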

Upvotes: 0

IRTFM

Reputation: 263499

I would avoid ifelse for this purpose. You should employ a more efficient and compact approach to a table lookup or conversion. Try using a named vector as the table and pass the inputs to the "[" function:

> lookup = c(A=0, C= 2, B =  1, D= 3, E = 4)
> df <- data.frame( cl1 = names(lookup), stringsAsFactors = FALSE)
> df
  cl1
1   A
2   C
3   B
4   D
5   E
> df$NOTA_NUM= lookup[df$cl1]
> df
  cl1 NOTA_NUM
1   A        0
2   C        2
3   B        1
4   D        3
5   E        4

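If you need the results to be letters, quote them in the lookup vector, but beware that before R 4.0.0 the data.frame function converts strings to factors unless you explicitly prevent that default. Indexing the lookup vector with a factor uses the factor's integer codes rather than its labels, which silently returns the wrong values. See ?data.frame for the proper use of the stringsAsFactors parameter.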
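
A small sketch of how the named-vector lookup behaves with an unknown key and with factor input, in base R. The names keys, keys_f, res, wrong and right are made up for illustration, and "Z" is an invented value that is not in the table.

```r
lookup <- c(A = 0, C = 2, B = 1, D = 3, E = 4)
keys <- c("A", "C", "B", "Z")        # "Z" is not in the table

# character indexing matches on names; unknown keys give NA
res <- unname(lookup[keys])
res                                   # 0 2 1 NA

# a factor is indexed by its integer codes, not its labels
keys_f <- factor(c("A", "C", "B"))
wrong <- unname(lookup[keys_f])               # 0 1 2  (positions 1, 3, 2)
right <- unname(lookup[as.character(keys_f)]) # 0 2 1
```

Wrapping the index in as.character() is a cheap safeguard when you are not sure whether the column is a factor.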

Upvotes: 0
