Reputation: 1316

Switch statement in R to replace every row with two new rows based on a column value

I have the following dataframe.

DATA

df <- structure(list(ID = structure(1:10, .Label = c("7519", "7522", 
"7525", "7526", "7527", "7530", "7537", "7538", "7541", "7542"
), class = "factor"), MN = c(2, 1, 0, 1, 2, 1, 0, 1, 2, 0)), class = "data.frame", row.names = c(NA, 
-10L))

     ID MN
1  7519  2
2  7522  1
3  7525  0
4  7526  1
5  7527  2
6  7530  1
7  7537  0
8  7538  1
9  7541  2
10 7542  0

I would like to turn this into a new dataframe with 20 rows based on MN. MN has 3 conditions: 0,1,2

If MN==0, I would like to add M=1, N=0

If MN==1, I would like to add M=1, N=1

If MN==2, I would like to add M=0, N=1

The desired is:

Result

      ID name  value
1   7519    M  0
2   7519    N  1
3   7522    M  1
4   7522    N  1
5   7525    M  1
6   7525    N  0
7   7526    M  1
8   7526    N  1
9   7527    M  0
10  7527    N  1
11  7530    M  1
12  7530    N  1
13  7537    M  1
14  7537    N  0
15  7538    M  1
16  7538    N  0
17  7541    M  0
18  7541    N  1
19  7542    M  1
20  7542    N  0

Upvotes: 2

Answers (3)

dc37

Reputation: 16178

You can create two new columns M and N based on MN values and then use pivot_longer function from tidyr to reshape your data into a longer format:

library(dplyr)
library(tidyr)
df %>% mutate(M = ifelse(MN == 2, 0,
                         ifelse(MN == 1, 1,0)),
              N = ifelse(MN == 2, 1,
                         ifelse(MN ==1, 1,0))) %>%
  pivot_longer(cols = c(M,N), names_to = "Name", values_to = "Value") %>% select(-MN)

# A tibble: 20 x 3
   ID    Name  Value
   <fct> <chr> <dbl>
 1 7519  M         0
 2 7519  N         1
 3 7522  M         1
 4 7522  N         1
 5 7525  M         0
 6 7525  N         0
 7 7526  M         1
 8 7526  N         1
 9 7527  M         0
10 7527  N         1
11 7530  M         1
12 7530  N         1
13 7537  M         0
14 7537  N         0
15 7538  M         1
16 7538  N         1
17 7541  M         0
18 7541  N         1
19 7542  M         0
20 7542  N         0

Upvotes: 2

www

Reputation: 39154

This is a method based on join and the convert to long-format.

library(tidyverse)

lookup <- tibble(
  MN = 0:2,
  M = c(1, 1, 0),
  N = c(0, 1, 1)
)

df2 <- df %>%
  left_join(lookup, by = "MN") %>%
  select(-MN) %>%
  pivot_longer(cols = -ID)
df2
 # # A tibble: 20 x 3
 #   ID    name  value
 #   <fct> <chr> <dbl>
 #  1 7519  M         0
 #  2 7519  N         1
 #  3 7522  M         1
 #  4 7522  N         1
 #  5 7525  M         1
 #  6 7525  N         0
 #  7 7526  M         1
 #  8 7526  N         1
 #  9 7527  M         0
 # 10 7527  N         1
 # 11 7530  M         1
 # 12 7530  N         1
 # 13 7537  M         1
 # 14 7537  N         0
 # 15 7538  M         1
 # 16 7538  N         1
 # 17 7541  M         0
 # 18 7541  N         1
 # 19 7542  M         1
 # 20 7542  N         0

Upvotes: 2

R. Schifini

Reputation: 9313

This can be done with a simple merge, but you need to create first the data frame that maps the MN value to the M and N and their corresponding values:

df_MN = data.frame(MN = c(0, 0, 1, 1, 2, 2), 
                   name = c("M", "N", "M", "N", "M", "N"), 
                   value = c(1, 0, 1, 1, 0, 1))

df2 = merge(df, df_MN, sort = F)

Result:

> df2
   MN   ID name value
1   2 7519    M     0
2   2 7519    N     1
3   2 7527    M     0
4   2 7527    N     1
5   2 7541    M     0
6   2 7541    N     1
7   1 7522    M     1
8   1 7522    N     1
9   1 7526    M     1
10  1 7526    N     1
11  1 7530    M     1
12  1 7530    N     1
13  1 7538    M     1
14  1 7538    N     1
15  0 7525    M     1
16  0 7525    N     0
17  0 7537    M     1
18  0 7537    N     0
19  0 7542    M     1
20  0 7542    N     0

There is no need for the sort = F, but the result is easier to compare to your expected result.

Upvotes: 2

Switch statement in R to replace every row with two new rows based on a column value

Answers (3)

Related Questions