M3Lba
M3Lba

Reputation: 165

R function to mutate 2 ID columns into two different rows/entries

I have a data frame of events that have two columns for attendance IDs It looks like this.

event<-c(1:20)
name1<-c(101:120)
name2<-c(rep(NA,15),201:205)

df<-data.frame(event,name1,name2)

I want all the names in one column so I can join it to mult other data sets. It should look like this.

event<-c(1:16,16,17,17,18,18,19,19,20,20)
name<-c(101:116,201,117,202,118,203,119,204,120,205)

desired_df<-data.frame(event,name)

What is a good mutation/transformation to use for this process?

Upvotes: 2

Views: 40

Answers (3)

ThomasIsCoding
ThomasIsCoding

Reputation: 102529

There are already excellent solutions with @jay.sf's stats::reshape and @L Tyrone's tidyr::pivot_longer, and here is just another possible option you might be interested (with data.table)

> library(data.table)

> setorder(na.omit(melt(setDT(df), id.vars = "event", value.name = "name"))[, variable := NULL])[]
    event  name
    <int> <int>
 1:     1   101
 2:     2   102
 3:     3   103
 4:     4   104
 5:     5   105
 6:     6   106
 7:     7   107
 8:     8   108
 9:     9   109
10:    10   110
11:    11   111
12:    12   112
13:    13   113
14:    14   114
15:    15   115
16:    16   116
17:    16   201
18:    17   117
19:    17   202
20:    18   118
21:    18   203
22:    19   119
23:    19   204
24:    20   120
25:    20   205
    event  name

Another base R implementation with reshape (similar to @jay.sf's answer)

sort_by(
    na.omit(
        reshape(
            df,
            direction = "long",
            varying = -1,
            v.names = "name",
            timevar = "name"
        )
    ), ~id
)[-3]

gives

     event name
1.1      1  101
2.1      2  102
3.1      3  103
4.1      4  104
5.1      5  105
6.1      6  106
7.1      7  107
8.1      8  108
9.1      9  109
10.1    10  110
11.1    11  111
12.1    12  112
13.1    13  113
14.1    14  114
15.1    15  115
16.1    16  116
16.2    16  201
17.1    17  117
17.2    17  202
18.1    18  118
18.2    18  203
19.1    19  119
19.2    19  204
20.1    20  120
20.2    20  205

Upvotes: 1

jay.sf
jay.sf

Reputation: 73562

Using reshape.

> reshape(df, varying=2:3, sep='', direction='long')[c('event', 'name')] |> 
+   na.omit() |> sort_by(~list(event, name))
     event name
1.1      1  101
2.1      2  102
3.1      3  103
4.1      4  104
5.1      5  105
6.1      6  106
7.1      7  107
8.1      8  108
9.1      9  109
10.1    10  110
11.1    11  111
12.1    12  112
13.1    13  113
14.1    14  114
15.1    15  115
16.1    16  116
16.2    16  201
17.1    17  117
17.2    17  202
18.1    18  118
18.2    18  203
19.1    19  119
19.2    19  204
20.1    20  120
20.2    20  205

Upvotes: 2

L Tyrone
L Tyrone

Reputation: 7065

Using tidyr::pivot_longer():

library(tidyr)

desired_df <- df %>%
  pivot_longer(
    cols = c(name1, name2),
    values_drop_na = TRUE,
    names_to = NULL,
    values_to = "name")

print(desired_df, n = Inf)
# # A tibble: 25 × 2
#    event  name
#    <int> <int>
#  1     1   101
#  2     2   102
#  3     3   103
#  4     4   104
#  5     5   105
#  6     6   106
#  7     7   107
#  8     8   108
#  9     9   109
# 10    10   110
# 11    11   111
# 12    12   112
# 13    13   113
# 14    14   114
# 15    15   115
# 16    16   116
# 17    16   201
# 18    17   117
# 19    17   202
# 20    18   118
# 21    18   203
# 22    19   119
# 23    19   204
# 24    20   120
# 25    20   205

Upvotes: 2

Related Questions