Reputation: 165
I have a data frame of events that have two columns for attendance IDs It looks like this.
event<-c(1:20)
name1<-c(101:120)
name2<-c(rep(NA,15),201:205)
df<-data.frame(event,name1,name2)
I want all the names in one column so I can join it to mult other data sets. It should look like this.
event<-c(1:16,16,17,17,18,18,19,19,20,20)
name<-c(101:116,201,117,202,118,203,119,204,120,205)
desired_df<-data.frame(event,name)
What is a good mutation/transformation to use for this process?
Upvotes: 2
Views: 40
Reputation: 102529
There are already excellent solutions with @jay.sf's stats::reshape
and @L Tyrone's tidyr::pivot_longer
, and here is just another possible option you might be interested (with data.table
)
> library(data.table)
> setorder(na.omit(melt(setDT(df), id.vars = "event", value.name = "name"))[, variable := NULL])[]
event name
<int> <int>
1: 1 101
2: 2 102
3: 3 103
4: 4 104
5: 5 105
6: 6 106
7: 7 107
8: 8 108
9: 9 109
10: 10 110
11: 11 111
12: 12 112
13: 13 113
14: 14 114
15: 15 115
16: 16 116
17: 16 201
18: 17 117
19: 17 202
20: 18 118
21: 18 203
22: 19 119
23: 19 204
24: 20 120
25: 20 205
event name
Another base R implementation with reshape
(similar to @jay.sf's answer)
sort_by(
na.omit(
reshape(
df,
direction = "long",
varying = -1,
v.names = "name",
timevar = "name"
)
), ~id
)[-3]
gives
event name
1.1 1 101
2.1 2 102
3.1 3 103
4.1 4 104
5.1 5 105
6.1 6 106
7.1 7 107
8.1 8 108
9.1 9 109
10.1 10 110
11.1 11 111
12.1 12 112
13.1 13 113
14.1 14 114
15.1 15 115
16.1 16 116
16.2 16 201
17.1 17 117
17.2 17 202
18.1 18 118
18.2 18 203
19.1 19 119
19.2 19 204
20.1 20 120
20.2 20 205
Upvotes: 1
Reputation: 73562
Using reshape
.
> reshape(df, varying=2:3, sep='', direction='long')[c('event', 'name')] |>
+ na.omit() |> sort_by(~list(event, name))
event name
1.1 1 101
2.1 2 102
3.1 3 103
4.1 4 104
5.1 5 105
6.1 6 106
7.1 7 107
8.1 8 108
9.1 9 109
10.1 10 110
11.1 11 111
12.1 12 112
13.1 13 113
14.1 14 114
15.1 15 115
16.1 16 116
16.2 16 201
17.1 17 117
17.2 17 202
18.1 18 118
18.2 18 203
19.1 19 119
19.2 19 204
20.1 20 120
20.2 20 205
Upvotes: 2
Reputation: 7065
Using tidyr::pivot_longer()
:
library(tidyr)
desired_df <- df %>%
pivot_longer(
cols = c(name1, name2),
values_drop_na = TRUE,
names_to = NULL,
values_to = "name")
print(desired_df, n = Inf)
# # A tibble: 25 × 2
# event name
# <int> <int>
# 1 1 101
# 2 2 102
# 3 3 103
# 4 4 104
# 5 5 105
# 6 6 106
# 7 7 107
# 8 8 108
# 9 9 109
# 10 10 110
# 11 11 111
# 12 12 112
# 13 13 113
# 14 14 114
# 15 15 115
# 16 16 116
# 17 16 201
# 18 17 117
# 19 17 202
# 20 18 118
# 21 18 203
# 22 19 119
# 23 19 204
# 24 20 120
# 25 20 205
Upvotes: 2