Reputation: 1108
I have a data table functionally equivalent to x
below:
x <- data.table::data.table(id = c(1,2,3,4), "1" = c("a", "a"), "2" = c("b", NA), "3" = c(NA, "b"), "4" = c("c", "c"))
id 1 2 3 4
1: 1 a b <NA> c
2: 2 a <NA> b c
3: 3 a b <NA> c
4: 4 a <NA> b c
y <- data.table::melt(x, id.vars = "id", measure.vars = c("1", "2", "3", "4"), variable.name = "col", value.name = "letter") %>% na.omit("letter") %>% data.table::setorder(id)
y
id col letter
1: 1 1 a
2: 1 2 b
3: 1 4 c
4: 2 1 a
5: 2 3 b
6: 2 4 c
7: 3 1 a
8: 3 2 b
9: 3 4 c
10: 4 1 a
11: 4 3 b
12: 4 4 c
I am melting x
as above. However, this leaves me with "gaps" in the col
column where an NA value was in the wide data. For example, for the id == 1
records, col
is equal to 1
, 2
, and 4
. I would prefer for the numbers in the col
column to be contiguous regardless of their position in the original dataset. I need to be able to "slide" the data over to fill left. Prior to melting, I need to be able to make x
look like this:
x <- data.table::data.table(id = c(1,2,3,4), "1" = c("a", "a"), "2" = c("b", "b"), "3" = c("c", "c"), "4" = c(NA, NA))
x
id 1 2 3 4
1: 1 a b c NA
2: 2 a b c NA
3: 3 a b c NA
4: 4 a b c NA
That way, when I melt it, I get:
y <- data.table::melt(x, id.vars = "id", measure.vars = c("1", "2", "3", "4"), variable.name = "col", value.name = "letter") %>% na.omit("letter") %>% data.table::setorder(id)
y
id col letter
1: 1 1 a
2: 1 2 b
3: 1 3 c
4: 2 1 a
5: 2 2 b
6: 2 3 c
7: 3 1 a
8: 3 2 b
9: 3 3 c
10: 4 1 a
11: 4 2 b
12: 4 3 c
Upvotes: 1
Views: 38
Reputation: 389155
You could adjust the y
data by creating a row number for each id
.
library(data.table)
y[, col := seq_len(.N), id]
y
# id col letter
# 1: 1 1 a
# 2: 1 2 b
# 3: 1 3 c
# 4: 2 1 a
# 5: 2 2 b
# 6: 2 3 c
# 7: 3 1 a
# 8: 3 2 b
# 9: 3 3 c
#10: 4 1 a
#11: 4 2 b
#12: 4 3 c
Upvotes: 0
Reputation: 887541
We can do a group by 'id' unlist
, remove the NA
s and then create the 'col' with rowid
on 'id' column
library(data.table)
x[, .(letter = na.omit(unlist(.SD))), id][, col := rowid(id)][]
# id letter col
# 1: 1 a 1
# 2: 1 b 2
# 3: 1 c 3
# 4: 2 a 1
# 5: 2 b 2
# 6: 2 c 3
# 7: 3 a 1
# 8: 3 b 2
# 9: 3 c 3
#10: 4 a 1
#11: 4 b 2
#12: 4 c 3
or with melt
, reshape to 'long' format, then grouped by 'id', reorder the 'letter' based on the NA
element, and then use the na.omit
na.omit(melt(x, id.var = 'id', variable.name = "col",
value.name = "letter")[,
letter := letter[order(is.na(letter))], id][order(id)])
# id col letter
# 1: 1 1 a
# 2: 1 2 b
# 3: 1 3 c
# 4: 2 1 a
# 5: 2 2 b
# 6: 2 3 c
# 7: 3 1 a
# 8: 3 2 b
# 9: 3 3 c
#10: 4 1 a
#11: 4 2 b
#12: 4 3 c
Upvotes: 1