R sliding data left in data.table

Question

I have a data table functionally equivalent to x below:

x <- data.table::data.table(id = c(1,2,3,4), "1" = c("a", "a"), "2" = c("b", NA), "3" = c(NA, "b"), "4" = c("c", "c"))

   id 1    2    3 4
1:  1 a    b  c
2:  2 a     b c
3:  3 a    b  c
4:  4 a     b c
    
y <- data.table::melt(x, id.vars = "id", measure.vars = c("1", "2", "3", "4"), variable.name = "col", value.name = "letter") %>% na.omit("letter") %>% data.table::setorder(id)
y
    id col letter
 1:  1   1      a
 2:  1   2      b
 3:  1   4      c
 4:  2   1      a
 5:  2   3      b
 6:  2   4      c
 7:  3   1      a
 8:  3   2      b
 9:  3   4      c
10:  4   1      a
11:  4   3      b
12:  4   4      c

I am melting x as above. However, this leaves me with "gaps" in the col column where an NA value was in the wide data. For example, for the id == 1 records, col is equal to 1, 2, and 4. I would prefer for the numbers in the col column to be contiguous regardless of their position in the original dataset. I need to be able to "slide" the data over to fill left. Prior to melting, I need to be able to make x look like this:

x <- data.table::data.table(id = c(1,2,3,4), "1" = c("a", "a"), "2" = c("b", "b"), "3" = c("c", "c"), "4" = c(NA, NA))
x
   id 1 2 3  4
1:  1 a b c NA
2:  2 a b c NA
3:  3 a b c NA
4:  4 a b c NA

That way, when I melt it, I get:

y <- data.table::melt(x, id.vars = "id", measure.vars = c("1", "2", "3", "4"), variable.name = "col", value.name = "letter") %>% na.omit("letter") %>% data.table::setorder(id)
y
    id col letter
 1:  1   1      a
 2:  1   2      b
 3:  1   3      c
 4:  2   1      a
 5:  2   2      b
 6:  2   3      c
 7:  3   1      a
 8:  3   2      b
 9:  3   3      c
10:  4   1      a
11:  4   2      b
12:  4   3      c

akrun · Accepted Answer

We can do a group by 'id' unlist, remove the NAs and then create the 'col' with rowid on 'id' column

library(data.table)
x[, .(letter = na.omit(unlist(.SD))), id][, col := rowid(id)][]
#    id letter col
# 1:  1      a   1
# 2:  1      b   2
# 3:  1      c   3
# 4:  2      a   1
# 5:  2      b   2
# 6:  2      c   3
# 7:  3      a   1
# 8:  3      b   2
# 9:  3      c   3
#10:  4      a   1
#11:  4      b   2
#12:  4      c   3

or with melt, reshape to 'long' format, then grouped by 'id', reorder the 'letter' based on the NA element, and then use the na.omit

na.omit(melt(x, id.var = 'id', variable.name = "col", 
    value.name = "letter")[, 
        letter := letter[order(is.na(letter))], id][order(id)])
#    id col letter
# 1:  1   1      a
# 2:  1   2      b
# 3:  1   3      c
# 4:  2   1      a
# 5:  2   2      b
# 6:  2   3      c
# 7:  3   1      a
# 8:  3   2      b
# 9:  3   3      c
#10:  4   1      a
#11:  4   2      b
#12:  4   3      c

R sliding data left in data.table

Answers (2)

Related Questions