How to reshape columns with multi-row headers into long form with melt in R data.table

Question

I have a table that I want to transform from wide form to long form. It contains more than 200 columns, arranged in repeating groups of columns, like this:

Original data :

# dt
dt <- data.table("1" = c(NA,"Place","dan","uan","yan"),
                 "2" = c(NA,"Place_2","adan","duan","eyan"),
                 "3" = c("something","Male",1253,6643,4325),
                 "4" = c(1998,"Female",624,623,55),
                 "5" = c(NA,"Trans",13,51,51),
                 "6" = c("something2","Male",126,63643,725),
                 "7" = c(1999,"Female",284,243,557),
                 "8" = c(NA,"Trans",138,541,11))

Starting from the 4th column, every 3rd column holds the year value in its first row:

dt[1, seq(4, ncol(dt), by = 3), with = FALSE]

How can I efficiently collapse these multi-row headers into single column names so that I can melt the table?

Goal:

Place Place_2   Sex     Year    num
dan   adan      Male    1998    1253
dan   adan      Female  1998    624
dan   adan      Trans   1998    13
dan   adan      Male    1999    126
dan   adan      Female  1999    284
dan   adan      Trans   1999    138
uan   duan      Male    1998    6643
....

jazzurro · Accepted Answer

Here is what I tried. I think arranging the column names is the key here. The comments in the code below explain each step.

library(data.table)
library(magrittr)  # provides the %>% pipe used below

# Create new column names. Get the 1st row, search for years, repeat each year
# three times, and paste them with the three levels of sex.

unlist(dt[1, ]) %>%
    grep(pattern = "\\d{4}", value = TRUE) %>%
    rep(each = 3) %>%
    paste(., c("Male", "Female", "Trans"), sep = "_") -> foo

# Set new column names.
setnames(dt, c("Place_1", "Place_2", foo))

# Then, drop the two header rows and transform the data into long format.
# Create two new columns (i.e., year and sex), and drop the variable column.

melt(dt[-(1:2)], id.vars = 1:2, measure.vars = patterns("^\\d{4}"))[,
        c("year", "sex") := tstrsplit(variable, "_", fixed = TRUE)][, -"variable"] -> out

# Sort the result by Place_1 and Place_2. (This is only for showing the result.)
out[order(Place_1, Place_2)][]

#    Place_1 Place_2 value year    sex
# 1:     dan    adan  1253 1998   Male
# 2:     dan    adan   624 1998 Female
# 3:     dan    adan    13 1998  Trans
# 4:     dan    adan   126 1999   Male
# 5:     dan    adan   284 1999 Female
# 6:     dan    adan   138 1999  Trans
# 7:     uan    duan  6643 1998   Male
# 8:     uan    duan   623 1998 Female
# 9:     uan    duan    51 1998  Trans
#10:     uan    duan 63643 1999   Male
#11:     uan    duan   243 1999 Female
#12:     uan    duan   541 1999  Trans
#13:     yan    eyan  4325 1998   Male
#14:     yan    eyan    55 1998 Female
#15:     yan    eyan    51 1998  Trans
#16:     yan    eyan   725 1999   Male
#17:     yan    eyan   557 1999 Female
#18:     yan    eyan    11 1999  Trans
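
With 200+ columns, the sex labels do not have to be typed by hand. Below is a minimal sketch of the same renaming idea that reads them from the second row instead. It assumes every year spans exactly three adjacent columns and that row 2 holds the sex labels, as in the sample data; the names group_size, value_cols, new_names, and long are made up for illustration. It also converts Year and num to integers, because every column in the sample is stored as character.

```r
library(data.table)

# Sample data copied from the question.
dt <- data.table("1" = c(NA, "Place", "dan", "uan", "yan"),
                 "2" = c(NA, "Place_2", "adan", "duan", "eyan"),
                 "3" = c("something", "Male", 1253, 6643, 4325),
                 "4" = c(1998, "Female", 624, 623, 55),
                 "5" = c(NA, "Trans", 13, 51, 51),
                 "6" = c("something2", "Male", 126, 63643, 725),
                 "7" = c(1999, "Female", 284, 243, 557),
                 "8" = c(NA, "Trans", 138, 541, 11))

group_size <- 3L            # assumption: three sex columns per year
value_cols <- 3:ncol(dt)    # assumption: the first two columns are the places

# Row 1 carries the years (plus filler text and NAs), row 2 the sex labels.
years <- grep("^\\d{4}$", unlist(dt[1, value_cols, with = FALSE], use.names = FALSE),
              value = TRUE)
sexes <- unlist(dt[2, value_cols, with = FALSE], use.names = FALSE)

# Stop early if the layout does not match the assumption above.
stopifnot(length(years) * group_size == length(value_cols))

new_names <- paste(rep(years, each = group_size), sexes, sep = "_")

# Drop the two header rows; the subset is a new table, so dt stays untouched.
long <- dt[-(1:2)]
setnames(long, c("Place_1", "Place_2", new_names))

long <- melt(long, id.vars = c("Place_1", "Place_2"),
             value.name = "num", variable.factor = FALSE)
long[, c("Year", "Sex") := tstrsplit(variable, "_", fixed = TRUE)]
long[, variable := NULL]
long[, `:=`(Year = as.integer(Year), num = as.integer(num))]
setorder(long, Place_1, Place_2, Year)
```

The stopifnot() check is there so that a table whose year groups are not all three columns wide fails loudly instead of producing misaligned names.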
