tiffkyn
tiffkyn

Reputation: 79

Melt data frame row by row

How can I melt a data frame row by row? I found a really similar question on the forum but I still can't solve my problem without a different id variable.

This is my data set:

V1 V2 V3 V4 V5
51 20 29 12 20
51 22 51 NA NA
51 14 NA NA NA
51 75 NA NA NA

And I want to melt it into:

V1 variable value    
51 V2 20
51 V3 29
51 V4 12
51 V5 20
51 V2 22
51 V3 51
51 V2 14
51 V2 75

Currently my approach is melting it row by row with a for loop and then rbind them together.

library(reshape)

df <- read.table(text = "V1 V2 V3 V4 V5 51 20 29 12 20 51 22 51 NA NA 51 
+14 NA NA NA 51 75 NA NA NA", header = TRUE)

dfall<-NULL
for (i in 1:NROW(df))
{
  dfmelt<-melt(df,id="V1",na.rm=TRUE)
  dfall<-rbind(dfall,dfmelt)
}

Just wondering if there is any way to do this faster? Thanks!

Upvotes: 5

Views: 4113

Answers (2)

akrun
akrun

Reputation: 887971

We replicate the first column "V1" and the names of the dataset except the first column name to create the first and second column of the expected output, while the 'value' column is created by transposing the dataset without the first column.

na.omit(data.frame(V1=df1[1][col(df1[-1])],
             variable = names(df1)[-1][row(df1[-1])],
              value = c(t(df1[-1]))))
#   V1 variable value
#1  51       V2    20
#2  51       V3    29
#3  51       V4    12
#4  51       V5    20
#5  51       V2    22
#6  51       V3    51
#9  51       V2    14
#13 51       V2    75

NOTE: No additional packages used.


Or we can use gather (from tidyr) to convert the 'wide' to 'long' format after we create a row id column (add_rownames from dplyr) and then arrange the rows.

library(dplyr)
library(tidyr)
add_rownames(df1) %>% 
        gather(variable, value, V2:V5, na.rm=TRUE) %>% 
        arrange(rowname, V1) %>% 
        select(-rowname)
#      V1 variable value
#    (int)    (chr) (int)
#1    51       V2    20
#2    51       V3    29
#3    51       V4    12
#4    51       V5    20
#5    51       V2    22
#6    51       V3    51
#7    51       V2    14
#8    51       V2    75

Or with data.table

library(data.table)
melt(setDT(df1, keep.rownames=TRUE),
      id.var= c("rn", "V1"), na.rm=TRUE)[
      order(rn, V1)][, rn:= NULL][]

Upvotes: 2

alistaire
alistaire

Reputation: 43364

You can make a column with a unique ID for each row, so you can sort on it after melting. Using dplyr:

library(reshape2)
library(dplyr)
df %>% mutate(id = seq_len(n())) %>% 
    melt(id.var = c('V1','id'), na.rm = T) %>% 
    arrange(V1, id, variable) %>% 
    select(-id)
#   V1 variable value
# 1 51       V2    20
# 2 51       V3    29
# 3 51       V4    12
# 4 51       V5    20
# 5 51       V2    22
# 6 51       V3    51
# 7 51       V2    14
# 8 51       V2    75

...or base R:

library(reshape2)
df$id <- seq_along(df$V1)
df2 <- melt(df, id.var = c('V1', 'id'), na.rm = TRUE)
df2[order(df2$V1, df2$id, df2$variable),-2]

Upvotes: 2

Related Questions