Jin
Jin

Reputation: 555

Convert wide to long format using melt without column name in long format dataframe

I have a dataframe that looks like this enter image description here

I want the dataframe to be converted to long format like this

enter image description here

This is the code I use

long_ex <- melt(wide_ex, id.vars = 5, na.rm=TRUE) #wide_ex is wide format dataframe

However, my end result looks like this

enter image description here

Is there a way to use melt without extracting the column names? Will gladly accept alternative functions if melt is not best suited for this job

Edit: Data output from dput

structure(list(ï..Column1 = c(NA, NA, NA, NA), Column2 = c(NA, 
NA, NA, NA), Column3 = c(NA, NA, NA, NA), Column4 = c(NA, NA, 
NA, NA), Column5 = structure(c(2L, 1L, 4L, 3L), .Label = c("Eric ", 
"Jim", "Matt", "Tim"), class = "factor"), Column6 = c(NA, NA, 
NA, NA), Column7 = structure(c(1L, 3L, 2L, 3L), .Label = c("Eric", 
"Erica", "Mary "), class = "factor"), Column8 = structure(c(3L, 
2L, 1L, 3L), .Label = c("Beth", "Loranda", "Matt"), class = "factor"), 
    Column9 = structure(c(2L, 3L, 1L, 3L), .Label = c("Courtney ", 
    "Heather ", "Patrick"), class = "factor"), Column10 = structure(4:1, .Label = c("Beth", 
    "Heather", "John", "Loranda "), class = "factor"), Column11 = c(NA, 
    NA, NA, NA), Column12 = c(NA, NA, NA, NA), Column13 = c(NA, 
    NA, NA, NA), Column14 = c(NA, NA, NA, NA), Column15 = c(NA, 
    NA, NA, NA)), class = "data.frame", row.names = c(NA, -4L
))

Upvotes: 1

Views: 579

Answers (2)

Khaynes
Khaynes

Reputation: 1986

If you want a base R solution:

data.frame(name_1 = rep(as.character(wide_ex$Column5), each=nrow(wide_ex)),
    name_2 = as.vector(t(wide_ex[, c("Column7", "Column8", "Column9", "Column10")])))

I'm still of the opinion the most concise method is melt with data.table:

library(data.table)
setDT(wide_ex)
melt(wide_ex, id.vars = c("Column5"), na.rm=TRUE)[,variable := NULL][]

It will also offer considerable speed improvements over reshape2 if speed is of concern.



(Additional explanation) ... What's with the use of additional []?

A) the use of additional [] in data.table is known as chaining. It allows you to perform more operations on preceding []'s.

As you originally indicated, the output of melt produces an unwanted column (variable). variable := NULL removes it. It's essentially the same as doing the following (on your original question):

 long_ex <- melt(wide_ex, id.vars = 5, na.rm=TRUE) 
 long_ex$variable <- NULL

However the use of := does it by reference (and on the same line).

Chaining can be super useful to keep your code nice and concise. Say you wanted to order your output by the first column (as you kind of indicated in your original question). You could do this like so:

melt(wide_ex, id.vars = c("Column5"), na.rm=TRUE)[, variable := NULL][order(Column5)][]

data.table really is an amazing package (especially if you're dealing with medium to large data). If you're interested, I'd suggest reading & learning more about it: https://github.com/Rdatatable/data.table/wiki

Upvotes: 3

NColl
NColl

Reputation: 757

Similarly using melt for a tidyverse approach

library(tidyverse)
library(data.table)

df %>%
  melt(id.vars=5) %>%
  filter(complete.cases(.)) %>%
  select(c(1,3))

 Column5     value
1    Jim     Eric
2    Eric    Mary 
3    Tim     Erica
4    Matt    Mary 
5    Jim     Matt
6    Eric    Loranda
7    Tim     Beth
8    Matt    Matt
9    Jim     Heather 
10   Eric    Patrick
11   Tim     Courtney 
12   Matt    Patrick
13   Jim     Loranda 
14   Eric    John
15   Tim     Heather
16   Matt    Beth

Upvotes: 1

Related Questions