Reputation: 555
I have a dataframe that looks like this
I want the dataframe to be converted to long format like this
This is the code I use
long_ex <- melt(wide_ex, id.vars = 5, na.rm=TRUE) #wide_ex is wide format dataframe
However, my end result looks like this
Is there a way to use melt without extracting the column names? Will gladly accept alternative functions if melt is not best suited for this job
Edit: Data output from dput
structure(list(ï..Column1 = c(NA, NA, NA, NA), Column2 = c(NA,
NA, NA, NA), Column3 = c(NA, NA, NA, NA), Column4 = c(NA, NA,
NA, NA), Column5 = structure(c(2L, 1L, 4L, 3L), .Label = c("Eric ",
"Jim", "Matt", "Tim"), class = "factor"), Column6 = c(NA, NA,
NA, NA), Column7 = structure(c(1L, 3L, 2L, 3L), .Label = c("Eric",
"Erica", "Mary "), class = "factor"), Column8 = structure(c(3L,
2L, 1L, 3L), .Label = c("Beth", "Loranda", "Matt"), class = "factor"),
Column9 = structure(c(2L, 3L, 1L, 3L), .Label = c("Courtney ",
"Heather ", "Patrick"), class = "factor"), Column10 = structure(4:1, .Label = c("Beth",
"Heather", "John", "Loranda "), class = "factor"), Column11 = c(NA,
NA, NA, NA), Column12 = c(NA, NA, NA, NA), Column13 = c(NA,
NA, NA, NA), Column14 = c(NA, NA, NA, NA), Column15 = c(NA,
NA, NA, NA)), class = "data.frame", row.names = c(NA, -4L
))
Upvotes: 1
Views: 579
Reputation: 1986
If you want a base R solution:
data.frame(name_1 = rep(as.character(wide_ex$Column5), each=nrow(wide_ex)),
name_2 = as.vector(t(wide_ex[, c("Column7", "Column8", "Column9", "Column10")])))
I'm still of the opinion the most concise method is melt with data.table:
library(data.table)
setDT(wide_ex)
melt(wide_ex, id.vars = c("Column5"), na.rm=TRUE)[,variable := NULL][]
It will also offer considerable speed improvements over reshape2 if speed is of concern.
(Additional explanation) ... What's with the use of additional []
?
A) the use of additional [] in data.table
is known as chaining. It allows you to perform more operations on preceding []'s.
As you originally indicated, the output of melt produces an unwanted column (variable
). variable := NULL
removes it. It's essentially the same as doing the following (on your original question):
long_ex <- melt(wide_ex, id.vars = 5, na.rm=TRUE)
long_ex$variable <- NULL
However the use of :=
does it by reference (and on the same line).
Chaining can be super useful to keep your code nice and concise. Say you wanted to order your output by the first column (as you kind of indicated in your original question). You could do this like so:
melt(wide_ex, id.vars = c("Column5"), na.rm=TRUE)[, variable := NULL][order(Column5)][]
data.table really is an amazing package (especially if you're dealing with medium to large data). If you're interested, I'd suggest reading & learning more about it: https://github.com/Rdatatable/data.table/wiki
Upvotes: 3
Reputation: 757
Similarly using melt
for a tidyverse
approach
library(tidyverse)
library(data.table)
df %>%
melt(id.vars=5) %>%
filter(complete.cases(.)) %>%
select(c(1,3))
Column5 value
1 Jim Eric
2 Eric Mary
3 Tim Erica
4 Matt Mary
5 Jim Matt
6 Eric Loranda
7 Tim Beth
8 Matt Matt
9 Jim Heather
10 Eric Patrick
11 Tim Courtney
12 Matt Patrick
13 Jim Loranda
14 Eric John
15 Tim Heather
16 Matt Beth
Upvotes: 1