Reputation: 23
I have data of this wide format that I want to convert to long format
Cond Construct Line Plant Tube_shoot weight_shoot Tube_root weight_root
1 Standard NA NA 2 199.95 - -
2 Cd0 IIF 43.1 1 3 51.87 4 10.39
3 Cd0 IIF 43.1 2 5 81.80 6 15.05
4 Cd0 IIF 43.1 3 7 101.56 8 16.70
What I basically want is to keep Tube_shoot and weight_shoot together, i.e. treat these two columns as one for melt. But because I can only use
id.vars=c("Cond","Construct","Line","Plant")
the result is not quite what I want.
I have two (ugly) solutions so far:
I melt twice, first by measure.vars=c("Tube_shoot", "Tube_root" ), then by weights, and later remove half of the rows where the result is simply wrong. This is not feasible for me because I have different lengths of data, and I'd always have to check if I'm taking the right rows out.
I paste "tube" with "weight" into a new column, take out the others, melt them, and later take them apart again.
Copy them one by one in excel. But with hundreds of lines I'd rather learn how to do it in R.
I'm sure that there is a better way.
What I want in the end:
Cond Construct Line Plant Tube weight
1 Standard NA NA 2 199.95
2 Cd0 IIF 43.1 1 3 51.87
3 Cd0 IIF 43.1 2 5 81.80
4 Cd0 IIF 43.1 3 7 101.56
2 Cd0 IIF 43.1 1 4 10.39
3 Cd0 IIF 43.1 2 6 15.05
4 Cd0 IIF 43.1 3 8 16.70
Upvotes: 2
Views: 3120
Reputation: 193687
You may want to consider merged.stack
from my "splitstackshape" package, with which you can do something like:
library(splitstackshape)
merged.stack(as.data.table(df1, keep.rownames = TRUE),
var.stubs = c("Tube", "weight"), sep = "_")
# rn Cond Construct Line Plant .time_1 Tube weight
# 1: 1 Standard NA NA root - -
# 2: 1 Standard NA NA shoot 2 199.95
# 3: 2 Cd0 IIF 43.1 1 root 4 10.39
# 4: 2 Cd0 IIF 43.1 1 shoot 3 51.87
# 5: 3 Cd0 IIF 43.1 2 root 6 15.05
# 6: 3 Cd0 IIF 43.1 2 shoot 5 81.8
# 7: 4 Cd0 IIF 43.1 3 root 8 16.70
# 8: 4 Cd0 IIF 43.1 3 shoot 7 101.56
Of course, you can also add a [Tube != "-" | weight != "-"]
to the end to remove the rows where "Tube" or "weight" have "-"... but note that doing so doesn't magically convert those columns to numeric :-)
Upvotes: 1
Reputation: 70336
Another option, using dplyr and tidyr:
library(dplyr)
libarary(tidyr)
gather(df1, x, Tube, c(Tube_shoot, Tube_root)) %>%
mutate(weight = ifelse(grepl("*root$", x), weight_root, weight_shoot)) %>%
select(-c(weight_shoot, weight_root, x))
# Cond Construct Line Plant Tube weight
#1 Standard NA NA 2 199.95
#2 Cd0 IIF 43.1 1 3 51.87
#3 Cd0 IIF 43.1 2 5 81.8
#4 Cd0 IIF 43.1 3 7 101.56
#5 Standard NA NA - -
#6 Cd0 IIF 43.1 1 4 10.39
#7 Cd0 IIF 43.1 2 6 15.05
#8 Cd0 IIF 43.1 3 8 16.70
Upvotes: 1
Reputation: 887981
You could try
res <- reshape(df1, idvar=c('Cond', 'Construct', 'Line', 'Plant'),
varying=5:8, direction='long', sep="_")
res1 <- res[res$weight!='-', -5]
row.names(res1) <- NULL
res1
# Cond Construct Line Plant Tube weight_shoot
#1 Standard NA NA 2 199.95
#2 Cd0 IIF 43.1 1 3 51.87
#3 Cd0 IIF 43.1 2 5 81.8
#4 Cd0 IIF 43.1 3 7 101.56
#5 Cd0 IIF 43.1 1 4 10.39
#6 Cd0 IIF 43.1 2 6 15.05
#7 Cd0 IIF 43.1 3 8 16.70
df1 <- structure(list(Cond = c("Standard", "Cd0", "Cd0", "Cd0"),
Construct = c("", "IIF", "IIF", "IIF"), Line = c(NA, 43.1, 43.1, 43.1),
Plant = c(NA, 1L, 2L, 3L), Tube_shoot = c(2L, 3L, 5L, 7L), weight_shoot =
c(199.95,51.87, 81.8, 101.56), Tube_root = c("-", "4", "6", "8"),
weight_root = c("-", "10.39", "15.05", "16.70")), .Names = c("Cond",
"Construct", "Line", "Plant", "Tube_shoot", "weight_shoot", "Tube_root",
"weight_root"), class = "data.frame", row.names = c("1", "2", "3", "4"))
Upvotes: 2