Reputation: 23
I am looking for reshaping:
ID p2012 p2010 p2008 p2006 c2012 c2010 c2008 c2006
1 1 160 162 163 165 37.3 37.3 37.1 37.1
2 2 163 164 164 163 2.6 2.6 2.6 2.6
into:
ID year p c
1 1 2006 165 37.1
2 1 2008 164 37.1
3 1 2010 162 37.3
4 1 2012 160 37.3
5 2 2006 163 2.6
6 2 2008 163 2.6
7 2 2010 164 2.6
8 2 2012 163 2.6
I am new to R, have been trying around with melt
and dcast
functions, but there are just to many twists for me at this stage. Help would be much appreciated!
A dput
of my df:
structure(list(ID = 1:2, p2012 = c(160L, 163L), p2010 = c(162L, 164L), p2008 = 163:164, p2006 = c(165L, 163L), c2012 = c(37.3, 2.6), c2010 = c(37.3, 2.6), c2008 = c(37.1, 2.6), c2006 = c(37.1, 2.6)), .Names = c("ID", "p2012", "p2010", "p2008", "p2006", "c2012", "c2010", "c2008", "c2006"), class = "data.frame", row.names = c(NA, -2L))
Upvotes: 1
Views: 111
Reputation: 2950
You can also use the following solution from tidyr
. You don't actually need to use regular expressions, if "p" or "c" is always the first letter of the column names:
library(tidyr)
library(dplyr) # only loaded for the %>% operator
dat %>%
gather(key,value,p2012:c2006) %>%
separate(key,c("category","year"),1) %>%
spread(category,value)
ID year c p
1 1 2006 37.1 165
2 1 2008 37.1 163
3 1 2010 37.3 162
4 1 2012 37.3 160
5 2 2006 2.6 163
6 2 2008 2.6 164
7 2 2010 2.6 164
8 2 2012 2.6 163
Upvotes: 2
Reputation: 9157
An alternative to shadow's answer is to use the reshape
function:
reshape(d, direction='long', varying=list(2:5, 6:9), v.names=c("p", "c"), idvar="ID", times=c(2012, 2010, 2008, 2006))
This assumes that you know the column indices of the p
and c
beforehand (or add additional code to figure them out). Furthermore, the times vector above could be found by using something similar to the gsub
function of shadow.
Which way to use probably is a matter of taste.
Upvotes: 4
Reputation: 22293
You probably have to melt
the data first, then split the variable and the year and then dcast
to your final data.frame
.
require(reshape2)
# melt data.frame
dfmelt <- melt(df, id.vars="ID", variable.name="var.year")
# split "var.year" into new variables "var" and "year"
dfmelt[, "var"] <- gsub("[0-9]", "", as.character(dfmelt[, "var.year"]))
dfmelt[, "year"] <- as.numeric(gsub("[a-z, A-Z]", "", as.character(dfmelt[, "var.year"])))
# cast to data with column for each var-name
dcast(dfmelt, ID+year~var, value.var="value")
Upvotes: 2