Reputation: 2851
Suppose that we have a data frame that looks like
set.seed(7302012)
county <- rep(letters[1:4], each=2)
state <- rep(LETTERS[1], times=8)
industry <- rep(c("construction", "manufacturing"), 4)
employment <- round(rnorm(8, 100, 50), 0)
establishments <- round(rnorm(8, 20, 5), 0)
data <- data.frame(state, county, industry, employment, establishments)
state county industry employment establishments
1 A a construction 146 19
2 A a manufacturing 110 20
3 A b construction 121 10
4 A b manufacturing 90 27
5 A c construction 197 18
6 A c manufacturing 73 29
7 A d construction 98 30
8 A d manufacturing 102 19
We'd like to reshape this so that each row represents a (state and) county, rather than a county-industry, with columns construction.employment
, construction.establishments
, and analogous versions for manufacturing. What is an efficient way to do this?
One way is to subset
construction <- data[data$industry == "construction", ]
names(construction)[4:5] <- c("construction.employment", "construction.establishments")
And similarly for manufacturing, then do a merge. This isn't so bad if there are only two industries, but imagine that there are 14; this process would become tedious (though made less so by using a for
loop over the levels of industry
).
Any other ideas?
Upvotes: 6
Views: 11649
Reputation: 109844
Also using the reshape package:
library(reshape)
m <- reshape::melt(data)
cast(m, state + county~...)
Yielding:
> cast(m, state + county~...)
state county construction_employment construction_establishments manufacturing_employment manufacturing_establishments
1 A a 146 19 110 20
2 A b 121 10 90 27
3 A c 197 18 73 29
4 A d 98 30 102 19
I personally use the base reshape so I probably should have shown this using reshape2 (Wickham) but forgot there was a reshape2 package. Slightly different:
library(reshape2)
m <- reshape2::melt(data)
dcast(m, state + county~...)
Upvotes: 4
Reputation: 193497
This can be done in base R reshape, if I understand your question correctly:
reshape(data, direction="wide", idvar=c("state", "county"), timevar="industry")
# state county employment.construction establishments.construction
# 1 A a 146 19
# 3 A b 121 10
# 5 A c 197 18
# 7 A d 98 30
# employment.manufacturing establishments.manufacturing
# 1 110 20
# 3 90 27
# 5 73 29
# 7 102 19
Upvotes: 9