Reputation: 127
I have a dataframe like this: (data)
Country IndicatorName 1960 1965
France Air 2 3
France Elec 9 10
France Mobile 2 4
Germany Air 50 43
Germany Elec 43 23
Germany Mobile 45 66
USA Air 87 2
USA Elec 19 81
USA Mobile 1 77
I would like to have this data in order to plot it: (data_new)
Years Country Air Elec Mobile
1960 France 2 9 2
1960 Germany 50 43 45
1960 USA 87 19 1
1965 France 3 10 4
1965 Germany 43 23 66
1965 USA 2 81 77
I kind of want to transpose (data) in order to get (data_new).
How can I do that?
Upvotes: 1
Views: 114
Reputation: 887531
We can reshape to the expected format using gather/spread from tidyr:
library(dplyr)
library(tidyr)
df1 %>%
  gather(Years, Val, 3:4) %>%
  spread(IndicatorName, Val)
# Country Years Air Elec Mobile
#1 France 1960 2 9 2
#2 France 1965 3 10 4
#3 Germany 1960 50 43 45
#4 Germany 1965 43 23 66
#5 USA 1960 87 19 1
#6 USA 1965 2 81 77
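In tidyr 1.0.0 and later, gather/spread are superseded by pivot_longer/pivot_wider. A minimal sketch of the same reshape with those functions follows; the object names `long` and `data_new` are only illustrative, and the data frame is rebuilt here so the block runs on its own.

```r
library(tidyr)

# Same example data as df1 in this answer; check.names = FALSE keeps
# "1960" and "1965" as column names instead of turning them into X1960/X1965
df1 <- data.frame(
  Country = rep(c("France", "Germany", "USA"), each = 3),
  IndicatorName = rep(c("Air", "Elec", "Mobile"), times = 3),
  `1960` = c(2L, 9L, 2L, 50L, 43L, 45L, 87L, 19L, 1L),
  `1965` = c(3L, 10L, 4L, 43L, 23L, 66L, 2L, 81L, 77L),
  check.names = FALSE
)

# wide -> long: stack the two year columns into Years/Val pairs
long <- pivot_longer(df1, cols = c("1960", "1965"),
                     names_to = "Years", values_to = "Val")

# long -> wide: one column per indicator
data_new <- pivot_wider(long, names_from = IndicatorName, values_from = Val)
data_new
```

Note that `Years` comes out as a character column here, so it may need `as.integer()` before plotting on a numeric axis.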
Or use recast from reshape2. The function is a wrapper for melt/dcast, where melt does the same thing as gather from tidyr, i.e. converting 'wide' to 'long' format, and dcast converts the 'long' back to 'wide' (as spread from tidyr does). In the dcast formula, we can write out the full formula, Country + variable ~ IndicatorName, or we can use ... to specify all the remaining variables on the lhs of ~.
library(reshape2)
recast(df1, measure.var=c('1960', '1965'), ...~IndicatorName, value.var='value')
# Country variable Air Elec Mobile
#1 France 1960 2 9 2
#2 France 1965 3 10 4
#3 Germany 1960 50 43 45
#4 Germany 1965 43 23 66
#5 USA 1960 87 19 1
#6 USA 1965 2 81 77
To understand what recast does, here are the equivalent melt and dcast steps:
melt(df1, measure.vars=c('1960', '1965')) %>%
  dcast(., Country+variable~IndicatorName, value.var='value')
Also, note that we can change the variable name in the melt step:
melt(df1, measure.vars=c('1960', '1965'), variable.name='Years') %>%
  dcast(., Country+Years ~IndicatorName, value.var='value')
The %>% pipe is made available by dplyr (it originates in the magrittr package). It is not needed here; we could have done this in two separate steps.
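For completeness, the two-step version without the pipe could look roughly like this. The intermediate name `long` and the result name `data_new` are just illustrative, and the data frame is rebuilt inside the block so it runs on its own.

```r
library(reshape2)

# Same example data as df1 in this answer
df1 <- data.frame(
  Country = rep(c("France", "Germany", "USA"), each = 3),
  IndicatorName = rep(c("Air", "Elec", "Mobile"), times = 3),
  `1960` = c(2L, 9L, 2L, 50L, 43L, 45L, 87L, 19L, 1L),
  `1965` = c(3L, 10L, 4L, 43L, 23L, 66L, 2L, 81L, 77L),
  check.names = FALSE
)

# Step 1: wide -> long, naming the stacked year column 'Years'
long <- melt(df1, measure.vars = c("1960", "1965"), variable.name = "Years")

# Step 2: long -> wide, one column per indicator
data_new <- dcast(long, Country + Years ~ IndicatorName, value.var = "value")
data_new
```

Here `Years` is a factor after melt, so something like `as.integer(as.character(data_new$Years))` would be needed to treat it as a number.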
df1 <- structure(list(Country = c("France", "France", "France", "Germany",
"Germany", "Germany", "USA", "USA", "USA"), IndicatorName = c("Air",
"Elec", "Mobile", "Air", "Elec", "Mobile", "Air", "Elec", "Mobile"
), `1960` = c(2L, 9L, 2L, 50L, 43L, 45L, 87L, 19L, 1L), `1965` = c(3L,
10L, 4L, 43L, 23L, 66L, 2L, 81L, 77L)), .Names = c("Country",
"IndicatorName", "1960", "1965"), class = "data.frame",
row.names = c(NA, -9L))
Upvotes: 3