Marie

Reputation: 127

How to transpose a dataframe depending on variables in R?

I have a dataframe like this: (data)

Country IndicatorName 1960 1965
France Air 2 3
France Elec 9 10 
France Mobile 2 4 
Germany Air 50 43
Germany Elec 43 23
Germany Mobile 45 66
USA Air 87 2
USA Elec 19 81
USA Mobile 1 77

I would like to have this data in order to plot it: (data_new)

Years Country Air Elec Mobile
1960 France 2 9 2
1960 Germany 50 43 45
1960 USA 87 19 1
1965 France 3 10 4
1965 Germany 43 23 66
1965 USA 2 81 77

I kind of want to transpose (data) in order to get (data_new).

How can I do that?

Upvotes: 1

Views: 114

Answers (1)

akrun

Reputation: 887531

We can reshape to the expected format using gather/spread from tidyr:

library(dplyr)
library(tidyr)
df1 %>% 
      gather(Years, Val, 3:4) %>% 
      spread(IndicatorName, Val)
#   Country Years Air Elec Mobile
#1  France  1960   2    9      2
#2  France  1965   3   10      4
#3 Germany  1960  50   43     45
#4 Germany  1965  43   23     66
#5     USA  1960  87   19      1
#6     USA  1965   2   81     77
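In tidyr 1.0.0 and later, gather/spread are superseded by pivot_longer/pivot_wider. The following is a minimal sketch of the same reshape with those functions, assuming a recent tidyr is installed. The object names long and data_new are only illustrative, and df1 is rebuilt here so the block runs on its own.

```r
library(tidyr)

# Same values as df1 in this answer; check.names = FALSE keeps the
# "1960"/"1965" column names from being rewritten to X1960/X1965
df1 <- data.frame(
  Country = rep(c("France", "Germany", "USA"), each = 3),
  IndicatorName = rep(c("Air", "Elec", "Mobile"), times = 3),
  `1960` = c(2L, 9L, 2L, 50L, 43L, 45L, 87L, 19L, 1L),
  `1965` = c(3L, 10L, 4L, 43L, 23L, 66L, 2L, 81L, 77L),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Wide -> long: stack the two year columns into Years/Val
long <- pivot_longer(df1, cols = c(`1960`, `1965`),
                     names_to = "Years", values_to = "Val")

# Long -> wide: one column per indicator
data_new <- pivot_wider(long, names_from = IndicatorName, values_from = Val)
data_new
```

Years comes back as a character column here; as.integer(data_new$Years) converts it if a numeric axis is wanted for plotting.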

Or use recast from reshape2. The function is a wrapper for melt/dcast: melt does the same thing as gather from tidyr, i.e. it converts 'wide' to 'long' format, and dcast converts the 'long' format back to 'wide' (like spread from tidyr). In the dcast formula, we can either write the full formula, Country + variable ~ IndicatorName, or use ... to stand for all the remaining variables on the lhs of ~.

 library(reshape2)
 recast(df1, measure.var=c('1960', '1965'), ...~IndicatorName, value.var='value')
# Country variable Air Elec Mobile
#1  France     1960   2    9      2
#2  France     1965   3   10      4
#3 Germany     1960  50   43     45
#4 Germany     1965  43   23     66
#5     USA     1960  87   19      1
#6     USA     1965   2   81     77

To understand this better, here are the equivalent melt and dcast calls written out:

 melt(df1, measure.vars=c('1960', '1965')) %>% 
            dcast(., Country+variable~IndicatorName, value.var='value')

Also, note that we can change the variable name in the melt step:

 melt(df1, measure.vars=c('1960', '1965'), variable.name='Years') %>%
           dcast(., Country+Years ~IndicatorName, value.var='value')

The %>% pipe comes from magrittr and is re-exported by dplyr, which is loaded above. It is not needed here; we could have done this in two steps.
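For reference, the two-step version without the pipe could look like the sketch below. It assumes only reshape2 is installed. The intermediate name long and the result name data_new are illustrative, and df1 is rebuilt so the block runs on its own.

```r
library(reshape2)

# Same values as df1 in this answer; check.names = FALSE keeps the
# "1960"/"1965" column names as they are
df1 <- data.frame(
  Country = rep(c("France", "Germany", "USA"), each = 3),
  IndicatorName = rep(c("Air", "Elec", "Mobile"), times = 3),
  `1960` = c(2L, 9L, 2L, 50L, 43L, 45L, 87L, 19L, 1L),
  `1965` = c(3L, 10L, 4L, 43L, 23L, 66L, 2L, 81L, 77L),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Step 1: wide -> long; the year columns become a 'Years' column
# and their cells go into 'value'
long <- melt(df1, measure.vars = c("1960", "1965"), variable.name = "Years")

# Step 2: long -> wide; one column per IndicatorName
data_new <- dcast(long, Country + Years ~ IndicatorName, value.var = "value")
data_new
```

Keeping the intermediate long object around can be handy on its own, since long format is often what plotting functions expect.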

data

 df1 <- structure(list(Country = c("France", "France", "France", "Germany", 
"Germany", "Germany", "USA", "USA", "USA"), IndicatorName = c("Air", 
"Elec", "Mobile", "Air", "Elec", "Mobile", "Air", "Elec", "Mobile"
), `1960` = c(2L, 9L, 2L, 50L, 43L, 45L, 87L, 19L, 1L), `1965` = c(3L, 
10L, 4L, 43L, 23L, 66L, 2L, 81L, 77L)), .Names = c("Country", 
"IndicatorName", "1960", "1965"), class = "data.frame",
row.names = c(NA, -9L))

Upvotes: 3
