Amir

Reputation: 45

Transposing a data frame by groups in R with missing values

I have a data frame that looks like this:


Country    Variable      2012     2013    2014
Germany    Medical       11       2       4
Germany    Transport     12       6       8
France     Medical       15       10      12
France     Transport     17       13      14  
France     Food          24       14      15

I would like to transpose the data frame in such a way that the final data frame takes the form of the following:

Country     year    Medical    Transport     Food 
Germany     2012    11         12            NA
Germany     2013    2          6             NA
Germany     2014    4          8             NA
France      2012    15         17            24
France      2013    10         13            14  
France      2014    12         14            15

I've tried several functions including melt, reshape, and spread but they didn't work. Does anybody have any ideas?

Upvotes: 1

Views: 238

Answers (2)

akrun

Reputation: 887691

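We can also use transpose from data.table: split the data by Country, transpose each piece, and bind the results with rbindlist. Setting fill = TRUE supplies NA for any Variable a country lacks (here, Food for Germany).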

library(data.table) # v >= 1.12.4 
rbindlist(
  lapply(split(df1[-1], df1$Country), function(x)
    data.table::transpose(x, keep.names = "year", make.names = "Variable")),
  idcol = "Country", fill = TRUE)
#   Country year Medical Transport Food
#1:  France 2012      15        17   24
#2:  France 2013      10        13   14
#3:  France 2014      12        14   15
#4: Germany 2012      11        12   NA
#5: Germany 2013       2         6   NA
#6: Germany 2014       4         8   NA

data

df1 <- structure(list(Country = c("Germany", "Germany", "France", "France", 
"France"), Variable = c("Medical", "Transport", "Medical", "Transport", 
"Food"), `2012` = c(11L, 12L, 15L, 17L, 24L), `2013` = c(2L, 
6L, 10L, 13L, 14L), `2014` = c(4L, 8L, 12L, 14L, 15L)), 
 class = "data.frame", row.names = c(NA, 
-5L))
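
An alternative within data.table is to melt to long format and then dcast back to wide, which avoids the explicit split/lapply step. The following is only a sketch under the assumption that the data match df1 above; the intermediate names dt, long, and res are illustrative, not part of the original answer. Note that dcast sorts the result by Country and year and orders the new columns alphabetically, so the column order may differ from the output shown above.

```r
library(data.table)

# Same data as df1 above, rebuilt here so the sketch is self-contained
df1 <- data.frame(
  Country  = c("Germany", "Germany", "France", "France", "France"),
  Variable = c("Medical", "Transport", "Medical", "Transport", "Food"),
  `2012`   = c(11L, 12L, 15L, 17L, 24L),
  `2013`   = c(2L, 6L, 10L, 13L, 14L),
  `2014`   = c(4L, 8L, 12L, 14L, 15L),
  check.names = FALSE  # keep "2012" etc. as column names
)

dt <- as.data.table(df1)

# Long format: one row per Country / Variable / year
long <- melt(dt, id.vars = c("Country", "Variable"),
             variable.name = "year", value.name = "value")

# Wide again: one column per Variable; missing combinations become NA
res <- dcast(long, Country + year ~ Variable, value.var = "value")
res
```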

Upvotes: 2

Ronak Shah

Reputation: 389175

You can first convert the data to long format and then back to wide, with one column per Variable:

library(tidyr)

df %>%
  pivot_longer(cols = -c(Country, Variable), names_to = "year") %>%
  pivot_wider(names_from = Variable, values_from = value)

# A tibble: 6 x 5
#  Country year  Medical Transport  Food
#  <fct>   <chr>   <int>     <int> <int>
#1 Germany 2012       11        12    NA
#2 Germany 2013        2         6    NA
#3 Germany 2014        4         8    NA
#4 France  2012       15        17    24
#5 France  2013       10        13    14
#6 France  2014       12        14    15
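
In the output above, year comes back as a character column because it is built from column names. If a numeric year is preferred, one option is the names_transform argument of pivot_longer; this is a sketch that assumes tidyr 1.1.0 or later and rebuilds the question's data as df (with Country as plain character rather than factor) so that it runs on its own.

```r
library(tidyr)

# The question's data, rebuilt here so the sketch is self-contained
df <- data.frame(
  Country  = c("Germany", "Germany", "France", "France", "France"),
  Variable = c("Medical", "Transport", "Medical", "Transport", "Food"),
  `2012`   = c(11L, 12L, 15L, 17L, 24L),
  `2013`   = c(2L, 6L, 10L, 13L, 14L),
  `2014`   = c(4L, 8L, 12L, 14L, 15L),
  check.names = FALSE  # keep "2012" etc. as column names
)

res <- df %>%
  # names_transform converts the former column names to integers
  pivot_longer(cols = -c(Country, Variable), names_to = "year",
               names_transform = list(year = as.integer)) %>%
  pivot_wider(names_from = Variable, values_from = value)

res
```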

For older versions of tidyr, the equivalent uses gather and spread:

df %>%
  gather(year, value, -c(Country, Variable)) %>%
  spread(Variable, value)

Upvotes: 2
