Converting a dataframe to tidy format

Question

Here I'm attempting to convert a data frame to tidy format and split the year.month column into separate year and month columns:

library(dplyr)
library(tidyr)

res <- data.frame("year.month" = c("2005M1","2005M2","2005M3","2005M4"), "national houses" = c(100,100,100,100), "dublin houses" = c(120,120,120,120))

res %>% separate(year.month , into=c("year" , "month") ,  sep=".")

returns:

  year month national.houses dublin.houses
1                        100           120
2                        100           120
3                        100           120
4                        100           120
Warning message:
Too many values at 4 locations: 1, 2, 3, 4

The year and month values are not appearing. Am I not using separate correctly?

Gabi · Accepted Answer

Just separating year from month would only get you to half-tidy: you would still have two separate columns that both count houses. One row per observation and one column per variable would require something like this:

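# Stack the two house-count columns into key/value pairs, keeping year.month
res %>% 
  tidyr::gather(key = where, 
                value = houses, 
                -year.month) %>% 
  # Drop the ".houses" suffix; the dot is escaped so it matches literally
  mutate(where = gsub(where, 
                      pattern = '\\.houses', 
                      replacement = '')) %>% 
  # Split "2005M1" on the literal "M" into year and month
  separate(year.month, 
           into = c('year', 'month'), 
           sep = 'M')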
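
As for why the original call returned empty columns: a character sep in separate() is interpreted as a regular expression, and in a regex "." matches any single character. Every character of "2005M1" is therefore treated as a delimiter, which leaves only empty pieces and triggers the "Too many values" warning. Here is a minimal sketch of that behaviour in base R. Note that strsplit() is only standing in for the splitting that separate() performs, and the "2005.1" string is a made-up value for illustration:

```r
# Base-R illustration; strsplit() stands in for the splitting done by separate()
x <- "2005M1"

# "." as a regex matches any character, so every character is a delimiter
# and only empty strings are left over
any_char <- strsplit(x, ".")[[1]]

# Splitting on the literal "M" yields the year and the month
on_m <- strsplit(x, "M")[[1]]

# Hypothetical value that really does contain a dot: escape it,
# or ask for a fixed (non-regex) match
y <- "2005.1"
escaped <- strsplit(y, "\\.")[[1]]
fixed   <- strsplit(y, ".", fixed = TRUE)[[1]]

print(any_char)
print(on_m)
print(escaped)
print(fixed)
```

Since the values in year.month contain no dot at all (the dot is only in the column name), splitting on 'M' is the right separator here.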
