Reputation: 189
I am having trouble using the gather function in R. This is the sample data frame:
library(dplyr)
library(tidyr)
DF = data.frame(Region = c("Asia", "Asia", "Asia", "Europe", "Europe"),
`Indicator Name` = c("Population", "GDP", "GNI", "Population", "GDP"),
`2004` = c(22, 33,44,55,56),
`2005` =c(223, 44,555,66,64))
Region Indicator.Name X2004 X2005
1 Asia Population 22 223
2 Asia GDP 33 44
3 Asia GNI 44 555
4 Europe Population 55 66
5 Europe GDP 56 64
And this is the data frame that I want
DF2 = data.frame(Region = c("Asia", "Asia", "Europe", "Europe"),
Year = c("X2004", "X2005"),
population = c(22, 223, 55, 66),
GDP = c(33, 44, 56,64))
Region Year population GDP
1 Asia X2004 22 33
2 Asia X2005 223 44
3 Europe X2004 55 56
4 Europe X2005 66 64
I want to do this via the gather function in tidyr.
I am not sure how to go about this. This is what I tried -
gather(DF, key= DF$Indicator.Name, values = "values")
Upvotes: 1
Views: 119
Reputation: 24069
This is not a simple gather operation. First you need to make the data frame long, and then make it wide again with the indicator names as the new columns.
Here is a solution using the new pivot_longer and pivot_wider functions.
library(dplyr)
library(tidyr)
DF = data.frame(Region = c("Asia", "Asia", "Asia", "Europe", "Europe"),
`Indicator Name` = c("Population", "GDP", "GNI", "Population", "GDP"),
`2004` = c(22, 33,44,55,56),
`2005` =c(223, 44,555,66,64))
DF %>% pivot_longer(cols = starts_with("x")) %>%
pivot_wider(names_from = Indicator.Name, values_from = value)
# A tibble: 4 x 5
Region name Population GDP GNI
<fct> <chr> <dbl> <dbl> <dbl>
1 Asia X2004 22 33 44
2 Asia X2005 223 44 555
3 Europe X2004 55 56 NA
4 Europe X2005 66 64 NA
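To make the long-then-wide idea easier to follow, here is a minimal sketch that runs the two steps separately. It assumes the default check.names = TRUE behaviour of data.frame, which is what turns the column names into Indicator.Name, X2004 and X2005. The names_to = "Year" argument and the intermediate object names long and wide are my own choices for illustration, not part of the answer above.

```r
library(tidyr)

# Same sample data as in the question; data.frame() rewrites the
# backtick names to Indicator.Name, X2004 and X2005.
DF <- data.frame(
  Region = c("Asia", "Asia", "Asia", "Europe", "Europe"),
  `Indicator Name` = c("Population", "GDP", "GNI", "Population", "GDP"),
  `2004` = c(22, 33, 44, 55, 56),
  `2005` = c(223, 44, 555, 66, 64)
)

# Step 1: long format, one row per Region / indicator / year.
long <- pivot_longer(DF, cols = c(X2004, X2005),
                     names_to = "Year", values_to = "value")

# Step 2: wide format, one column per indicator.
wide <- pivot_wider(long, names_from = Indicator.Name, values_from = value)

print(long)
print(wide)
```

Naming the year column in the first step means the result already has the Year column the question asks for, instead of the default name.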
Upvotes: 4
Reputation: 5956
Using gather and spread, you have:
DF %>%
gather(-Indicator.Name, -Region, key= "Year", value = "value") %>%
spread(Indicator.Name, value)
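For completeness, a self-contained sketch of the same pipeline that also trims the result to the columns shown in the desired data frame. The final select() step and the object name result are my additions. They assume you want to drop GNI, which has no values for Europe in the sample data.

```r
library(dplyr)
library(tidyr)

# Same sample data as in the question.
DF <- data.frame(
  Region = c("Asia", "Asia", "Asia", "Europe", "Europe"),
  `Indicator Name` = c("Population", "GDP", "GNI", "Population", "GDP"),
  `2004` = c(22, 33, 44, 55, 56),
  `2005` = c(223, 44, 555, 66, 64)
)

result <- DF %>%
  # Stack the two year columns into Year / value pairs.
  gather(key = "Year", value = "value", X2004, X2005) %>%
  # Spread the indicators back out into one column each.
  spread(Indicator.Name, value) %>%
  # Keep only the columns from the desired output, in that order.
  select(Region, Year, Population, GDP)

print(result)
```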
Upvotes: 4