Reputation: 69
I am trying to find a convenient way to rename multiple columns using the tidyverse. Say I have a tibble
df <- tibble(a = 1, b = 2, tmp_2000 = 23, tmp_2001 = 22.1, tmp_2002 = 25, pre_2000, pre_2001, pre_2002)
# A tibble: 1 x 8
a b tmp_2000 tmp_2001 tmp_2002 pre_2000 pre_2001 pre_2002
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 2 23 22.1 25 100 103 189
temp
and pre
stand for temperature and precipitations. I want to reorganize this table in a tidy form, i.e. with one column for temperature
, one for precipitations
, and each row is the corresponding value for the year.
Right now the only option I found was to do something like this
df <- df %>%
select(-starts_with("pre"))
names(df)[3:5] <- substr(names(df)[3:5],5,8)
df<-df %>%
gather(`2000`:`2002`,key = "year",value="temp") %>%
mutate("year" = as.integer(year))
# A tibble: 3 x 4
a b year temp
<dbl> <dbl> <int> <dbl>
1 2 2000 23
1 2 2001 22.1
1 2 2002 25
This is not fantastic as I need to do the same thing with precipitations and then joins the two tables. In the future I will be getting more weather variables and this process will quickly become a pain.
Does anyone have an idea on how to do this more efficiently using the tidyverse?
Thanks,
Jo
PS: the only similar posts I saw referred to recoding the variables (using mutate_at), or renaming columns using the names
that I showed above.
Upvotes: 3
Views: 868
Reputation: 3321
df <- tibble(
a = 1,
b = 2,
tmp_2000 = 23,
tmp_2001 = 22.1,
tmp_2002 = 25,
pre_2000=100,
pre_2001=103,
pre_2002=189
)
df %>%
gather(key, value, -a:-b) %>%
separate(key, c("type", "year")) %>%
spread(type, value= value )
#> # A tibble: 3 x 5
#> a b year pre tmp
#> <dbl> <dbl> <chr> <dbl> <dbl>
#> 1 1 2 2000 100 23
#> 2 1 2 2001 103 22.1
#> 3 1 2 2002 189 25
```
Upvotes: 0
Reputation: 79188
data.frame(df)%>%
reshape(3:ncol(df),sep="_",dir="long")%>%
`rownames<-`(NULL)
a b time tmp pre id
1 1 2 2000 23.0 100 1
2 1 2 2001 22.1 103 1
3 1 2 2002 25.0 189 1
Upvotes: 2
Reputation: 8110
You could do it like this:
library(tidyverse)
df %>%
gather(measure, value, -a, -b) %>%
separate(measure, into = c("type", "year"), sep = "_") %>%
mutate(type = case_when(type == "tmp" ~ "temp", type == "pre" ~ "precip")) %>%
spread(type, value)
# a b year precip temp
# 1 1 2 2000 100 23
# 2 1 2 2001 103 22.1
# 3 1 2 2002 189 25
We first gather all of the data in long format, then we separate the year from the measurement, then we change the names of the measurements, and lastly we spread the data back to wide format.
Upvotes: 2