Reputation: 21
I am working with PUMS data and I am using tidycensus to translate the values of certain columns. However, this adds a new column with structure: columnname_label. I want to use mutate to replace the original column with the new translated values in columnname_label. Here is example of my data frame:
df <-data.frame(Region= c(1,2,1,4,3,1),
Region_label = c("North", "South", "North", "West", "East", "North"),
Broadband = c(0,1,0,0,0,1)
Broadband_label = c("No","Yes","No","No","No","Yes")
Hispeed = c(1,1,0,0,1,0)
Hispeed_label = c("Yes", "Yes","No","No","Yes","No")
I know I can write out code like this using tidycensus:
library(tidyverse)
recode <- df %>% mutate(Region = Region_label) %>% mutate(Broadband = Broadband_label) %>%
mutate(Hispeed = Hispeed_label)
However, I have 66 columns that need to be matched to the "_label" columns. Is there a more elegant way to perform this instead of writing 66 mutate statements?
I have tried writing a loop using mutate_at, but it does not work.
subset1 <- grep('*label*',names(df),value = TRUE)
name <- names(df)
for (i in subset1) {
final <- final_house %>% mutate_at(vars(matches(trimws(name,"right","\\_label"))),i)
}
This does not work.
Thank you!
Upvotes: 2
Views: 110
Reputation: 269481
Using across
this assigns to each column whose name is X the column whose name is X_label. If there can be columns that do not have a corresponding _label column then replace the first argument of across
with sub("_label$", "", subset1)
where subset1
is defined in the question.
df %>%
mutate(across(!ends_with("_label"),
~ cur_data()[[paste0(cur_column(), "_label")]]))
giving:
Region Region_label Broadband Broadband_label Hispeed Hispeed_label
1 North North No No Yes Yes
2 South South Yes Yes Yes Yes
3 North North No No No No
4 West West No No No No
5 East East No No Yes Yes
6 North North Yes Yes No No
Note that the above is easy to do with just base R:
replace(df, sub("_label$", "", subset1), df[subset1])
or adding a pipeline:
df %>% replace(sub("_label$", "", subset1), .[subset1])
Upvotes: 1
Reputation: 18632
library(dplyr)
df %>%
select(ends_with("label")) %>%
rename_with(~ gsub("_label","", .))
Output
Region Broadband Hispeed
1 North No Yes
2 South Yes Yes
3 North No No
4 West No No
5 East No Yes
6 North Yes No
Upvotes: 1
Reputation: 1557
First, select all columns, which end with "label"
. Then you can extract the string until the first underscore (_
) using a reprex.
df %>%
select(ends_with("label")) %>%
rename_with(~ str_extract(.x, "^[^_]+(?=_)"))
Note that you need dplyr
version 1.0.0
or newer for that.
Upvotes: 1