Auriane Serval

Reputation: 11

Rename variable names with tidyverse

I've been struggling with a problem for a while so I was hoping to find some help here ;)

In R, I want to build a two-column table from a data set, recode the values of one of the two columns, and count the number of observations in each resulting group.

I want to extract these two columns: REF_YEAR and NAVLC9_COD.

I want to recode NAVLC9_COD by collapsing all the size classes from 0 to 12 meters into a single value, and all the size classes over 12 meters into another single value.

STEP 1. To keep my script reproducible, I defined these vectors:

key_segment_size <- c("[0-6[ m","[6-10[ m","[10-12[ m")
other_size_segment <- c("[12-15[ m", "[15-18[ m", "[18-24[ m", "[24-40[ m", "[40-80[ m",">= 80 m")

STEP 2. Then I create my data table:

data_all_size <- data %>%
    dplyr::select(REF_YEAR,NAVLC9_COD) %>%
    str_replace(key_segment_size, "Inf. 12 m") %>% 
    str_replace(other_segment_size, "Sup. 12 m") %>% 
    droplevels() %>%
    group_by(REF_YEAR,NAVLC9_COD) %>% 
    summarize(
      number = n() 
    )

Error in type(pattern) : object 'other_segment_size' not found

When I pass the values inline as vectors instead, I get:

data_all_size <- data %>%
    dplyr::select(REF_YEAR,NAVLC9_COD) %>%
    str_replace(c("[0-6[ m","[6-10[ m","[10-12[ m"),"Inf. 12 m") %>% 
    str_replace(c("[12-15[ m", "[15-18[ m", "[18-24[ m", "[24-40[ m", "[40-80[ m",">= 80 m"),"Sup. 12 m") %>% 
    droplevels() %>%
    group_by(REF_YEAR,NAVLC9_COD) %>% 
    summarise(
      effectif = n() 
    )

Error in stri_replace_first_regex(string, pattern, fix_replacement(replacement), :
  Missing closing bracket on a bracket expression. (U_REGEX_MISSING_CLOSE_BRACKET, context=[0-6[ m)
In addition: Warning messages:
1: In stri_replace_first_regex(string, pattern, fix_replacement(replacement), :
  argument is not an atomic vector; coercing
2: In stri_replace_first_regex(string, pattern, fix_replacement(replacement), :
  longer object length is not a multiple of shorter object length

I also tried the mutate and recode functions:

data_all_size <- data %>%
    dplyr::select(REF_YEAR,NAVLC9_COD) %>%
    mutate(Taille = recode("Inf. 12 m" = key_segment_size)) %>%
    mutate(Taille = recode("Inf. 12 m" = other_segment_size)) %>% 
    droplevels() %>%
    group_by(REF_YEAR,NAVLC9_COD) %>% 
    summarise(
      effectif = n() 
    )

Error in mutate():
! Problem while computing Taille = recode(Inf. 12 m = key_segment_size).
Caused by error in recode.character():
! argument ".x" is missing, with no default

Can anyone tell me what I'm doing wrong/what I should do instead? I'm using tidyverse, idk if that helps at all. Thank you for any help, I'm frustrated to tears.

Upvotes: 0

Views: 1818

Answers (1)

Auriane Serval

Reputation: 11

I finally got it working thanks to @Gregor Thomas's comment, using case_when() and %in%:

data_all_size <- data_sacrois %>%
    dplyr::select(REF_YEAR, NAVLC9_COD) %>%
    # Map each size class to one of two labels; anything unmatched becomes "other"
    mutate(Taille = case_when(NAVLC9_COD %in% key_segment_size ~ "Inf. 12 m",
                              NAVLC9_COD %in% other_size_segment ~ "Sup. 12 m",
                              TRUE ~ "other")) %>%
    droplevels() %>%
    # Count rows per year and per recoded size label
    group_by(REF_YEAR, Taille) %>%
    summarise(
      effectif = n()
    )
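
In case it helps someone else, here is a minimal self-contained sketch of the same idea. The `toy` data frame is made up for illustration (it is not my real data), and I use `count()` as a shorthand for `group_by()` + `summarise(n())`. As far as I understand it, my earlier attempts failed because `str_replace()` was being piped the whole data frame rather than a column, and because it treats `[` as the start of a regex character class; matching with `%in%` compares whole strings, so no escaping is needed.

```r
library(dplyr)

# Hypothetical toy data standing in for the real data set
toy <- tibble(
  REF_YEAR   = c(2019, 2019, 2019, 2020, 2020, 2020),
  NAVLC9_COD = c("[0-6[ m", "[6-10[ m", "[12-15[ m",
                 "[10-12[ m", ">= 80 m", NA)
)

key_segment_size   <- c("[0-6[ m", "[6-10[ m", "[10-12[ m")
other_size_segment <- c("[12-15[ m", "[15-18[ m", "[18-24[ m",
                        "[24-40[ m", "[40-80[ m", ">= 80 m")

toy_counts <- toy %>%
  # %in% does exact string matching, so the brackets are not parsed as regex
  mutate(Taille = case_when(
    NAVLC9_COD %in% key_segment_size   ~ "Inf. 12 m",
    NAVLC9_COD %in% other_size_segment ~ "Sup. 12 m",
    TRUE                               ~ "other"  # catches NA and unknown codes
  )) %>%
  # count() groups by the given columns and returns one row per group
  count(REF_YEAR, Taille, name = "effectif")

print(toy_counts)
```

Note that `count()` returns an ungrouped table, whereas `summarise()` after `group_by(REF_YEAR, Taille)` leaves the result grouped by REF_YEAR, which may or may not matter for later steps.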

Upvotes: 1
