Jaime Vergara

Reputation: 1

Check columns in a data frame against a vector of names; if one is missing, create a column with NULL values

I want to check my data frame "df" against the vector "true_names". If "df" doesn't contain one of the "true_names", the code should mutate the data frame to add a column with that name.

library(dplyr)

true_names <- c("comuna" , "n_ciclo", "id_2", "dat_mtt", "dat_ci", 
                "name_ci", "ci_o_cr", "oneway" , "phanto", "len_v", "t_v_ci",
                "ancho_c", "pista_c", "t_v_cr", "ci_ca", "ci_vd" , "ci_plat", "ci_band",
                "ci_par" ,"tipci" , "mater", "ancho_v" , "ancho_s" , "t_s_vd" , "t_s_ca",
                "color_p",  "linea_p", "senaliz",  "pintado",  "semaf", "cartel", 
                "proye","op_ci","op_sup","op_s_ca","op_s_vd","op_dist","op_cr","geometry"
)


actual_names <- sf::st_read("/home/df")

for (x in 1:length(true_names) ) {
  
  if (true_names[x] %in% colnames(actual_names)) {
  }
  else {
     actual_names <- mutate(actual_names, true_names[x] = NULL ) 
  }
}

The code has a problem with the mutate function; it throws the following error:

Error: unexpected '=' in:
"  else {
     actual_names <- mutate(actual_names,true_names[x] ="

Syntax is correct. What is it then?

I tried replacing the "=" symbol with "<-".

Upvotes: 0

Views: 28

Answers (1)

juanbarq

Reputation: 399

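To accomplish that, you just need an if condition, and the new columns have to take the value NA (assigning NULL deletes a column rather than creating one). The error itself occurs because the name on the left of "=" inside mutate() must be a plain name, not an expression such as true_names[x]. Here is a reproducible example; you can replace actual_names with your data frame: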

library(dplyr)

true_names <- c("comuna" , "n_ciclo", "id_2", "dat_mtt", "dat_ci", 
                "name_ci", "ci_o_cr", "oneway" , "phanto", "len_v", "t_v_ci",
                "ancho_c", "pista_c", "t_v_cr", "ci_ca", "ci_vd" , "ci_plat", "ci_band",
                "ci_par" ,"tipci" , "mater", "ancho_v" , "ancho_s" , "t_s_vd" , "t_s_ca",
                "color_p",  "linea_p", "senaliz",  "pintado",  "semaf", "cartel", 
                "proye","op_ci","op_sup","op_s_ca","op_s_vd","op_dist","op_cr","geometry"
)


actual_names <- tibble(comuna= c(1,2,3))

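for (x in seq_along(true_names)) {
  # Add the column only when it is not already present
  if (!true_names[x] %in% colnames(actual_names)) {
    # NA creates the column; NULL would remove it instead
    actual_names[[true_names[x]]] <- NA
  }
}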
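
As a possible alternative to the loop, the missing names can be computed in one step with setdiff() and added together. This is only a sketch in base R: it assumes a plain data.frame rather than the sf object from the question, and the shortened true_names vector and the missing_names variable are made up for illustration.

```r
# Hypothetical, shortened stand-ins for the objects in the question
true_names <- c("comuna", "n_ciclo", "id_2", "oneway")
actual_names <- data.frame(comuna = c(1, 2, 3))

# Names that are expected but not present, in the order of true_names
missing_names <- setdiff(true_names, colnames(actual_names))

# Add every missing column in a single assignment, filled with NA
actual_names[missing_names] <- NA

print(colnames(actual_names))
```

The existing "comuna" column is left untouched, and the new columns are appended after it in the order they appear in true_names.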

Upvotes: 0
