Tom

Reputation: 2341

Adding a column to the data that looks for a list of words and adds them if found

I have data that has a column with variable names and a column with the descriptions of variables:

library(data.table)
example_dat <- fread("var_nam description
      some_var this_is_som_var_kg
      other_var this_is_meters_for_another_var")
example_dat$description  <- gsub("_", " ", example_dat$description)

example_dat
     var_nam                    description
1:  some_var             this is som var kg
2: other_var this is meters for another var

I would like to add a column to this data that holds whichever unit from a given vector appears in the description. I started out as follows:

vector_of_units <- c("kg", "meters")
example_dat <- setDT(example_dat)[, unit := ifelse(vector_of_units %in% description, vector_of_units, NA)]

But this gives

     var_nam                    description unit
1:  some_var             this is som var kg   NA
2: other_var this is meters for another var   NA

How should I write this syntax so that it gives the following output?

     var_nam                    description   unit
1:  some_var             this is som var kg     kg
2: other_var this is meters for another var meters

Upvotes: 0

Views: 55

Answers (1)

maydin

Reputation: 3755

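Replace %in% with str_detect and collapse the matches with paste0. %in% only tests whether a unit is equal to an entire description, so it never finds a unit inside a longer string: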

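library(data.table)
library(stringr)  # provides str_detect(); library(tidyverse) loads it as well
# For each description, keep the units it contains and join them with ","
# (a description containing no unit yields "" rather than NA)
setDT(example_dat)[, unit := unlist(lapply(description, function(x)
  paste0(vector_of_units[str_detect(x, vector_of_units)], collapse = ",")))]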

This gives:

#         var_nam                    description   unit
#    1:  some_var             this is som var kg     kg
#    2: other_var this is meters for another var meters
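
If each description holds at most one unit and NA is wanted when none is found, a vectorized variant may be simpler. The following is only a sketch under those assumptions: it rebuilds the example data so it runs on its own, the name unit_pattern is made up for illustration, and it assumes the units contain no regex metacharacters.

```r
library(data.table)
library(stringr)

# Rebuild the example data from the question so the snippet is self-contained
example_dat <- data.table(
  var_nam     = c("some_var", "other_var"),
  description = c("this is som var kg", "this is meters for another var")
)
vector_of_units <- c("kg", "meters")

# %in% compares whole elements, so no unit equals a full description
vector_of_units %in% example_dat$description

# One pattern for all units: \b(kg|meters)\b matches a unit only as a whole word
# (assumes the units contain no regex metacharacters)
unit_pattern <- paste0("\\b(", paste(vector_of_units, collapse = "|"), ")\\b")

# str_extract() returns the first match in each description, or NA if none
example_dat[, unit := str_extract(description, unit_pattern)]
example_dat
```

The word boundaries keep a short unit such as kg from matching inside a longer word. If a description can contain several units, str_extract_all returns every match, which can then be collapsed with paste as in the code above.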

Upvotes: 1
