Tom

Reputation: 199

Populate a variable when another variable contains a certain character string (R dataframe)

I am trying to populate one variable based on whether another variable contains a certain character string. I have tried the following code:

site <- c('5.1', 'CD 1.1', 'FD 1', 'FD 2', 'FD 3', 'FD 4',
          'FD 5', 'FD 6')
year <- c(2011, 2013, 2010, 2010, 2010, 2010, 2010, 2010)
diveLocation <- NA

df <- data.frame(site, year, diveLocation)
df$diveLocation <- as.character(df$diveLocation)
df$diveLocation <- gsub("^C\\w+", "compliance", df$site)

head(df)

Which gives:

    site  year    diveLocation
1    5.1  2011             5.1
2 CD 1.1  2013  compliance 1.1
3   FD 1  2010            FD 1
4   FD 2  2010            FD 2
5   FD 3  2010            FD 3
6   FD 4  2010            FD 4

The only positive is that "compliance" now appears in 'diveLocation'. However, I want only the word "compliance" (i.e. without the '1.1' carried over from 'site'), and I don't want the other 'site' values (e.g. 5.1) copied into 'diveLocation'; those rows should be NA instead. Any advice would be very much appreciated!

Upvotes: 0

Views: 1345

Answers (3)

akrun

Reputation: 887138

We can use grep to create a numeric index of the rows where 'site' starts with "CD", then assign 'compliance' to the corresponding elements of 'diveLocation'. All other rows keep their NA. The same approach works for the "FD" sites.

i1 <- grep("^CD", df$site)
df$diveLocation[i1] <-  'compliance'
df
#    site year   diveLocation
#1    5.1 2011           <NA>
#2 CD 1.1 2013    compliance
#3   FD 1 2010           <NA>
#4   FD 2 2010           <NA>
#5   FD 3 2010           <NA>
#6   FD 4 2010           <NA>
#7   FD 5 2010           <NA>
#8   FD 6 2010           <NA>
i2 <- grep("^FD", df$site)
df$diveLocation[i2] <- 'Farm'

Or using data.table

library(data.table)
setDT(df)[grep("^CD", site), diveLocation := 'compliance'][]
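A closely related base R sketch uses grepl, which returns a logical vector rather than row positions, so it can drive a nested ifelse directly and leave every non-matching row as NA. The data frame is rebuilt from the question; the 'Farm' label for "FD" sites is carried over from the lines above, and stringsAsFactors = FALSE is an assumption added so that 'site' is a character column on older R versions.

```r
# Rebuild the example data from the question
site <- c('5.1', 'CD 1.1', 'FD 1', 'FD 2', 'FD 3', 'FD 4', 'FD 5', 'FD 6')
year <- c(2011, 2013, 2010, 2010, 2010, 2010, 2010, 2010)
df <- data.frame(site, year, stringsAsFactors = FALSE)

# grepl() gives TRUE/FALSE per row; rows matching neither pattern become NA
df$diveLocation <- ifelse(grepl("^CD", df$site), "compliance",
                   ifelse(grepl("^FD", df$site), "Farm", NA_character_))
df
```

Anchoring the patterns with ^ restricts the match to the start of 'site', so a "CD" or "FD" appearing later in the string would not be picked up.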

Upvotes: 0

Henry Cyranka

Reputation: 3060

Using the tidyverse packages, you can combine case_when and str_detect:

library(tidyverse)

site<-  c('5.1',    'CD 1.1',   'FD 1', 'FD 2', 'FD 3', 'FD 4',  
          'FD 5', 'FD 6')
year<-  c(2011, 2013,   2010,   2010,   2010,   2010,   2010,   2010)
diveLocation<-  NA


df <- data.frame(site, year, diveLocation) %>% as_tibble()


new_df <- df %>%
    mutate(diveLocation = case_when(
        str_detect(site, pattern = "C") ~ "compliance",
        str_detect(site, pattern = "F") ~ "farm",
        TRUE ~ NA_character_
    ))

new_df

Upvotes: 1

Hunaidkhan

Reputation: 1418

This code should do the job for you. It checks the first character of 'site' and fills every row that matches neither letter with NA.

site<-  c('5.1',    'CD 1.1',   'FD 1', 'FD 2', 'FD 3', 'FD 4',  
          'FD 5', 'FD 6')
year<-  c(2011, 2013,   2010,   2010,   2010,   2010,   2010,   2010)
diveLocation<-  NA

df = data.frame(site, year, diveLocation)

# Use NA_character_ (a real missing value), not the string "NA", for non-matching rows
df$diveLocation <- ifelse(substr(df$site, 1, 1) == "C", "compliance",
                   ifelse(substr(df$site, 1, 1) == "F", "Farm", NA_character_))
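
When choosing the fallback value for ifelse, it may help to see that a real NA and the string "NA" behave differently: only the former is recognised by is.na. A small sketch with two hypothetical vectors (x_missing and x_string are illustrative names, not part of the question's data):

```r
x_missing <- c("compliance", NA)     # second element is a real missing value
x_string  <- c("compliance", "NA")   # second element is just a two-character string

is.na(x_missing)   # FALSE  TRUE
is.na(x_string)    # FALSE FALSE
```

Since the question asks for the remaining rows to be populated with NAs, a real missing value is probably what is wanted in 'diveLocation'.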

Upvotes: 0
