Reputation: 199
I am trying to populate a new variable based on whether the observation in a different variable contains a certain character. I have tried the following code:
site <- c('5.1', 'CD 1.1', 'FD 1', 'FD 2', 'FD 3', 'FD 4',
          'FD 5', 'FD 6')
year <- c(2011, 2013, 2010, 2010, 2010, 2010, 2010, 2010)
diveLocation <- NA
df <- data.frame(site, year, diveLocation)
df$diveLocation <- as.character(df$diveLocation)
df$diveLocation <- gsub("^C\\w+", "compliance", df$site)
head(df)
Which gives:
    site year   diveLocation
1    5.1 2011            5.1
2 CD 1.1 2013 compliance 1.1
3   FD 1 2010           FD 1
4   FD 2 2010           FD 2
5   FD 3 2010           FD 3
6   FD 4 2010           FD 4
The only positive is that "compliance" has populated 'diveLocation'. However, I only want the string "compliance" on its own (i.e. without the "1.1" carried over from 'site'), and I don't want the other 'site' observations (e.g. 5.1) copied across to 'diveLocation'; those rows should just be NA. Any advice would be very much appreciated!
Upvotes: 0
Views: 1345
Reputation: 887138
We can use grep to create a numeric index, subset 'site' with that index, and assign the value to the corresponding elements of 'diveLocation':
i1 <- grep("^CD", df$site)
df$diveLocation[i1] <- 'compliance'
df
#    site year diveLocation
#1    5.1 2011         <NA>
#2 CD 1.1 2013   compliance
#3   FD 1 2010         <NA>
#4   FD 2 2010         <NA>
#5   FD 3 2010         <NA>
#6   FD 4 2010         <NA>
#7   FD 5 2010         <NA>
#8   FD 6 2010         <NA>
i2 <- grep("^FD", df$site)
df$diveLocation[i2] <- 'Farm'
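If there are more prefixes to map, the same grep-and-assign idea could be looped over a named vector of pattern/label pairs. This is only a sketch: the `labels` vector is a hypothetical helper, and the data is a shortened copy of the example from the question.

```r
# Shortened copy of the example data; diveLocation starts as character NA
site <- c('5.1', 'CD 1.1', 'FD 1', 'FD 2')
df <- data.frame(site, diveLocation = NA_character_, stringsAsFactors = FALSE)

# Hypothetical lookup: names are regex patterns, values are the labels to assign
labels <- c("^CD" = "compliance", "^FD" = "Farm")

for (p in names(labels)) {
  # grep() returns the row numbers whose site matches the pattern;
  # rows that match nothing are left as NA
  df$diveLocation[grep(p, df$site)] <- labels[[p]]
}
df
```

Rows whose 'site' matches none of the patterns keep their NA.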
Or using data.table
library(data.table)
setDT(df)[grep("^CD", site), diveLocation := 'compliance'][]
Upvotes: 0
Reputation: 3060
Using tidyverse packages and a combination of case_when and str_detect:
library(tidyverse)
site <- c('5.1', 'CD 1.1', 'FD 1', 'FD 2', 'FD 3', 'FD 4',
          'FD 5', 'FD 6')
year <- c(2011, 2013, 2010, 2010, 2010, 2010, 2010, 2010)
diveLocation <- NA
df <- data.frame(site, year, diveLocation) %>% as_tibble()
new_df <- df %>%
  mutate(diveLocation = case_when(
    str_detect(site, pattern = "C") ~ "compliance",
    str_detect(site, pattern = "F") ~ "farm",
    TRUE ~ NA_character_
  ))
new_df
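One thing to watch: an unanchored pattern such as "C" matches a C anywhere in the string, not only at the start. A small sketch of the difference, using base R's grepl for the check and a made-up site name 'FD C1' that is not in the original data:

```r
# 'FD C1' is a hypothetical site name, added only to show the difference
x <- c("CD 1.1", "FD C1", "5.1")

grepl("C", x)    # unanchored: TRUE TRUE FALSE
grepl("^C", x)   # anchored to the start: TRUE FALSE FALSE
```

If only leading letters should count, anchoring the patterns with "^" (e.g. "^C", "^F") may be the safer choice.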
Upvotes: 1
Reputation: 1418
This code should do the job:
site <- c('5.1', 'CD 1.1', 'FD 1', 'FD 2', 'FD 3', 'FD 4',
          'FD 5', 'FD 6')
year <- c(2011, 2013, 2010, 2010, 2010, 2010, 2010, 2010)
diveLocation <- NA
df <- data.frame(site, year, diveLocation)
# Check the first character of site; NA_character_ gives a real missing value
# rather than the text "NA"
df$diveLocation <- ifelse(substr(df$site, 1, 1) == "C", "compliance",
                   ifelse(substr(df$site, 1, 1) == "F", "Farm", NA_character_))
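To confirm that unmatched rows end up as real missing values rather than the text "NA", is.na() gives a quick check. A minimal sketch on a shortened copy of the data (the `first` variable is just an illustrative name):

```r
site <- c('5.1', 'CD 1.1', 'FD 1')
first <- substr(site, 1, 1)   # first character of each site

diveLocation <- ifelse(first == "C", "compliance",
                ifelse(first == "F", "Farm", NA_character_))

is.na(diveLocation)   # TRUE FALSE FALSE
is.na("NA")           # FALSE: a quoted "NA" is ordinary text, not a missing value
```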
Upvotes: 0