Manasi Shah

Reputation: 437

Creating a new variable based on string matching

I have the following dataframe:

df <- data.frame(Sample_name = c("01_00H_NA_DNA", "01_00H_NA_RNA", "01_00H_NA_S", "01_00H_NW_DNA", "01_00H_NW_RNA", "01_00H_NW_S", "01_00H_OM_DNA", "01_00H_OM_RNA", "01_00H_OM_S", "01_00H_RL_DNA", "01_00H_RL_RNA", "01_00H_RL_S"),
                 Pair = c("", "", "S1", "", "", "S2", "", "", "S3", "", "", "S5"))

I am trying to create a new variable, treatment, based on Sample_name. I used the following code:

df$treatment <- ifelse(grep("_NA_", df$sample_name, ignore.case = T), "nat",
                       ifelse(grep("_NW_", df$sample_name, ignore.case = T), "natH2",
                              ifelse(grep("_RL_", df$sample_name, ignore.case = T), "RNALat",
                                     ifelse(grep("_OM_", df$sample_name, ignore.case = T), "Om"))))

I don't understand what I am doing wrong here. I got the following error:

Error in $<-.data.frame(*tmp*, "treatment", value = logical(0)) : replacement has 0 rows, data has 12

Any suggestions?

Upvotes: 0

Views: 1706

Answers (1)

Manasi Shah

Reputation: 437

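Got the answer: replace each grep() call with grepl(), which returns one TRUE/FALSE per row instead of the positions of the matches. Also note that the column is named Sample_name with a capital S, and $ is case-sensitive, so df$sample_name is NULL:

# Nested ifelse(): the first pattern that matches decides the label; "NA" is the fallback string
df$treatment <- ifelse(grepl("_NA_", df$Sample_name, ignore.case = TRUE), "nat",
                       ifelse(grepl("_NW_", df$Sample_name, ignore.case = TRUE), "natH2",
                              ifelse(grepl("_RL_", df$Sample_name, ignore.case = TRUE), "RNALat",
                                     ifelse(grepl("_OM_", df$Sample_name, ignore.case = TRUE), "Om", "NA"))))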
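To see why grep() does not work as the test in ifelse(), and why the original error mentions 0 rows, here is a minimal sketch. It uses a shortened four-row version of the data frame, and the variable names idx, hit and missing_col are only illustrative:

```r
# Shortened example data (an assumption: four of the original twelve rows)
df <- data.frame(
  Sample_name = c("01_00H_NA_DNA", "01_00H_NW_RNA", "01_00H_OM_S", "01_00H_RL_DNA"),
  stringsAsFactors = FALSE
)

# grep() returns the positions of the matching elements,
# so its length usually differs from nrow(df)
idx <- grep("_NA_", df$Sample_name)

# grepl() returns one TRUE/FALSE per element, which is what ifelse() expects
hit <- grepl("_NA_", df$Sample_name)

# `$` is case-sensitive: df$sample_name is NULL here, and grepl() on NULL
# gives a zero-length logical vector, hence "replacement has 0 rows"
missing_col <- grepl("_NA_", df$sample_name)

print(idx)
print(hit)
print(length(missing_col))
```

Because ifelse() returns a vector of the same length as its test, only the grepl() version yields one value per row that can be assigned to a new column.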

Upvotes: 2
