Reputation: 3941
Leapfrogging from a previous question, I'm having a problem with the proper regular expression syntax to isolate a specific word.
Given a data frame:
DL<-c("Dark_ark","Light-Lis","dark7","DK_dark","The_light","Lights","Lig_dark","D_Light")
Col1<-c(1,12,3,6,4,8,2,8)
DF<-data.frame(Col1)
row.names(DF)<-DL
I'm looking to extract every "Dark" and "Light" (ignoring upper vs. lower case) from the row names and make a second column containing only the string "Dark" or "Light":
Col2<-c("Dark","Light","dark","dark","light","Light","dark","Light")
DF$Col2<-Col2
          Col1  Col2
Dark_ark     1  Dark
Light-Lis   12 Light
dark7        3  dark
DK_dark      6  dark
The_light    4 light
Lights       8 Light
Lig_dark     2  dark
D_Light      8 Light
I've changed the original data a bit to illustrate my current issue, but working off an excellent answer from Tyler Rinker, I used this:
DF$Col2<-gsub("[^dark|light]", "", row.names(DF), ignore.case = TRUE)
But gsub gets tripped up on the letters the strings have in common. Searching the message boards for how to isolate an exact word with a regex, it looks like the answer should be to use a double backslash with either
\\<light\\>
or
\\blight\\b
So why does the line
DF$Col2<-gsub("[^\\<dark\\>|\\<light\\>]", "", row.names(DF), ignore.case = TRUE)
not pull the desired column above? Instead I get:
          Col1    Col2
Dark_ark     1 Darkark
Light-Lis   12 LightLi
dark7        3    dark
DK_dark      6  DKdark
The_light    4 Thlight
Lights       8   Light
Lig_dark     2 Ligdark
D_Light      8  DLight
Upvotes: 5
Views: 1950
Reputation: 121608
One option is to use the stringr package:
library(stringr)
str_extract(tolower(rownames(DF)),'dark|light')
[1] "dark" "light" "dark" "dark" "light" "light" "dark" "light"
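Or, better, using @Arun's suggestion, which keeps the original case (newer stringr versions replace ignore.case() with regex('dark|light', ignore_case = TRUE)):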
str_extract(rownames(DF), ignore.case('dark|light'))
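If adding a package is not an option, roughly the same "first match, or NA when nothing matches" behaviour can be sketched in base R. This is only an illustration: the vector nm and the extra name "Grey" are hypothetical, added here to show what happens to a row with no match.

```r
# Row names from the question, plus one hypothetical name with no match
nm <- c("Dark_ark", "Light-Lis", "dark7", "DK_dark",
        "The_light", "Lights", "Lig_dark", "D_Light", "Grey")

# regexpr() gives the position of the first match per element, or -1 if none
m <- regexpr("dark|light", nm, ignore.case = TRUE)

# Start with NA everywhere, then fill in only the elements that matched;
# regmatches() with regexpr() data returns just the matched substrings
first_match <- rep(NA_character_, length(nm))
first_match[m != -1] <- regmatches(nm, m)

print(first_match)
```

Because the result always has one element per input name, it can be assigned straight to a data frame column without the lengths getting out of step.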
Upvotes: 5
Reputation: 118889
How about this?
unlist(regmatches(rownames(DF), gregexpr("dark|light", rownames(DF), ignore.case=TRUE)))
# [1] "Dark" "Light" "dark" "dark" "light" "Light" "dark" "Light"
or
gsub(".*(dark|light).*$", "\\1", row.names(DF), ignore.case = TRUE)
# [1] "Dark" "Light" "dark" "dark" "light" "Light" "dark" "Light"
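As for why the original attempt fails: square brackets define a character class, not a set of alternative words. [^dark|light] matches any single character that is not one of d, a, r, k, |, l, i, g, h, t, so gsub only deletes the characters outside that set, and putting \\< and \\> inside the brackets just adds more literal characters to the set. A minimal sketch of the difference, using the row names from the question (the names class_result and group_result are only for illustration):

```r
# Row names from the question
DL <- c("Dark_ark", "Light-Lis", "dark7", "DK_dark",
        "The_light", "Lights", "Lig_dark", "D_Light")

# Character class: removes every character that is NOT one of
# d, a, r, k, |, l, i, g, h, t (case-insensitively), one character at a time
class_result <- gsub("[^dark|light]", "", DL, ignore.case = TRUE)

# Alternation inside a capture group: the whole string is matched and
# replaced by the captured word, so only "dark" or "light" survives
group_result <- sub(".*(dark|light).*", "\\1", DL, ignore.case = TRUE)

print(class_result)
print(group_result)
```

Note that word-boundary anchors would not help with this data either: "_" counts as a word character, and "Lights" has no boundary after "Light", so \\blight\\b would miss names such as "The_light" and "Lights".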
Upvotes: 9