Reputation: 223
So I have a list of names, and I want to extract the first character of the last word in the name. I can get the last word, but not the first character of the last word.
species <- c("ACHILLEA MILLEFOLIUM VAR. BOREALIS",
"ACHILLEA MILLEFOLIUM VAR. MILLEFOLIUM",
"ALLIUM SCHOENOPRASUM VAR. SIBIRICUM")
library(stringr)
# I can get the last word
str_extract(species, "\\w+$")
[1] "BOREALIS" "MILLEFOLIUM" "SIBIRICUM"
What I want is [1] "B" "M" "S"
Upvotes: 1
Views: 159
Reputation: 163362
With str_extract you could also assert a whitespace boundary to the left, match a single word character, and assert that only optional word characters follow up to the end of the string.
If you want to match any non-whitespace character, you can use \\S instead of \\w.
library(stringr)
species <- c("ACHILLEA MILLEFOLIUM VAR. BOREALIS",
"ACHILLEA MILLEFOLIUM VAR. MILLEFOLIUM",
"ALLIUM SCHOENOPRASUM VAR. SIBIRICUM")
str_extract(species, "(?<!\\S)\\w(?=\\w*$)")
Output
[1] "B" "M" "S"
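To illustrate the difference between the \\w and \\S variants, here is a minimal base R sketch. The helper first_match() and the second species name are made up for illustration, and regexpr() needs perl = TRUE for the lookarounds to work.

```r
# Hypothetical helper: first regex match per element, NA where nothing matches
# (similar in spirit to stringr::str_extract, but using base R only).
first_match <- function(x, pattern) {
  m <- regexpr(pattern, x, perl = TRUE)  # perl = TRUE enables lookarounds
  out <- rep(NA_character_, length(x))
  hit <- which(m != -1)                  # regexpr() returns -1 for no match
  out[hit] <- regmatches(x, m)           # regmatches() returns only the hits
  out
}

# The second name is invented for illustration; its last word has a hyphen.
species <- c("ACHILLEA MILLEFOLIUM VAR. BOREALIS",
             "EXAMPLE NAME VAR. PSEUDO-ALPINA")

w_result <- first_match(species, "(?<!\\S)\\w(?=\\w*$)")
s_result <- first_match(species, "(?<!\\S)\\S(?=\\S*$)")

print(w_result)
print(s_result)
```

With this input, the \\w pattern finds nothing in the hyphenated name because the hyphen is not a word character, while the \\S pattern returns "P".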
Upvotes: 1
Reputation: 1056
This might not be the most elegant solution, but you can always pipe the result into str_extract() a second time to get the first character of the last word.
library(stringr)
species <- c("ACHILLEA MILLEFOLIUM VAR. BOREALIS",
"ACHILLEA MILLEFOLIUM VAR. MILLEFOLIUM",
"ALLIUM SCHOENOPRASUM VAR. SIBIRICUM")
str_extract(species, "(\\w+$)") |>
  str_extract("^[A-Z]")
[1] "B" "M" "S"
Upvotes: 2
Reputation: 887128
We can capture the first non-whitespace character (\\S) of the last word, followed by one or more non-whitespace characters (\\S+) up to the end ($) of the string, and replace the whole match with the backreference (\\1) to the captured group.
sub(".*\\s+(\\S)\\S+$", "\\1", species)
[1] "B" "M" "S"
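One caveat worth checking: sub() returns its input unchanged when the pattern does not match, which happens here if a name has only one word or if the last word is a single character. A possible workaround, sketched with a different technique (strip everything up to the last whitespace, then take the first character with substr()), could look like this. The extra names are made up for illustration.

```r
# The second and third names are invented edge cases.
x <- c("ACHILLEA MILLEFOLIUM VAR. BOREALIS",
       "ACHILLEA",   # only one word
       "ALLIUM X")   # last word is a single character

# Pattern from the answer: needs whitespace plus at least two characters
# in the last word, otherwise the input comes back unchanged.
unmatched <- sub(".*\\s+(\\S)\\S+$", "\\1", x)

# Alternative sketch: drop everything up to the last whitespace,
# then keep the first character of what remains.
robust <- substr(sub(".*\\s", "", x), 1, 1)

print(unmatched)
print(robust)
```

This sketch assumes there is no trailing whitespace; if there might be, wrapping x in trimws() first would be a reasonable precaution.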
Upvotes: 2