Reputation: 1773
I would like to extract the second-to-last segment of a URL, that is, the string between the last two '/' symbols. For example,
url <- c('https://example.com/names/ani/digitalcod-org', 'https://example.com/names/bmc/ambulancecod.org')
df <- data.frame(url)
I want to extract the second word from the end, the one between the last two '/' characters, to get the words 'ani' and 'bmc'.
So I tried this:
library(stringr)
df$name<- word(df$url,-2)
I need output as follows:
name
ani
bmc
Upvotes: 3
Views: 1278
Reputation: 26353
A non-regex approach using basename
basename(mapply(sub, pattern = basename(url), replacement = "", x = url, fixed = TRUE))
#[1] "ani" "bmc"
basename(url) "removes all of the path up to and including the last path separator (if any)" and returns
[1] "digitalcod-org" "ambulancecod.org"
Use mapply to replace this result with "" for every element in url, and call basename again.
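A shorter base R variation on the same idea, offered as a sketch rather than as this answer's method: dirname() drops the last path component, so calling basename() on its result returns the second-to-last one. The name column follows the question; stringsAsFactors = FALSE is an assumption so that the column stays character on R versions before 4.0.

```r
url <- c('https://example.com/names/ani/digitalcod-org',
         'https://example.com/names/bmc/ambulancecod.org')
df <- data.frame(url, stringsAsFactors = FALSE)

# dirname() strips the last component; basename() then returns
# the last component of what remains
df$name <- basename(dirname(df$url))
df$name
# [1] "ani" "bmc"
```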
Upvotes: 0
Reputation: 5281
Here is a solution using strsplit
words <- strsplit(url, '/')
L <- lengths(words)
vapply(seq_along(words), function (k) words[[k]][L[k]-1], character(1))
# [1] "ani" "bmc"
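To see why the index L[k]-1 works, it can help to look at one split URL. The sketch below also wraps the idea in a helper; the name second_last and the NA fallback for inputs with fewer than two pieces are assumptions for illustration, not part of the answer above.

```r
url <- c('https://example.com/names/ani/digitalcod-org',
         'https://example.com/names/bmc/ambulancecod.org')

# Splitting the first URL gives six pieces; the empty string comes
# from the "//" after "https:"
strsplit(url[1], '/', fixed = TRUE)[[1]]
# pieces: "https:", "", "example.com", "names", "ani", "digitalcod-org"

# Hypothetical helper: second-to-last piece, or NA when there are fewer than two
second_last <- function(x) {
  parts <- strsplit(x, '/', fixed = TRUE)
  vapply(parts, function(p) {
    n <- length(p)
    if (n < 2) NA_character_ else p[n - 1]
  }, character(1))
}

second_last(url)
# [1] "ani" "bmc"
```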
Upvotes: 0
Reputation: 43169
Use gsub with the pattern .*?([^/]+)/[^/]+$ in R:
urls <- c('https://example.com/names/ani/digitalcod-org','https://example.com/names/bmc/ambulancecod.org' )
gsub(".*?([^/]+)/[^/]+$", "\\1", urls)
This yields
[1] "ani" "bmc"
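Reading the pattern piece by piece may help. The sketch below reuses the same pattern with sub(), which is enough here because the trailing $ allows at most one match per string; perl = TRUE is an assumption made only to pin down how the lazy .*? is interpreted. A string with no '/' does not match and is returned unchanged, so it would need separate handling.

```r
urls <- c('https://example.com/names/ani/digitalcod-org',
          'https://example.com/names/bmc/ambulancecod.org')

pattern <- ".*?([^/]+)/[^/]+$"
# .*?      lazily skip the leading part of the URL
# ([^/]+)  capture a run of non-slash characters (the wanted segment)
# /[^/]+$  then one slash and the final segment up to the end of the string

sub(pattern, "\\1", urls, perl = TRUE)
# [1] "ani" "bmc"

# No match: the input comes back untouched
sub(pattern, "\\1", "no-slashes-here", perl = TRUE)
# [1] "no-slashes-here"
```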
Upvotes: 0
Reputation: 51592
You can use word, but you need to specify the separator (the default is a single space):
library(stringr)
word(url, -2, sep = '/')
#[1] "ani" "bmc"
Upvotes: 5
Reputation: 13319
Try this, using str_extract_all from stringr:
library(stringr)
as.data.frame(sapply(str_extract_all(df$url,"\\w{2,}(?=\\/)"),"["))[3,]
# V1 V2
#3 ani bmc
as.data.frame(sapply(str_extract_all(df$url,"\\w{2,}(?=\\/)"),"["))[2:3,]
# V1 V2
#2 names names
#3 ani bmc
Upvotes: 1