Reputation:
I want to remove everything before /query in a URL.
I have no experience with regular expressions, so this is difficult for me.
Note: the reference point should be /query, because the links may follow different patterns, such as www.abcd.wsd/asd/asdcd/asrr/query=xyz
For example,
www.html.com/query=abcd
should result in
query=abcd
Upvotes: 2
Views: 727
Reputation: 27398
Another option is:
sub('.*/query', '/query', 'www.html.com/query=abcd')
i.e., replace "all characters up to and including [the last instance of] /query" with "/query".
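As a quick check of the greedy behaviour described above, here is a minimal sketch run on both URLs from the question. The variable names `urls`, `with_slash`, and `without_slash` are only illustrative. Since the question asks for the result without the leading slash, the second call replaces with "query" instead of "/query".

```r
urls <- c("www.abcd.wsd/asd/asdcd/asrr/query=xyz",
          "www.html.com/query=abcd")

# Greedy .* consumes everything up to the last "/query", which is then put back
with_slash <- sub(".*/query", "/query", urls)

# Replacing with "query" instead drops the leading slash
without_slash <- sub(".*/query", "query", urls)

print(with_slash)
# [1] "/query=xyz"  "/query=abcd"
print(without_slash)
# [1] "query=xyz"  "query=abcd"
```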
Upvotes: 1
Reputation: 627110
A generic regex solution that extracts the query substring appearing after the last / and followed by characters other than / is:
s <- c("www.abcd.wsd/asd/asdcd/asrr/query=xyz","www.html.com/query=abcd","www.cmpnt.com/query=fgh/noquery=dd")
sub("^.*/(query[^/]*).*$", "\\1", s)
## => "query=xyz" "query=abcd" "query=fgh"
The regex is
^.*/(query[^/]*).*$
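To inspect what capture group 1 holds without doing a replacement, one possible sketch uses base R's `regexec` together with `regmatches`. This is an alternative way of looking at the same pattern, not part of the original answer; the names `m` and `captured` are illustrative.

```r
s <- c("www.abcd.wsd/asd/asdcd/asrr/query=xyz",
       "www.html.com/query=abcd",
       "www.cmpnt.com/query=fgh/noquery=dd")

# regexec returns the positions of the full match and of each capture group;
# regmatches turns them into a list of character vectors, one per input string
m <- regmatches(s, regexec("^.*/(query[^/]*).*$", s))

# Element 1 of each vector is the full match, element 2 is capture group 1
captured <- vapply(m, function(x) x[2], character(1))
print(captured)
# [1] "query=xyz"  "query=abcd" "query=fgh"
```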
Details:
^ - start of string
.* - match any 0+ characters, as many as possible, up to the last
/ - a literal forward slash char
(query[^/]*) - capture group 1 matching a query substring followed with 0+ characters other than / (see the [^/]* negated character class with a * quantifier)
.* - zero or more any characters up to
$ - the end of string.
Upvotes: 2
Reputation: 4554
string<-c('www.abcd.wsd/asd/asdcd/asrr/query=xyz','www.html.com/query=abcd')
gsub('.*\\/([^/]+)$','\\1',string)
#[1] "query=xyz" "query=abcd"
Upvotes: 0
Reputation: 56219
We can abuse the basename function, which is intended to return the file name with all leading folders dropped:
basename("www.abcd.wsd/asd/asdcd/asrr/query=xyz")
# [1] "query=xyz"
basename("www.html.com/query=abcd")
# [1] "query=abcd"
Note that this will fail when query is not at the end:
basename("www.html.com/query=abcd/xyz")
# [1] "xyz"
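If the query component may sit in the middle of the path, one possible workaround (a sketch, not part of the basename approach itself) is to split on "/" and keep the piece that starts with "query". The names `url`, `parts`, and `query_part` are illustrative.

```r
url <- "www.html.com/query=abcd/xyz"

# Split the path into its components on the literal "/"
parts <- strsplit(url, "/", fixed = TRUE)[[1]]

# Keep only the component(s) beginning with "query"
query_part <- parts[grepl("^query", parts)]
print(query_part)
# [1] "query=abcd"
```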
Upvotes: 2