Achal Neupane

Reputation: 5719

string split at the last (also at any nth) delimiter in R and remove the string before the delimiter

I have a character vector vec. For each element I need to remove everything up to and including the last "/" and keep the remaining string. Please note that I can't use Perl-compatible regexps (i.e. perl=FALSE). I would also like to see how to do the same for the nth delimiter.

vec<-c("/apple/pineapple/mango/reg.sh_ttgs.pos","/apple/pipple/mgo/deh_ttgs.pos")

Result for the last delimiter

reg.sh_ttgs.pos, deh_ttgs.pos

Result for the 2nd delimiter

pineapple/mango/reg.sh_ttgs.pos, pipple/mgo/deh_ttgs.pos

and so on..

Upvotes: 2

Views: 1125

Answers (2)

LyzandeR

Reputation: 37879

One way could be to use a function like this, where gregexpr finds the positions of every '/' in each element and substring keeps everything after the chosen one:

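get_string <- function(vec, n) {
  # gregexpr() returns the positions of every '/' in each element of vec
  if (n == 'last') {
    # start one character after the last '/'
    positions <- lapply(gregexpr(pattern = '/', vec), function(x) x[length(x)] + 1)
  } else {
    # start one character after the nth '/'
    positions <- lapply(gregexpr(pattern = '/', vec), function(x) x[n] + 1)
  }
  substring(vec, positions)
}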

Output:

> get_string(vec, 2)
[1] "pineapple/mango/reg.sh_ttgs.pos" "pipple/mgo/deh_ttgs.pos"        
> get_string(vec, 'last')
[1] "reg.sh_ttgs.pos" "deh_ttgs.pos"   

You either specify the nth '/' or just specify 'last' if you want just the last part of the path.

Note: I am using an if-else statement above in case the number of '/' characters (and therefore the index of the last one) differs between the elements of your actual vector. If the number of '/' characters is always the same across all elements, only lapply(gregexpr(pattern ='/',vec), function(x) x[n] + 1) is needed.
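
To illustrate the note above, here is a small sketch of what can happen when the elements do not contain the same number of '/' characters. The vector vec2 and the variable names are made up for this example and are not part of the original question; vapply is used instead of lapply only to get a plain numeric vector back.

```r
# Hypothetical input: the two paths contain different numbers of "/".
vec2 <- c("/a/b/c/file1.pos", "/a/file2.pos")

# Positions of every "/" in each element.
slashes <- gregexpr(pattern = "/", vec2)

# A fixed index (here the 4th "/") only exists in the first element,
# so the second position is NA and substring() returns NA for it.
fixed_pos <- vapply(slashes, function(x) x[4] + 1, numeric(1))
res_fixed <- substring(vec2, fixed_pos)

# Indexing with length(x) finds the last "/" of each element separately.
last_pos <- vapply(slashes, function(x) x[length(x)] + 1, numeric(1))
res_last <- substring(vec2, last_pos)

print(res_fixed)
print(res_last)
```

With a fixed index the second element comes back as NA, whereas the length(x) version returns the file name for both elements, which is why the 'last' branch is handled separately.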

Upvotes: 2

jazzurro

Reputation: 23574

Alternatively, you can use char2end() in the qdap package. You pass the delimiter and the occurrence of that delimiter to split at (1st, 2nd, etc.), and the function returns everything after it.

library(qdap)

For the 2nd delimiter,

char2end(vec, "/", 2)
#[1] "pineapple/mango/reg.sh_ttgs.pos" "pipple/mgo/deh_ttgs.pos"

For the last delimiter (the 4th "/" in both elements of this vector),

char2end(vec, "/", 4)
#[1] "reg.sh_ttgs.pos" "deh_ttgs.pos" 
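
The 4 used for the last delimiter is specific to this vector. As a rough base R sketch (not part of qdap; the names n_slashes and same_count are made up here), one way to check how many "/" each element contains before hard-coding that number might be:

```r
vec <- c("/apple/pineapple/mango/reg.sh_ttgs.pos", "/apple/pipple/mgo/deh_ttgs.pos")

# Count the "/" in each element. fixed = TRUE matches the literal character.
# Caveat: an element with no "/" at all would also report 1, because
# gregexpr() returns a single -1 when nothing matches.
n_slashes <- lengths(gregexpr("/", vec, fixed = TRUE))
print(n_slashes)
# [1] 4 4

# TRUE when every element has the same count, so one index fits all of them.
same_count <- length(unique(n_slashes)) == 1
print(same_count)
# [1] TRUE
```

If same_count is FALSE, a single number no longer points at the last delimiter in every element.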

Upvotes: 5
