Strobila

Reputation: 317

R - Extract text between symbol or delimiter '/'

I have several character vectors like this one:

str <- c("AT/FBA/1/12/360/26/SF/96", "AT/RLMW/1/12/360/44/SF/122", "AT/ACR/1/12/362/66/SF/175", "AT/AA/1/12/363/72/SF/281", "AT/BB/1/12/364/90/SF/310", "AT/ANT/1/123/364/92/SF/338")

N.B.: each argument between the '/' delimiters may vary in length (number of characters).

I want to extract the 5th and 6th arguments, as delimited by '/'.

For example, in this case the desired result is:

"360/26", "360/44", "362/66", "363/72", "364/90", "364/92"

I looked at the answers to these similar questions: "Extract text after a symbol in R" and "Extracting part of string by position in R".

I tried to use:

sub("^([^/]+/){4}([^/]+).*", "\\2", str)

but it gives me only the 5th argument, as follows:

[1] "360" "360" "362" "363" "364" "364" "364" "365" "365" "366" "365" "002" "002" "002" "002" "003"
 [17] "003" "003" "004" "004" "004" "005"

Then I tried:

scan(text=str, sep="/", what="", quiet=TRUE)[c(5:6)]

but it gives me just two arguments (the 5th and 6th of the first string only), as separate elements without the '/' delimiter.

Upvotes: 1

Views: 1058

Answers (4)

ktiu

Reputation: 2636

A simple regex solution would be:

sub("^([^/]*/){4}([^/]*/[^/]*)/.*", "\\2", str)

returning the desired

[1] "360/26" "360/44" "362/66" "363/72" "364/90"
[6] "364/92"
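
To make the pattern above easier to follow, here is a commented breakdown. It is only a sketch: it uses the first two strings from the question, and the six-field `short` example is a hypothetical input added to show that the pattern expects a '/' after the 6th argument.

```r
# Sample input: the first two strings from the question
str <- c("AT/FBA/1/12/360/26/SF/96", "AT/RLMW/1/12/360/44/SF/122")

# ^([^/]*/){4}    skips the first four arguments and their trailing slashes
# ([^/]*/[^/]*)   captures argument 5, a slash, and argument 6 as group 2
# /.*             consumes the rest; this requires a slash after argument 6
pattern <- "^([^/]*/){4}([^/]*/[^/]*)/.*"
result <- sub(pattern, "\\2", str)
print(result)

# Hypothetical input with only six arguments: there is no slash after
# argument 6, so the pattern does not match and sub() returns it unchanged
short <- sub(pattern, "\\2", "AT/FBA/1/12/360/26")
print(short)
```

If some strings might end right after the 6th argument, the final `/.*` could be made optional, for example with `(/.*)?$`.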

Upvotes: 2

Anoushiravan R

Reputation: 21938

Here is a tidyverse solution I thought you could also use:

library(dplyr)
library(tidyr)

str %>%
  as_tibble() %>%
  separate(value, into = LETTERS[1:8], sep = "\\/") %>%
  select(5, 6) %>%
  unite("Extract", c("E", "F"), sep = "/")

# A tibble: 6 x 1
  Extract
  <chr>  
1 360/26 
2 360/44 
3 362/66 
4 363/72 
5 364/90 
6 364/92 

Upvotes: 0

G. Grothendieck

Reputation: 269905

Use read.table like this:

with(read.table(text = str, sep = "/"), paste(V5, V6, sep = "/"))
## [1] "360/26" "360/44" "362/66" "363/72" "364/90" "364/92"
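
One possible caveat: by default read.table converts all-numeric columns to numbers, so values with leading zeros (the question's output shows values such as "002") would lose them. A hedged variant, assuming the arguments should be kept exactly as text, passes colClasses = "character". The second input string below is hypothetical and only there to show the effect.

```r
# First string is from the question; the second is a hypothetical
# example with leading zeros in arguments 5 and 6
str <- c("AT/FBA/1/12/360/26/SF/96", "AT/XX/1/12/002/07/SF/12")

# Read every column as character so "002" is not converted to 2
fields <- read.table(text = str, sep = "/", colClasses = "character")
result <- paste(fields$V5, fields$V6, sep = "/")
print(result)

# For comparison: the default type conversion drops the leading zeros
default_result <- with(read.table(text = str, sep = "/"),
                       paste(V5, V6, sep = "/"))
print(default_result)
```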

Upvotes: 2

Karthik S

Reputation: 11596

Does this work for you?

apply(sapply(strsplit(str, split = '/'), '[', c(5,6)),2, function(x) paste(x, collapse = '/'))
[1] "360/26" "360/44" "362/66" "363/72" "364/90" "364/92"
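
The same strsplit idea can be written in a single pass, without the intermediate matrix. This is a sketch of an equivalent form, not a change to the approach: vapply with character(1) also checks that each string yields exactly one result, and fixed = TRUE treats '/' as a literal rather than a regex.

```r
# Input vector from the question
str <- c("AT/FBA/1/12/360/26/SF/96", "AT/RLMW/1/12/360/44/SF/122",
         "AT/ACR/1/12/362/66/SF/175", "AT/AA/1/12/363/72/SF/281",
         "AT/BB/1/12/364/90/SF/310", "AT/ANT/1/123/364/92/SF/338")

# Split each string on the literal '/', giving a list of character vectors
parts <- strsplit(str, "/", fixed = TRUE)

# Keep arguments 5 and 6 of each string and rejoin them with '/'
result <- vapply(parts, function(x) paste(x[5:6], collapse = "/"),
                 character(1))
print(result)
```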

Upvotes: 1
