flee

Reputation: 1335

Extract characters between first and third period

As the title says, I have a character vector, and for each element I want to extract everything between the first and third period. For example:

s <- c("random.0.0.word.1.0", "different.0.02.words.15.6", "different.0.1.words.4.2")

The result should be:

"0.0" "0.02" "0.1" 

I have tried adapting code from here and here but failed. Any advice much appreciated!

Upvotes: 2

Views: 501

Answers (4)

moodymudskipper

Reputation: 47340

Here's a way with unglue, which some might find less intimidating:

library(unglue)
s <- c("random.0.0.word.1.0", "different.0.02.words.15.6", "different.0.1.words.4.2")
unglue_vec(s, "{=[^.]+}.{x}.{=[^.]+}.{=[^.]+}.{=[^.]+}")
#> [1] "0.0"  "0.02" "0.1"

Created on 2020-01-16 by the reprex package (v0.3.0)

The subpatterns [^.]+ match runs of non-dot characters. They are left unnamed (nothing on the left-hand side of =) because we don't want to extract them; only {x} is named, so only that piece is returned.

Upvotes: 1

Ronak Shah

Reputation: 389175

We can use sub with lazy quantifiers (.*?) to capture as little as possible between the first and third period.

sub(".*?\\.(.*?\\..*?)\\..*", "\\1", s)
#[1] "0.0"  "0.02" "0.1" 
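
To see why the lazy quantifiers matter, here is a small comparison sketch. It assumes perl = TRUE (my addition, not part of the answer above) so that the backtracking behaviour is the familiar Perl-style one; the names lazy and greedy are only illustrative.

```r
s <- c("random.0.0.word.1.0", "different.0.02.words.15.6", "different.0.1.words.4.2")

# Lazy: each .*? stops at the earliest dot it can, so the group spans
# the text between the 1st and 3rd period.
lazy <- sub(".*?\\.(.*?\\..*?)\\..*", "\\1", s, perl = TRUE)

# Greedy: the leading .* swallows as much as it can, so the group ends up
# between the 3rd and 5th period instead.
greedy <- sub(".*\\.(.*\\..*)\\..*", "\\1", s, perl = TRUE)

lazy
#[1] "0.0"  "0.02" "0.1"
greedy
#[1] "word.1"   "words.15" "words.4"
```

With the greedy version the match is pushed as far right as possible, which is why only the lazy form returns the wanted piece.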

Upvotes: 1

akrun

Reputation: 887571

We can match one or more characters that are not a dot ([^.]+) from the start (^) of the string, followed by a dot (\\.), and then capture as a group ((...)) everything between the first and the third dot. In the replacement, the backreference (\\1) refers to that captured group.

sub("^[^.]+\\.([^.]+\\.[^.]+)\\..*", "\\1", s)
#[1] "0.0"  "0.02" "0.1" 

Or it can also be done with substring after getting the positions of the dots:

lst1 <- gregexpr('.', s, fixed = TRUE)
substring(s, sapply(lst1, `[`, 1) + 1, sapply(lst1, `[`, 3) - 1)
#[1] "0.0"  "0.02" "0.1" 
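
As a rough illustration of what the position-based approach is doing, the sketch below spells out the intermediate values. The names dots, first and third are hypothetical, and vapply is used in place of sapply only to pin the result type.

```r
s <- c("random.0.0.word.1.0", "different.0.02.words.15.6", "different.0.1.words.4.2")

# One integer vector of dot positions per string (fixed = TRUE: literal dot).
dots <- gregexpr(".", s, fixed = TRUE)

# Position of the 1st and 3rd dot in each string.
first <- vapply(dots, function(p) p[1], integer(1))
third <- vapply(dots, function(p) p[3], integer(1))

first
#[1]  7 10 10
third
#[1] 11 15 14

# Keep the characters strictly between those two positions.
result <- substring(s, first + 1, third - 1)
result
#[1] "0.0"  "0.02" "0.1"
```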

Upvotes: 2

Marius

Reputation: 60160

An alternative way to do this, without using any fancy regex features, is just to split on . and then grab the bits we need:

library(stringr)
library(purrr)

# split each string on literal dots, then re-join the 2nd and 3rd pieces
str_split(s, "\\.") %>%
  map_chr(~ paste0(.[2:3], collapse = "."))
#[1] "0.0"  "0.02" "0.1"
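
If you would rather avoid the extra packages, the same split-and-rejoin idea can be sketched in base R. This is only an equivalent under the assumption that every string has at least three dot-separated pieces; parts and result are illustrative names.

```r
s <- c("random.0.0.word.1.0", "different.0.02.words.15.6", "different.0.1.words.4.2")

# Split on literal dots; fixed = TRUE avoids having to escape the dot.
parts <- strsplit(s, ".", fixed = TRUE)

# Re-join the 2nd and 3rd pieces of each string with a dot.
result <- vapply(parts, function(p) paste(p[2:3], collapse = "."), character(1))
result
#[1] "0.0"  "0.02" "0.1"
```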

Upvotes: 1
