SBista

Regex to match up to the first closing bracket

I have a character vector named cars, which is as follows:

cars
[1] "Only one car(52;model-14557) had a good engine(workable condition), others engine were damaged beyond repair"   
[2] "Other car(21, model-155) looked in good condition but car ( 36, model-8878) looked to be in terrible condition."

I need to extract the following parts from the string:

car(52;model-14557)
car(21, model-155)
car ( 36, model-8878)

I tried using the following piece of code to extract them:

stringr::str_extract_all(cars, "(.car\\s{0,5}\\(([^]]+)\\))")

This gave me the following output:

[[1]]
[1] " car(52;model-14557) had a good engine(workable condition)"

[[2]]
[1] " car(21, model-155) looked in good condition but car ( 36, model-8878)"

Is there a way to extract the word car together with its associated number and model number?

Answers (1)

Wiktor Stribiżew

Your regex does not work because [^]]+ matches one or more characters other than ]. That class also accepts ( and ), and the + quantifier is greedy, so the match runs from the first ( up to the last ) that has no ] in between.
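To see the problem in isolation, here is a minimal base R sketch using the first string from the question. The variable names x, accepts_parens and overrun are only illustrative, and the pattern is a trimmed version of the original attempt (outer group and leading dot dropped).

```r
# First string from the question.
x <- "Only one car(52;model-14557) had a good engine(workable condition), others engine were damaged beyond repair"

# [^]] excludes only "]", so parentheses are accepted by the class.
accepts_parens <- grepl("^[^]]+$", "(workable condition)")

# The greedy + lets the match run on to the last ")" in the string.
overrun <- regmatches(x, regexpr("car\\s{0,5}\\([^]]+\\)", x))

print(accepts_parens)
print(overrun)
```

The second value should be the same over-long match shown in the question, without the leading space.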

Use a negated character class that excludes both parentheses instead:

> cars <- c("Only one car(52;model-14557) had a good engine(workable condition), others engine were damaged beyond repair","Other car(21, model-155) looked in good condition but car ( 36, model-8878) looked to be in terrible condition.")
> library(stringr)
> str_extract_all(cars, "\\bcar\\s*\\([^()]+\\)")
[[1]]
[1] "car(52;model-14557)"

[[2]]
[1] "car(21, model-155)"    "car ( 36, model-8878)"

The regex is \bcar\s*\([^()]+\).

It matches:

  • \b - a word boundary
  • car - the literal character sequence car
  • \s* - zero or more whitespace characters
  • \( - a literal (
  • [^()]+ - 1 or more chars other than ( and )
  • \) - a literal ).
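The effect of each component can be checked with a few made-up inputs. The strings in tests below are hypothetical and do not come from the question; this is only a quick sketch in base R of what the pattern accepts and rejects.

```r
# Hypothetical inputs (not from the question), one per behaviour to check.
tests <- c(
  "car(1, model-2)",     # plain case
  "car   (1, model-2)",  # \s* allows whitespace before "("
  "oscar(1, model-2)",   # \b rejects "car" inside a longer word
  "car()",               # [^()]+ needs at least one character
  "car(a(b))"            # [^()]+ cannot cross a nested "("
)

pattern <- "\\bcar\\s*\\([^()]+\\)"
matched <- grepl(pattern, tests)

print(setNames(matched, tests))
```

Only the first two inputs should match; the other three are rejected by \b and by [^()]+ respectively.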

Note that the same regex yields the same results with the following base R code:

> regmatches(cars, gregexpr("\\bcar\\s*\\([^()]+\\)", cars))
[[1]]
[1] "car(52;model-14557)"

[[2]]
[1] "car(21, model-155)"    "car ( 36, model-8878)"
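If the number and the model are needed as separate values, capture groups are one possible extension. The sketch below is not part of the original answer and rests on an assumption: every entry looks like a run of digits, then ; or , and then model- followed by digits. The names pattern, hits, parts and result are illustrative.

```r
cars <- c(
  "Only one car(52;model-14557) had a good engine(workable condition), others engine were damaged beyond repair",
  "Other car(21, model-155) looked in good condition but car ( 36, model-8878) looked to be in terrible condition."
)

# Assumed layout: car ( <digits> ; or , model-<digits> )
pattern <- "\\bcar\\s*\\(\\s*([0-9]+)\\s*[;,]\\s*(model-[0-9]+)\\s*\\)"

# Step 1: collect every full match across all strings.
hits <- unlist(regmatches(cars, gregexpr(pattern, cars)))

# Step 2: re-run the pattern on each match to pull out the two groups.
parts <- regmatches(hits, regexec(pattern, hits))

# One row per match: full match, number, model.
result <- do.call(rbind, parts)
colnames(result) <- c("match", "number", "model")

print(result)
```

Entries that do not follow the assumed layout are simply skipped by this pattern, so it is stricter than the [^()]+ version above.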
