Nirthakali

Reputation: 45

How do you apply a regex to a line in which the same word appears multiple times with distinct meanings?

I have a sentence: "My dad, granddad and great great granddad looks alike." How do I create a regex that extracts the values dad, granddad, and great great granddad using grep?

I tried using str_extract_all(pattern = "(great)?\s(grand)?(father|mother)", sentence) but with little success.

Upvotes: 1

Views: 29

Answers (1)

Tim Biegeleisen

Reputation: 521249

The following regex should work:

\b(?:(?:great )*granddad|dad)\b

R code:

library(stringr)  # provides str_extract_all()
sentence <- "My dad, granddad and great great granddad looks alike."
str_extract_all(pattern = "\\b(?:(?:great )*granddad|dad)\\b", sentence)[[1]]

[1] "dad"                  "granddad"             "great great granddad"

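The trick here is to use an alternation, as you were already doing, but to place the more specific term first. Because * is greedy, (?:great )*granddad consumes as many leading "great " prefixes as are present: it matches great great granddad in full, would match great granddad (which does not actually occur in your sentence), and matches plain granddad when no prefix is present. The \b word boundaries keep the dad alternative from matching inside a longer word.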
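If you would rather not depend on stringr, the same pattern can probably be used from base R. The following is only a sketch: the helper name extract_dad_terms is made up for illustration, and it assumes perl = TRUE so that the non-capturing groups (?:...) are interpreted by the PCRE engine.

```r
# Hypothetical helper (name chosen for illustration): returns every
# "dad"-style term found in a single string, using base R only.
extract_dad_terms <- function(text) {
  # Same pattern as above; the more specific alternative comes first.
  pattern <- "\\b(?:(?:great )*granddad|dad)\\b"
  # gregexpr() locates all matches; regmatches() extracts the matched text.
  # perl = TRUE is needed for the (?:...) non-capturing groups.
  regmatches(text, gregexpr(pattern, text, perl = TRUE))[[1]]
}

sentence <- "My dad, granddad and great great granddad looks alike."
print(extract_dad_terms(sentence))
# [1] "dad"                  "granddad"             "great great granddad"

# A sentence that does contain a single "great" prefix:
print(extract_dad_terms("Her great granddad met my dad."))
# [1] "great granddad" "dad"
```

When nothing matches, regmatches() returns an empty character vector, so the result can be checked with length() before further processing.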

Upvotes: 2
