user7206349

Is it possible to find sequences of words using regex?

Is it possible to find sequences where the ending letter of one word is the same as the beginning letter of the next word, the ending letter of that word is the same as the beginning letter of the word after it, and so on?

For example:

"elementum magna sodales" should match "elementum magna", while something like "Proin nunc curna, aliquet nec" should return "Proin nunc curna, aliquet". But "an earring" should return nothing, because n is not the same as e.

I've tried something like \w*(\w)[\s:;'",.?!]*\1\w* but that only matches two words. I need the words to daisy-chain together.

Upvotes: 1

Views: 67

Answers (2)

Casimir et Hippolyte

Reputation: 89557

You can do it with this pattern:

(?i)\b(?:[a-z]*([a-z])[^a-z]+(?=\1))+[a-z]*

Details:

(?i) # makes the pattern case-insensitive
\b   # word boundary
(?:  # non-capturing group: one word plus the non-letter characters that follow it
    [a-z]*([a-z]) # a word, capturing its last letter
    [^a-z]+ # one or more non-letter characters (spaces, punctuation)
    (?=\1) # lookahead that checks the first letter of the next word
)+ # repeat for as long as the chain continues
[a-z]* # the final word of the chain
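
As a quick check, the pattern can be run against the three examples from the question. This is only a minimal sketch in Python using the standard re module; the helper name first_chain is made up for illustration, and other regex engines may behave slightly differently.

```python
import re

# The pattern from this answer, unchanged.
CHAIN = re.compile(r"(?i)\b(?:[a-z]*([a-z])[^a-z]+(?=\1))+[a-z]*")

def first_chain(text):
    """Return the first daisy-chained run of words in text, or None."""
    m = CHAIN.search(text)
    return m.group(0) if m else None

print(first_chain("elementum magna sodales"))        # elementum magna
print(first_chain("Proin nunc curna, aliquet nec"))  # Proin nunc curna, aliquet
print(first_chain("an earring"))                     # None
```

Because the lookahead does not consume anything, each word's first letter is checked and then the word is matched again on the next pass through the group, which is what lets the chain run past two words.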


Upvotes: 4

Dario

Reputation: 4035

Yes, it is theoretically possible if your regex engine supports recursive references.

This problem is similar to checking whether a string is a palindrome (question: How to check that a string is a palindrome using regular expressions?).

Upvotes: 0
