Meow

Reputation: 89

How can I capture the desired group using REGEX

How can I split this string so that I capture only Chocolate cake & nuts?

Input string

pizza & coke > sweets > Chocolate cake & nuts >

I am using this regex:

.*[\>]\s(.*)

However, it captures "Chocolate cake & nuts >". How can I remove the trailing > and the space before it?

Desired result: lastone=Chocolate cake & nuts

Upvotes: 1

Views: 58

Answers (2)

Gene

Reputation: 47020

Avoiding capture of space around the final phrase is a bit tricky. In Java,

.*>\s*(\S+(?:\s+[^>\s]+)*)\s*>.*

captures everything between the final two >'s, except leading and trailing whitespace. Note that you only get the last segment between >'s because the leading .* is "greedy": it matches the longest possible string that still allows the rest of the regex to match.
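For reference, here is a minimal sketch of how that pattern might be used from Java. The class name LastSegment and the helper lastSegment are made up for illustration; only Pattern and Matcher come from the standard library. Note that every backslash has to be doubled inside a Java string literal.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LastSegment {
    // The pattern from the answer; backslashes are doubled for the Java string literal.
    private static final Pattern LAST =
        Pattern.compile(".*>\\s*(\\S+(?:\\s+[^>\\s]+)*)\\s*>.*");

    // Hypothetical helper: returns the trimmed text between the final two '>'
    // characters, or null if the input does not have that shape.
    static String lastSegment(String input) {
        Matcher m = LAST.matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "pizza & coke > sweets > Chocolate cake & nuts >";
        System.out.println("lastone=" + lastSegment(input));
        // prints: lastone=Chocolate cake & nuts
    }
}
```

matches() is used here because the pattern is written to cover the whole input (it starts and ends with .*).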

Note that when you ask about a regex, you need to specify which regex engine you're using.

Edit: How it works

.*> matches anything followed by >. Then \s* matches 0 or more whitespace chars, and capturing starts. The \S+ matches one or more non-whitespace characters, and (?:\s+[^>\s]+)* matches 0 or more repeats of whitespace followed by characters that are anything except > and whitespace (this is the tricky part), whereupon capturing stops. The (?: ) form of parentheses is non-capturing. It only groups what's inside so that * can match 0 or more of whatever that is. Finally, \s*>.* matches a final > preceded by optional whitespace and followed by anything.
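The same breakdown can be written into the pattern itself. The sketch below assumes Java's Pattern.COMMENTS flag, which makes the engine ignore whitespace and # comments inside the pattern text; the class name AnnotatedPattern and the helper extract are illustrative only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AnnotatedPattern {
    // Same regex as in the answer, split into commented pieces.
    // With Pattern.COMMENTS, whitespace and '#' comments in the pattern are ignored.
    private static final Pattern ANNOTATED = Pattern.compile(
        ".*>                 # anything, then a '>' (greedy, so as far right as possible)\n"
      + "\\s*                # optional whitespace before the phrase\n"
      + "(                   # capture group 1 starts\n"
      + "  \\S+              # first run of non-whitespace characters\n"
      + "  (?:\\s+[^>\\s]+)* # more runs: whitespace, then chars that are neither '>' nor whitespace\n"
      + ")                   # capture group 1 ends\n"
      + "\\s*>               # optional whitespace, then the final '>'\n"
      + ".*                  # anything after it\n",
        Pattern.COMMENTS);

    // Hypothetical helper: group 1 on a full match, otherwise null.
    static String extract(String input) {
        Matcher m = ANNOTATED.matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extract("pizza & coke > sweets > Chocolate cake & nuts >"));
        // prints: Chocolate cake & nuts
    }
}
```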

Upvotes: 2

luoluo

Reputation: 5533

Try moving the > out of the (): .*[\>]\s(.*?)\s*>

Or the more precise version, which requires the phrase to contain an &: [>\s]+(\w+[\w ]*&[ \w]*\w+)[> ]+
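As a rough check, both patterns can be tried from Java, assuming that engine (the question does not say which one is in use). The class name LazyVariants and the helper firstGroup are invented for this sketch. It uses find() rather than matches(), since the second pattern is not meant to cover the whole string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LazyVariants {
    // First suggestion: lazy group, with the closing '>' moved outside the group.
    static final Pattern LAZY = Pattern.compile(".*[\\>]\\s(.*?)\\s*>");

    // Second suggestion: stricter, only matches a phrase that contains '&'.
    static final Pattern STRICT =
        Pattern.compile("[>\\s]+(\\w+[\\w ]*&[ \\w]*\\w+)[> ]+");

    // Hypothetical helper: group 1 of the first match found, or null if none.
    static String firstGroup(Pattern p, String input) {
        Matcher m = p.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "pizza & coke > sweets > Chocolate cake & nuts >";
        System.out.println(firstGroup(LAZY, input));   // prints: Chocolate cake & nuts
        System.out.println(firstGroup(STRICT, input)); // prints: Chocolate cake & nuts
    }
}
```

The stricter pattern returns no match for a final segment without an &, so it only fits inputs shaped like the one in the question.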


Upvotes: 2
