JasonJensenDev

Reputation: 2407

Regex - Find the Shortest Match Possible

The Problem

Given the following:

\plain\f2 This is the first part of the note. This is the second part of the note. This is the \plain\f2\fs24\cf6{\txfielddef{\*\txfieldstart\txfieldtype1\txfieldflags144\txfielddataval44334\txfielddata 35003800380039000000}{\*\txfielddatadef\txfielddatatype1\txfielddata 340034003300330034000000}{\*\txfieldtext 20{\*\txfieldend}}{\field{\*\fldinst{ HYPERLINK "44334" }}{\fldrslt{20}}}}\plain\f2\fs24 part of the note.

I'd like to produce this:

\plain\f2 This is the first part of the note. This is the second part of the note. This is the third part of the note.

What I've Tried

The example input/output is a very simplified version of the data I need to parse and it would be nice to have a way to parse the data programmatically. I have a PHP application and I've been trying to use regex to match the segments that are important and then filter out the parts of the string that aren't required. Here's what I've come up with so far:

/\\plain.*?\\field{\\\*\\fldinst{ HYPERLINK "(.*?)" }}{\\fldrslt{(.*?)}}}}\\plain.*? /gm

regex101: https://regex101.com/r/ILLZU6/2

It almost matches what I want, but it grabs the longest possible match instead of the shortest. I want it to match only one \\plain before the \\field{.... Maybe after the \\plain, I could match anything except for a space? How would I go about doing that?

I'm no regex expert, but my use-case really calls for it. (Otherwise, I'd just write code to handle everything.) Any help would be much appreciated!

Upvotes: 0

Views: 183

Answers (1)

Skylar

Reputation: 934

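The lazy .*? isn't what makes the match too long. A regex match always begins at the leftmost position where a match can succeed, so yours starts at the first \\plain, and laziness only shortens the match from the right. To keep it from running across a second \\plain, replace .*? with (?:(?!\\plain).)*, which consumes one character at a time as long as \\plain does not start at that position. In other words, it matches any run of text that does not contain \\plain. Here's the regex implementing this: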

/\\plain(?:(?!\\plain).)*\\field{\\\*\\fldinst{ HYPERLINK "(.*?)" }}{\\fldrslt{(.*?)}}}}\\plain.*? /gm

regex101: https://regex101.com/r/ILLZU6/5
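
Since you mentioned PHP, here is a minimal sketch comparing the two patterns with preg_match(). The sample string is a shortened, made-up stand-in for your data, and the variable names are only for illustration. Note that PHP has no g modifier (preg_match_all() and preg_replace() already handle every match), so the flags are dropped here. Nowdoc strings are used so the backslashes don't need a second layer of escaping.

```php
<?php
// Shortened, hypothetical stand-in for the RTF in the question.
$rtf = <<<'RTF'
\plain\f2 First part. This is the \plain\f2\fs24\cf6{\field{\*\fldinst{ HYPERLINK "44334" }}{\fldrslt{20}}}}\plain\f2\fs24 part of the note.
RTF;

// Nowdoc keeps every backslash literal, so the patterns can be pasted as-is.
$lazy = <<<'RE'
/\\plain.*?\\field{\\\*\\fldinst{ HYPERLINK "(.*?)" }}{\\fldrslt{(.*?)}}}}\\plain.*? /
RE;
$tempered = <<<'RE'
/\\plain(?:(?!\\plain).)*\\field{\\\*\\fldinst{ HYPERLINK "(.*?)" }}{\\fldrslt{(.*?)}}}}\\plain.*? /
RE;

preg_match($lazy, $rtf, $a);
preg_match($tempered, $rtf, $b);

echo $a[0], PHP_EOL;             // match begins at the first \plain
echo $b[0], PHP_EOL;             // match begins at the \plain right before the field
echo $b[1], ' ', $b[2], PHP_EOL; // 44334 20
```

The capture groups are the same in both cases. Only the starting point of the overall match changes.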


Also, you can replace the space at the end with (?: |$) if you want the end of a line (with the m flag) or the end of the text to terminate the match as well as a space:

/\\plain(?:(?!\\plain).)*\\field{\\\*\\fldinst{ HYPERLINK "(.*?)" }}{\\fldrslt{(.*?)}}}}\\plain.*?(?: |$)/gm

regex101: https://regex101.com/r/ILLZU6/4
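
If the goal is to strip the field markup and keep only its visible text, something like the following might work. This is a rough sketch, not a tested solution for your real data: flattenHyperlinkFields is a made-up name, the sample string is a shortened stand-in, and I'm assuming the \fldrslt text (the second capture group) is what should remain. The only change to the pattern is that the trailing (?: |$) is made capturing so the terminator can be put back.

```php
<?php
// Hypothetical helper: replaces each matched HYPERLINK field with its display text.
function flattenHyperlinkFields(string $rtf): string
{
    // Same pattern as above, except the final group captures the terminator.
    $pattern = <<<'RE'
/\\plain(?:(?!\\plain).)*\\field{\\\*\\fldinst{ HYPERLINK "(.*?)" }}{\\fldrslt{(.*?)}}}}\\plain.*?( |$)/m
RE;

    // $2 is the \fldrslt text, $3 is the space (or nothing at end of line).
    // preg_replace() replaces every match and returns null on a PCRE error,
    // in which case the input is returned unchanged.
    return preg_replace($pattern, '$2$3', $rtf) ?? $rtf;
}

// Shortened, hypothetical stand-in for the RTF in the question.
$rtf = <<<'RTF'
\plain\f2 This is the \plain\f2\fs24\cf6{\field{\*\fldinst{ HYPERLINK "44334" }}{\fldrslt{20}}}}\plain\f2\fs24 part of the note.
RTF;

echo flattenHyperlinkFields($rtf), PHP_EOL;
// prints: \plain\f2 This is the 20 part of the note.
```

The leading \plain\f2 stays in place because the pattern only matches a \plain that is followed by a field before the next \plain.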

Upvotes: 1
