Akash Kinwad

Reputation: 815

Get all the text between quotes that contains one of several words

a:5:{s:12:"SubmissionID";s:0:"";s:13:"NotesClientID";s:4:"1891";s:5:"Field";a:5:
{i:0;s:7:"8/19/19";i:1;s:0:"";i:4;s:0:"";i:5;s:160:"client is dismissed due to no contact with me or with his assigned Recovery
Coach";i:7;s:0:"";}s:10:"SecurityID";s:40:"a31b7ea7f9191465525d0ac9a6358ba7b3d3e0fc";s:13:"action_doSave";s:4:"Save";}

In the above string, the expected output is "client is dismissed due to no contact with me or with his assigned Recovery Coach"

because it contains the word "dismissed".

Similarly, the regex should pick up any quoted string that contains one of the words dismissed/discharged/amenable.

I have tried this at https://rubular.com/r/QMFAHjUgaYdjpM

Upvotes: 0

Views: 62

Answers (1)

Cary Swoveland

Reputation: 110755

You may match the desired text with the following regular expression.

/.*?"(?=(?:[^"]*\bdismissed\b[^"]*"))\K[^"]*(?=")/m

Start your engine!

Detailed information about each element of the regex is given at regex101 (see note 1 below).

If the variable str holds the text given as an example in the question, one may write:

str.scan(/.*?"(?=(?:[^"]*\bdismissed\b[^"]*"))\K[^"]*(?=")/m)
  #=> ["client is dismissed due to no contact with me or with his assigned Recovery\nCoach"]

String#scan returns an array containing each match of the regex. Here there is a single match. Note that the match keeps the line break ("\n") that is present in the original text.

Ruby's regex engine performs the following operations.

/
.*?                : match 0+ characters, lazily
"                  : match a double-quote
(?=                : begin a positive lookahead to assert that
                     'dismissed' appears before the next double-quote
  (?:              : begin a non-capture group
    [^"]*          : match 0+ characters other than double-quotes
    \bdismissed\b  : match 'dismissed'
    [^"]*          : match 0+ characters other than double-quotes
    "              : match a double-quote
  )                : end non-capture group
)                  : end positive lookahead
\K                 : forget all matched so far and reset match
[^"]*              : match 0+ characters other than double-quotes
(?=")              : assert the next character is a double-quote
/m                 : multiline mode, so that . also matches line terminators
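
For comparison, the same result can usually be obtained in two simpler steps: collect every double-quoted segment, then keep the ones that contain the word. This is only a sketch of an alternative, not the regex explained above, and it assumes the quotes in the text are properly paired. The sample string is a shortened, hypothetical version of the one in the question.

```ruby
# Hypothetical sample input, shortened from the question's serialized string.
str = 's:5:"Field";a:2:{i:0;s:7:"8/19/19";i:5;s:19:"client is dismissed";}'

# Step 1: collect the contents of every double-quoted segment.
# With one capture group, scan returns an array of one-element arrays.
quoted = str.scan(/"([^"]*)"/).flatten

# Step 2: keep only the segments that contain the target word.
matches = quoted.grep(/\bdismissed\b/)

p matches
  #=> ["client is dismissed"]
```

Because `[^"]` also matches line terminators, this variant handles quoted text that spans several lines without needing the /m modifier.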

Addendum: I missed the mention in the question that the target word could be any of "dismissed", "discharged" or "amenable". Rather than revising my answer, suffice it to say that where

\bdismissed\b

appears in the regex, it should be replaced with

\b(?:dismissed|discharged|amenable)\b
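
Putting the substitution into the full regex gives something like the following. The variable names and the sample strings are made up for illustration; only the pattern itself comes from the answer.

```ruby
# The regex from the answer with the single word replaced by the alternation.
pattern = /.*?"(?=(?:[^"]*\b(?:dismissed|discharged|amenable)\b[^"]*"))\K[^"]*(?=")/m

# Hypothetical sample strings, loosely modelled on the question's input.
samples = [
  's:4:"Note";s:19:"client is dismissed";',
  's:4:"Note";s:28:"patient was discharged today";',
  's:4:"Note";s:13:"no change yet";'
]

# Each call returns the quoted segments that contain one of the target words.
results = samples.map { |s| s.scan(pattern) }

p results
  #=> [["client is dismissed"], ["patient was discharged today"], []]
```

The third sample contains none of the three words, so scan returns an empty array for it.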

1. Move the cursor around for detailed explanations.

Upvotes: 1
