buzznoisy

Reputation: 11

Extracting paragraphs that contain a particular string when all paragraphs share the same start and end markers

I need to extract the paragraphs that contain a particular string from a group of paragraphs that all start and end the same way.

For example, in the text below, the first line of every paragraph starts with "Thread" and the last line starts with "Breadcrumb". I want to extract only those paragraphs that contain "string_to_be_searched".

 Thread 1398 (Thread name)
    data...
    Breadcrumb: some alpha numeric data

  Thread 1398 (Thread name)
    data...
    string_to_be_searched
    Breadcrumb: some alpha numeric data



  Thread 1398 (Thread name)
    data...
    Breadcrumb: some alpha numeric data

 Thread 1398 (Thread name)
    data...
    string_to_be_searched
    Breadcrumb: some alpha numeric data

 Thread 1398 (Thread name)
    data...
    Breadcrumb: some alpha numeric data

I tried a regular expression (in my real data the string being searched for is sys_mlock). Without the g flag it gives me the first two threads, and with g it gives me the first four threads. It should instead give me only the threads that contain the string, which in the sample above are the 2nd and 4th.

var re = /(Thread[\s\S]*?sys_mlock[\s\S]*?Bread.*)/m; 

Problem Demo: https://regex101.com/r/nR3qG9/2

Upvotes: 1

Views: 111

Answers (1)

anubhava

Reputation: 785481

You can use this lookahead-based regex:

/(\bThread ((?!\bBread)[\s\S])*string_to_be_searched((?!\bBread)[\s\S])*Bread.*)/g

Updated Regex Demo

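((?!\bBread)[\s\S])* is the key here. It matches 0 or more characters (including newlines), and before consuming each character the negative lookahead checks that the text at that position does not begin with Bread (the start of your end-of-block line). As a result the match can never run past the end of one block into the next, so the search string and the closing Bread line must belong to the same Thread block.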
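
As a rough sketch of how the regex above might be applied in JavaScript: the sample text and the helper name extractBlocks are made up for illustration, and the capturing groups are swapped for non-capturing ones, which does not change what match() returns when the g flag is set.

```javascript
// Sample log in the shape described in the question (contents are made up).
const text = [
  "Thread 1 (main)",
  "    data...",
  "    Breadcrumb: a1",
  "",
  "Thread 2 (worker)",
  "    data...",
  "    string_to_be_searched",
  "    Breadcrumb: b2",
  "",
  "Thread 3 (io)",
  "    data...",
  "    Breadcrumb: c3",
  "",
  "Thread 4 (net)",
  "    data...",
  "    string_to_be_searched",
  "    Breadcrumb: d4",
].join("\n");

// Hypothetical helper: builds the lookahead-based regex for any search string.
function extractBlocks(input, needle) {
  // Escape regex metacharacters so the needle is matched literally.
  const escaped = needle.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const re = new RegExp(
    "\\bThread (?:(?!\\bBread)[\\s\\S])*" + escaped +
      "(?:(?!\\bBread)[\\s\\S])*Bread.*",
    "g"
  );
  // With the g flag, match() returns every full match, or null if none.
  return input.match(re) || [];
}

const blocks = extractBlocks(text, "string_to_be_searched");
console.log(blocks.length); // 2
console.log(blocks.join("\n\n"));
```

Only the Thread 2 and Thread 4 blocks are returned, because a match attempt that starts in a block without the search string cannot cross that block's Breadcrumb line to find it further down.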

Upvotes: 1
