Reputation: 11
I need to extract the paragraphs that contain a particular string from a group of paragraphs that all share the same start and end lines.
For example, in the text below, the first line of every paragraph starts with "Thread" and the last line starts with "Breadcrumb". I want to extract only those paragraphs that contain "string_to_be_searched".
Thread 1398 (Thread name)
data...
Breadcrumb: some alpha numeric data
Thread 1398 (Thread name)
data...
string_to_be_searched
Breadcrumb: some alpha numeric data
Thread 1398 (Thread name)
data...
Breadcrumb: some alpha numeric data
Thread 1398 (Thread name)
data...
string_to_be_searched
Breadcrumb: some alpha numeric data
Thread 1398 (Thread name)
data...
Breadcrumb: some alpha numeric data
I tried a regular expression, but without the g flag it gives me the first two threads, and with the g flag it gives me the first 4 threads. It should give me only the 2nd and 4th threads.
var re = /(Thread[\s\S]*?sys_mlock[\s\S]*?Bread.*)/m;
Problem Demo: https://regex101.com/r/nR3qG9/2
Upvotes: 1
Views: 111
Reputation: 785481
You can use this lookahead-based regex:
/(\bThread ((?!\bBread)[\s\S])*string_to_be_searched((?!\bBread)[\s\S])*Bread.*)/g
((?!\bBread)[\s\S])*
is the key here: it matches 0 or more characters (including newlines), checking before each one that the end-of-block marker Bread does not start at that position, so a match can never run past the end of the current paragraph.
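As a rough illustration of how this pattern might be applied, here is a minimal sketch run against the sample text from the question. The helper name extractParagraphs and the escaping step are assumptions added for the example, and the groups are written as non-capturing (?:...) purely for tidiness; the matching logic is otherwise the same as the regex above.

```javascript
// Sample input modelled on the five paragraphs in the question.
const text = [
  "Thread 1398 (Thread name)",
  "data...",
  "Breadcrumb: some alpha numeric data",
  "Thread 1398 (Thread name)",
  "data...",
  "string_to_be_searched",
  "Breadcrumb: some alpha numeric data",
  "Thread 1398 (Thread name)",
  "data...",
  "Breadcrumb: some alpha numeric data",
  "Thread 1398 (Thread name)",
  "data...",
  "string_to_be_searched",
  "Breadcrumb: some alpha numeric data",
  "Thread 1398 (Thread name)",
  "data...",
  "Breadcrumb: some alpha numeric data",
].join("\n");

// Hypothetical helper (not from the original answer): returns every
// Thread...Breadcrumb block that contains `needle`.
function extractParagraphs(input, needle) {
  // Escape regex metacharacters so the needle is matched literally.
  const escaped = needle.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // (?:(?!\bBread)[\s\S])* consumes characters one at a time and stops
  // as soon as "Bread" begins, so a match cannot cross a block boundary.
  const re = new RegExp(
    "\\bThread (?:(?!\\bBread)[\\s\\S])*" +
      escaped +
      "(?:(?!\\bBread)[\\s\\S])*Bread.*",
    "g"
  );
  // String.prototype.match returns null when nothing matches.
  return input.match(re) || [];
}

const hits = extractParagraphs(text, "string_to_be_searched");
console.log(hits.length); // 2
console.log(hits.join("\n---\n"));
```

With the g flag, match returns only the full matched blocks, so the two paragraphs containing the search string come back as separate array entries. This assumes the word "Bread" never appears at a word boundary inside the data lines themselves; if it can, a stricter end marker such as "Breadcrumb:" at the start of a line would be safer.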
Upvotes: 1