Reputation: 562
I am trying to capture a multiline string which starts at a specific word, Case,
and ends with a date in the format dd.mm.yyyy
or dd.m.yyyy.
Here is the sample text:
Case No.X.1 I know I am a lucky boy (1) a sda asddasd (ii) sa asdas asd aklk Railway, Airplane asd - (one two three four). Closing date ......... 29.8.1818
Case No.X.1 I know I am a lucky boy (1) a sda asddasd (ii) sa asdas asd aklk Railway, Airplane asd - (one two three four). Closing date ......... 29.8.1818
Case No.X.1 I know I am a lucky boy (1) a sda asddasd (ii) sa asdas asd aklk Railway, Airplane asd - (one two three four). Closing date ......... 29.8.1818
Case No.X.1 I know I am a lucky boy (1) a sda asddasd (ii) sa asdas asd aklk Railway, Airplane asd - (one two three four). Closing date ......... 29.8.1818
I am trying this:
Flags: g m i
^case[^]*\d{1,2}\.\d{1,2}\.\d{2,4}
^case[\s\S]*\d{1,2}\.\d{1,2}\.\d{2,4}
((^case)[\s\S]+(\d{1,2}\.\d{1,2}\.\d{2,4}))
Note: the case-insensitive flag is set.
I expect one match per paragraph (from Case to the date).
Instead, each of these expressions captures only a single match, running from the first Case to the last date:
Case No.X.1 I know I am a lucky boy (1) a sda asddasd (ii) sa asdas asd aklk Railway, Airplane asd - (one two three four). Closing date .........29.8.1818
Case No.X.1 I know I am a lucky boy (1) a sda asddasd (ii) sa asdas asd aklk Railway, Airplane asd - (one two three four). Closing date .........29.8.1818
Case No.X.1 I know I am a lucky boy (1) a sda asddasd (ii) sa asdas asd aklk Railway, Airplane asd - (one two three four). Closing date .........29.8.1818
Case No.X.1 I know I am a lucky boy (1) a sda asddasd (ii) sa asdas asd aklk Railway, Airplane asd - (one two three four). Closing date .........29.8.1818
I am now looking into newline handling and lookarounds.
Upvotes: 0
Views: 32
Reputation: 195408
Make the quantifier non-greedy (.*?) and use flags=re.DOTALL|re.M
(regex101):
data = '''Case No.X.1 I know I am a lucky boy (1) a sda asddasd (ii)
sa asdas asd aklk Railway, Airplane asd - (one two three four). Closing date ......... 29.8.1818
Case No.X.2 I know I am a lucky boy (1) a sda asddasd (ii)
sa asdas asd aklk Railway, Airplane asd - (one two three four). Closing date ......... 29.8.1818
Case No.X.3 I know I am a lucky boy (1) a sda asddasd (ii)
sa asdas asd aklk Railway, Airplane asd - (one two three four). Closing date ......... 29.8.1818
Case No.X.4 I know I am a lucky boy (1) a sda asddasd (ii)
sa asdas asd aklk Railway, Airplane asd - (one two three four). Closing date ......... 29.8.1818'''
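import re
# .*? is non-greedy, so each match stops at the first date that ends a line;
# re.DOTALL lets . cross newlines, re.M makes ^ and $ match at line boundaries.
for m in re.findall(r'^Case.*?\d{1,2}\.\d{1,2}\.\d{2,4}$', data, flags=re.DOTALL|re.M):
    print(m)
    print('-' * 160)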
Prints:
Case No.X.1 I know I am a lucky boy (1) a sda asddasd (ii)
sa asdas asd aklk Railway, Airplane asd - (one two three four). Closing date ......... 29.8.1818
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Case No.X.2 I know I am a lucky boy (1) a sda asddasd (ii)
sa asdas asd aklk Railway, Airplane asd - (one two three four). Closing date ......... 29.8.1818
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Case No.X.3 I know I am a lucky boy (1) a sda asddasd (ii)
sa asdas asd aklk Railway, Airplane asd - (one two three four). Closing date ......... 29.8.1818
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Case No.X.4 I know I am a lucky boy (1) a sda asddasd (ii)
sa asdas asd aklk Railway, Airplane asd - (one two three four). Closing date ......... 29.8.1818
----------------------------------------------------------------------------------------------------------------------------------------------------------------
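To see why the patterns in the question return one big match, it may help to compare a greedy and a non-greedy quantifier side by side. The sketch below is only an illustration: the two-record `text` and the `date` helper variable are made up for the example, and `re.I` is added on the assumption that the case-insensitive flag from the question is wanted.

```python
import re

# Two short hypothetical records, each spanning two lines.
text = (
    "Case No.X.1 first record\n"
    "continues here. Closing date ......... 29.8.1818\n"
    "Case No.X.2 second record\n"
    "continues here. Closing date ......... 1.12.1819\n"
)

# Date pattern taken from the question: d.m.yy up to dd.mm.yyyy.
date = r"\d{1,2}\.\d{1,2}\.\d{2,4}"

# Greedy: [\s\S]* consumes the rest of the text, then backtracks to the LAST date.
greedy = re.findall(r"^case[\s\S]*" + date, text, flags=re.M | re.I)

# Non-greedy: [\s\S]*? expands only until the FIRST date that ends a line.
lazy = re.findall(r"^case[\s\S]*?" + date + r"$", text, flags=re.M | re.I)

print(len(greedy))  # 1
print(len(lazy))    # 2
```

With `[\s\S]` in place of `.`, the `re.DOTALL` flag is not needed; either spelling should behave the same here.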
Upvotes: 1