user4542931

Identify new line in regex

I would like to run a regex over some text from Macbeth.

My text is as follows:

Scena Secunda.

Alarum within. Enter King Malcome, Donalbaine, Lenox, with
attendants,
meeting a bleeding Captaine.

  King. What bloody man is that? he can report,
As seemeth by his plight, of the Reuolt
The newest state

My intention is to get the text from Enter up to the next full stop.

I am trying this regular expression Enter(.?)*\.

But it is showing no matches. Can anybody fix my regexp?

I am trying it out in this link

Upvotes: 1

Views: 70

Answers (1)

Wiktor Stribiżew

Reputation: 626709

Since @Tushar has not explained the issue you had with your regex, I decided to explain it.

Your regex - Enter(.?)*\. - matches the literal word Enter, then any character except a newline, zero or more times and as many as possible, and finally a period. Because . stops at line breaks, the match can only end at the last period on the same line as Enter.

The problem is that your string contains newlines between Enter and the period, so you need a pattern that matches newlines, too. To force . to match newline characters, you may use DOTALL mode. However, that alone won't get you the expected result: the * quantifier is greedy, so the match will run to the last period in the whole text rather than the closest one.
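To make those three behaviours concrete, here is a minimal sketch. It uses Enter.*\. as a simplified stand-in for Enter(.?)*\. (an assumption on my part; it should behave the same way for this input), and the lazy .*? variant is shown only for comparison, not as the recommended fix.

```python
import re

text = (
    "Scena Secunda.\n\n"
    "Alarum within. Enter King Malcome, Donalbaine, Lenox, with\n"
    "attendants,\n"
    "meeting a bleeding Captaine.\n\n"
    "  King. What bloody man is that? he can report,\n"
    "As seemeth by his plight, of the Reuolt\n"
    "The newest state"
)

# Without DOTALL, "." cannot cross the line break after "with",
# and there is no period on that line, so nothing matches.
no_flag = re.search(r'Enter.*\.', text)

# With DOTALL, the greedy ".*" runs to the LAST period in the text.
greedy = re.search(r'Enter.*\.', text, re.DOTALL)

# With DOTALL and a lazy ".*?", the match stops at the FIRST period.
lazy = re.search(r'Enter.*?\.', text, re.DOTALL)

print(no_flag)
print(repr(greedy.group()))
print(repr(lazy.group()))
```

The greedy match overshoots into the next speech and ends at "King.", while the lazy one stops at "Captaine.".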

So, to get the substring from Enter up to (but not including) the closest period, you can use

Enter([^.]*)

See this regex demo. If you need no capture group, remove it.
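Note that [^.]* stops just before the period and also matches when no period follows at all. If you want the period included and required, one possible variant (my own suggestion, on a made-up sample string) is to append \. to the pattern:

```python
import re

# Hypothetical sample: the second "Enter" is never followed by a period.
sample = "Enter King Malcome,\nattendants.\nEnter Macbeth"

# Stops before the period; still matches when there is no period.
without_dot = re.findall(r'Enter[^.]*', sample)

# Requires the closing period and includes it in the match.
with_dot = re.findall(r'Enter[^.]*\.', sample)

print(without_dot)
print(with_dot)
```

The negated character class [^.] matches newlines as well, so no DOTALL flag is needed in either variant.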

And an IDEONE demo:

import re
p = re.compile(r'Enter([^.]*)')
test_str = "Scena Secunda.\n\nAlarum within. Enter King Malcome, Donalbaine, Lenox, with\nattendants,\nmeeting a bleeding Captaine.\n\n  King. What bloody man is that? he can report,\nAs seemeth by his plight, of the Reuolt\nThe newest state"
print(p.findall(test_str)) # if you need the capture group text, or
# print(p.search(test_str).group()) # to get the whole first match, or
# print(re.findall(r'Enter[^.]*', test_str)) # to return all substrings from Enter till the next period

Upvotes: 1
