Reputation: 163
Is there a regex that would separate the title and the address in the text below, producing the output shown?
This is what I have so far:
.+?(?=\d+.*Singapore \d{6}\b)
Text:
Marina Bay Sands Relocated! 2 Bayfront Avenue Galleria Level #B1-01 Singapore 018972
+65 6634 9969
nex 23 Serangoon Central #B1-10 Singapore 556083
+65 6634 7787
Northpoint City 1 Northpoint Drive South Wing #B1-107 Singapore 768019
+65 6481 3433
Output:
Marina Bay Sands Relocated!
2 Bayfront Avenue Galleria Level #B1-01 Singapore 018972
nex
23 Serangoon Central #B1-10 Singapore 556083
Northpoint City
1 Northpoint Drive South Wing #B1-107 Singapore 768019
+65 6481 3433
Upvotes: 2
Views: 54
Reputation: 626747
You may use
(.+?)\s*(\d+.*Singapore \d{6})\b(?:\r?\n(\+65\s*\d{4}\s*\d{4}))?
Or just
(.+?)\s*(\d+.*Singapore \d{6})\b(?:\r?\n(\+65[\d ]*))?
See the regex demo.
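The two patterns differ only in how strictly the optional phone line is matched. The sketch below illustrates that difference in Python, assuming the `re` module. The truncated phone number in `sample` is a made-up input for illustration, and `phone_of` is a hypothetical helper; neither comes from the question.

```python
import re

# First pattern: the phone must be +65 followed by two groups of four digits.
STRICT = re.compile(r"(.+?)\s*(\d+.*Singapore \d{6})\b(?:\r?\n(\+65\s*\d{4}\s*\d{4}))?")
# Second pattern: the phone is +65 followed by any run of digits and spaces.
LOOSE = re.compile(r"(.+?)\s*(\d+.*Singapore \d{6})\b(?:\r?\n(\+65[\d ]*))?")

# Hypothetical input with an incomplete phone number (not from the question).
sample = "nex 23 Serangoon Central #B1-10 Singapore 556083\n+65 6634 77"


def phone_of(pattern, text):
    """Return Group 3 (the phone) of the first match, or None if it did not participate."""
    match = pattern.search(text)
    return match.group(3) if match else None


print(phone_of(STRICT, sample))  # the strict phone group does not match here
print(phone_of(LOOSE, sample))   # the loose phone group accepts the short number
```

With well-formed numbers such as those in the question, both patterns should capture the same phone text, so the choice mainly matters when the input is less regular.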
Details
(.+?) - Group 1: any 1 or more chars other than line break chars, as few as possible
\s* - 0+ whitespaces
(\d+.*Singapore \d{6}) - Group 2: 1+ digits, any 0+ chars other than line break chars, as many as possible, Singapore and then six digits
\b - word boundary
(?:\r?\n(\+65\s*\d{4}\s*\d{4}))? - an optional sequence of
  \r?\n - CRLF or LF line ending
  (\+65\s*\d{4}\s*\d{4}) - Group 3: +65, 0+ whitespaces, 4 digits, 0+ whitespaces, 4 digits.
The [\d ]* will match 0 or more digits or spaces.
Three group contents per match:
Marina Bay Sands Relocated!
2 Bayfront Avenue Galleria Level #B1-01 Singapore 018972
+65 6634 9969
nex
23 Serangoon Central #B1-10 Singapore 556083
+65 6634 7787
Northpoint City
1 Northpoint Drive South Wing #B1-107 Singapore 768019
+65 6481 3433
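As a minimal sketch of how the first pattern could be applied, here is a Python version using `re.finditer` on the sample text from the question. The names `TEXT`, `PATTERN` and `extract_outlets` are illustrative, not part of the original answer.

```python
import re

# Sample text copied from the question.
TEXT = (
    "Marina Bay Sands Relocated! 2 Bayfront Avenue Galleria Level #B1-01 Singapore 018972\n"
    "+65 6634 9969\n"
    "nex 23 Serangoon Central #B1-10 Singapore 556083\n"
    "+65 6634 7787\n"
    "Northpoint City 1 Northpoint Drive South Wing #B1-107 Singapore 768019\n"
    "+65 6481 3433\n"
)

# Group 1: title, Group 2: address, Group 3: optional phone on the next line.
PATTERN = re.compile(
    r"(.+?)\s*(\d+.*Singapore \d{6})\b(?:\r?\n(\+65\s*\d{4}\s*\d{4}))?"
)


def extract_outlets(text):
    """Return a list of (title, address, phone) tuples; phone is None if absent."""
    return [m.groups() for m in PATTERN.finditer(text)]


for title, address, phone in extract_outlets(TEXT):
    print(title)
    print(address)
    print(phone)
```

If only the title and address are needed, the third element of each tuple can simply be ignored.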
Upvotes: 1