Reputation: 790
I am trying to get the part of each text line that comes after the colon. For example, from this text:
previous usc contact name:*assistant director of field education*
agency name:*development corporation
I want to get the following:
assistant director of field education
1010 development corporation
I tried the following regex
.*:\*?(.*)\**$
It did not work. What is working right now is this:
.*:\*?(.*)\*
I do not understand why it works on the second line, which has no trailing asterisk even though the regex requires one. I also do not understand why the first regex does not work properly.
Thanks.
Upvotes: 5
Views: 25955
Reputation: 20889
In a nutshell:
The second regex .*:\*?(.*)\* works because .* matches previous usc contact name and agency name, followed by :\*? (the escaped * means: match a literal *, and the ? makes it optional). Finally, (.*)\* matches EVERYTHING up to the LAST *.
(Assuming you missed the star in the last line, this matches:)
assistant director of field education
and development corporation
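The behaviour described above can be checked with a short sketch. Python's re module is an assumption here (the question does not name a language), and the variable names are made up for illustration. It also suggests an answer to the question about the line without a trailing asterisk: the pattern still matches there, but only because the final \* consumes the asterisk right after the colon, which leaves the group empty.

```python
import re

# Second regex from the question: the final \* must match a literal asterisk.
pattern = re.compile(r".*:\*?(.*)\*")

with_star = "previous usc contact name:*assistant director of field education*"
no_star = "agency name:*development corporation"  # no trailing asterisk

# (.*) is greedy, then backtracks one character so \* can match the last '*'.
captured_with_star = pattern.search(with_star).group(1)

# Without a trailing '*', the only asterisk left is the one after the colon,
# so \*? matches nothing, (.*) matches nothing, and \* takes that asterisk.
match_no_star = pattern.search(no_star)
captured_no_star = match_no_star.group(1)
whole_no_star = match_no_star.group(0)

print(repr(captured_with_star))
print(repr(captured_no_star))
print(repr(whole_no_star))
```

So a match on the star-less line does not necessarily mean the group holds the text you wanted; checking group 1 rather than just "did it match" makes the difference visible.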
Why the first regex fails is hard to tell from the example given. In .*:\*?(.*)\**$ the trailing \**$ means that the END OF THE LINE must be preceded by zero or more * characters.
Assuming your line breaks are as provided, it will only match development corporation, because the anchor $ (line end) normally behaves as in single-line mode, where it means "end of string". Therefore the regex is only able to match ONCE. If you change the modifier to multiline mode (meaning $ matches at the end of every line rather than just at the END OF STRING), the regex matches on every line.
Single-line mode, .*:\*?(.*)\**$ matching:
development corporation
Multiline mode, .*:\*?(.*)\**$ matching:
assistant director of field education* (the greedy (.*) also captures the trailing *)
and development corporation
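To compare the two modes on the example from the question, here is a minimal sketch in Python (an assumption; the question does not say which engine is used). In Python the default already treats $ as "end of string", and re.MULTILINE switches to per-line anchors. The lazy variant at the end is not from the original answer; it is one possible way to keep the trailing * out of the group.

```python
import re

text = (
    "previous usc contact name:*assistant director of field education*\n"
    "agency name:*development corporation"
)

first_regex = r".*:\*?(.*)\**$"

# Default: $ only matches at the end of the string, so only the last line matches.
default_groups = re.findall(first_regex, text)

# Multiline: $ matches at the end of every line, so each line matches.
# The greedy (.*) still swallows the trailing '*' on the first line.
multiline_groups = re.findall(first_regex, text, re.MULTILINE)

# Hypothetical variant: a lazy (.*?) leaves trailing asterisks to \**.
lazy_regex = r".*:\*?(.*?)\**$"
lazy_groups = re.findall(lazy_regex, text, re.MULTILINE)

print(default_groups)
print(multiline_groups)
print(lazy_groups)
```

The greedy group keeping the trailing * may well be what made the first regex look broken, since \** is allowed to match zero asterisks and therefore never forces (.*) to give one back.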
The behaviour of ^ and $ depends on the modifier. Given the string
Hello
World
using ^(.*)$ in single-line mode (where . also matches line breaks) will match Hello World as a whole. Using the same pattern in multiline mode will match Hello and World as two separate matches.
In single-line mode, the string is handled by the regex engine like
^Hello
World$
In multiline mode, the engine treats it like
^Hello$
^World$
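The Hello / World comparison can be reproduced with a small sketch, assuming Python's re module, where "single-line mode" corresponds to re.DOTALL and "multiline mode" to re.MULTILINE. Flag names differ between engines, so treat the mapping as an illustration rather than a universal rule.

```python
import re

text = "Hello\nWorld"
pattern = r"^(.*)$"

# re.DOTALL: '.' also matches the line break, ^ and $ anchor the whole string.
single_line = re.findall(pattern, text, re.DOTALL)

# re.MULTILINE: ^ and $ anchor every line, '.' stops at the line break.
multi_line = re.findall(pattern, text, re.MULTILINE)

# No flags: '.' cannot cross the line break and $ only fits at the very end,
# so in Python the pattern finds nothing here.
no_flags = re.findall(pattern, text)

print(single_line)
print(multi_line)
print(no_flags)
```

The no-flags case shows that the two modes are independent switches: one controls what . matches, the other controls where ^ and $ may anchor.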
Upvotes: 6