Reputation: 29957
I have the following data (a subset of possible log4j responders if someone is interested):
ap://167.172.44.255:1389/LegitimateJavaCla
ap://167.172.44.255:1389/La
ap://167.99.32.139:1389/Basic/ReverseShell/167.99.32.139/99
ldap://x.x.x.x.61k2ev3252274o2ek77941q85t0r9444o.interact.sh/ok6ll9m
ldap://c6ps4rekeidcvgqlsmsgcg37qdoyyknz4.interact.sh/a
ldap://c6ps4rekeidcvgqlsmsgcg37x9ayymcak.interact.sh/a
ldap://c6ps4ipurnhssm2608l0cg37chyyykyhk.interact.sh/a
ldap://c6ps4ipurnhssm2608l0cg37pdyyykbug.interact.sh/a
91fd9fef8958.bingsearchlib.com:39356/
550f7e1deaed.bingsearchlib.com:39356/a
2174d47e8d04.bingsearchlib.com:39356/a
da6d408517b9.bingsearchlib.com:39356/a
5463610592ef.bingsearchlib.com:39356/a
I would like to keep the FQDN only (the host and domain) or the IP - so I tried
(\S*)?(:\/\/)?(?<interesting>.*)(:)?\/
(see https://regex101.com/r/dusRR5/1)
The idea was:
(\S*)? → match or not some letters (ldap, ...)
(:\/\/)? → match or not ://
(?<interesting>.*) → match anything and call it interesting
(:)? → ... but stop at : if there is one
\/ → ... otherwise stop at /
The expected result is
167.172.44.255
167.99.32.139
x.x.x.x.61k2ev3252274o2ek77941q85t0r9444o.interact.sh
c6ps4rekeidcvgqlsmsgcg37qdoyyknz4.interact.sh
c6ps4rekeidcvgqlsmsgcg37x9ayymcak.interact.sh
(...)
But it does not work and my very limited knowledge of regex does not help.
Upvotes: 4
Views: 99
Reputation: 626691
You can use
^(?:[a-zA-Z0-9]+:\/\/)?(?<interesting>[^:\/]+)
See the regex demo. Details:
^ - start of string
(?:[a-zA-Z0-9]+:\/\/)? - an optional occurrence of any one or more letters/digits and then ://
(?<interesting>[^:\/]+) - Group "interesting": any one or more chars other than : and /.
Remember that you do not have to escape / if you define your regex with a string literal (as in Python, or C#, or using constructor notations in JavaScript/Ruby/etc.).
Upvotes: 0