Reputation: 59
I have some IIS logs from which I want to extract the file path and file name in the cs_uri_stem field. An example IIS event is as follows:
2018-02-21 04:39:13 <IPv4> GET /www/images/flash_email_large.gif - 8030 - <IPv4> Mozilla/4.0+(compatible;+MSIE+7.0;+Windows+NT+6.3;+WOW64;+Trident/7.0;+.NET4.0E;+.NET4.0C;+.NET+CLR+3.5.30729;+.NET+CLR+2.0.50727;+.NET+CLR+3.0.30729;+Microsoft+Outlook+16.0.4654;+ms-office;+MSOffice+16) 200 0 0 531
My regex is as follows:
.*?(GET|POST|HEAD|OPTIONS|PROPFIND)\s(?P<file_path>(?:[^\/]*\/)*)(?P<file_name>.*)\s-
but I'm getting extra characters after the file name (in this case, flash_email_large.gif). How can I exclude everything after the file name in my regex?
Thx
Upvotes: 1
Views: 35
Reputation: 784958
You can use this better-performing regex to capture the file path and file name in two named capturing groups:
\s(GET|POST|HEAD|OPTIONS|PROPFIND)\s(?P<file_path>\S*\/)(?P<file_name>\S+)\s-
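Changes:
The leading .*? is replaced with \s
(?:[^\/]*\/)* is replaced with \S*\/
.* is replaced with \S+
Because \S cannot match whitespace, file_name now stops at the first space after the name instead of running on to the last \s- in the line.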
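To see the difference, here is a minimal Python sketch that runs both patterns against the sample event from the question. The user-agent string is shortened for readability, and the helper name extract is only illustrative. Python's re module is assumed here because it accepts the (?P<name>...) syntax used above; other engines may behave slightly differently.

```python
import re

# Sample event from the question; the user-agent string is shortened here.
LINE = (
    "2018-02-21 04:39:13 <IPv4> GET /www/images/flash_email_large.gif "
    "- 8030 - <IPv4> Mozilla/4.0+(compatible;+MSIE+7.0) 200 0 0 531"
)

# Pattern from the question: the greedy .* runs to the last " -" in the line.
ORIGINAL = re.compile(
    r".*?(GET|POST|HEAD|OPTIONS|PROPFIND)\s"
    r"(?P<file_path>(?:[^\/]*\/)*)(?P<file_name>.*)\s-"
)

# Pattern from the answer: \S keeps both groups inside one space-free token.
REVISED = re.compile(
    r"\s(GET|POST|HEAD|OPTIONS|PROPFIND)\s"
    r"(?P<file_path>\S*\/)(?P<file_name>\S+)\s-"
)


def extract(pattern, line):
    """Return (file_path, file_name) for the first match, or None."""
    m = pattern.search(line)
    return (m.group("file_path"), m.group("file_name")) if m else None


original_result = extract(ORIGINAL, LINE)
revised_result = extract(REVISED, LINE)

print(original_result)  # ('/www/images/', 'flash_email_large.gif - 8030')
print(revised_result)   # ('/www/images/', 'flash_email_large.gif')
```

Note that \S+ requires at least one character after the last slash, so a request whose URI ends in / (a bare directory) would not match the revised pattern.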
Upvotes: 1