GerogeGZ

Reputation: 19

Regular expressions for long phrases in Perl

I'm looking to extract the "Account Name" and "Source Network Address" from the following text using regular expressions in a Perl script. Writing a regular expression for such a long phrase seems to take a lot of effort.

I need help finding the best regex for this; any ideas would help. Keep in mind that these are just 3 examples out of possibly 50 similar phrases (of different lengths).

Example phrase 1:

WinEvtLog: Security: AUDIT_SUCCESS(4624): Microsoft-Windows-Security-Auditing: admin:     DOMAIN: hostname.domain.com: An account was successfully logged on. Subject:  Security ID:  S-1-0-0  Account Name:  -  Account Domain:  -  Logon ID:  0x0  Logon Type:   3      New Logon:  Security ID:  S-1-5-21-1130994204-1932287720-1813960501-1239  Account Name:  admin  Account Domain:  DOMAIN  Logon ID:  0x1d12cfff5  Logon GUID:  {AF5E2CF5-1A54-2121-D281-13381F397F41}  Process Information:  Process ID:  0x0  Process Name:  -  Network Information:  Workstation Name:   Source Network Address: 101.101.101.101  Source Port:  52616  Detailed Authentication Information:  Logon Process:  Kerberos  Authentication Package: Kerberos  Transited Services: -  Package Name (NTLM only): -  Key Length:  0  This event is generated when a logon session is created. It is generated on the computer that was accessed. 

Example phrase 2:

WinEvtLog: Security: AUDIT_SUCCESS(4634): Microsoft-Windows-Security-Auditing: admin: DOMAIN: hostname.domain.com: An account was logged off. Subject:  Security ID:  S-1-5-21-1130554204-1932287720-1813960501-4444  Account Name:  admin  Account Domain:  DOMAIN  Logon ID:  0x1d12d000a  Logon Type:   3  This event is generated when a logon session is destroyed. It may be positively correlated with a logon event using the Logon ID value. Logon IDs are only unique between reboots on the same computer.

Example phrase 3:

WinEvtLog: Security: AUDIT_SUCCESS(540): Security: Administrator: HOST88: HOST88: Successful Network Logon:     User Name: Administrator        Domain:     HOST88      Logon ID:   (0x14,0x6E6FB948)       Logon Type: 3       Logon Process: NtLmSsp      Authentication Package: NTLM        Workstation Name: DESKHOST88        Logon GUID: -       Caller User Name: -     Caller Domain: -        Caller Logon ID: -      Caller Process ID: -        Transited Services: -       Source Network Address: 10.10.10.10     Source Port: 43221

Upvotes: 0

Views: 170

Answers (2)

jordanm

Reputation: 34944

The following regex handles lines that contain both fields, such as your first example:

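my ( $account_name, $source_addr );
# $1: first account name that is not "-"; $2: dotted address after the label
if ( $string =~ /(?<=Account Name:)\s+([^-\s]+).+(?:Source Network Address:)\s+([\d.]+)/ ) {
    $account_name = $1;
    $source_addr = $2;
}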
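
Note that a single pattern like this needs both "Account Name:" and "Source Network Address:" on the same line, so a line without an address (like the logoff event in example 2) produces no match at all. Below is a minimal sketch of one way to relax that, assuming it is acceptable to get the account name back with an undefined address. The address part is made optional, and a lone "-" placeholder is skipped with a negative lookahead instead of excluding hyphens from the name. The helper name extract_account_and_address and the shortened sample lines are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the answer above.
# Returns (account, address); address is undef when the line has none.
# Returns an empty list when no usable "Account Name:" is found.
sub extract_account_and_address {
    my ($line) = @_;
    if ( $line =~ /Account Name:\s+(?!-(?:\s|\z))(\S+)(?:.*Source Network Address:\s+([\d.]+))?/ ) {
        return ( $1, $2 );
    }
    return;
}

# Abbreviated versions of the first two examples from the question.
my $logon  = 'Subject: Account Name:  -  Account Domain:  -  New Logon:  Account Name:  admin  Account Domain:  DOMAIN  Source Network Address: 101.101.101.101  Source Port:  52616';
my $logoff = 'Subject:  Security ID:  S-1-5-21-1  Account Name:  admin  Account Domain:  DOMAIN  Logon Type:   3';

for my $line ( $logon, $logoff ) {
    my ( $account, $address ) = extract_account_and_address($line);
    printf "%s from %s\n", $account // '(none)', $address // '(no address)';
}
```

This still does not cover the older "User Name:" format from example 3; that would need its own label in the pattern.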

Upvotes: 1

Tim Pierce

Reputation: 5664

How rigorous do you want to be with your solution?

If you have log lines and want to extract the word that follows "Account Name:" and the address that follows "Source Network Address:" then you can do it with a very naive regex like this:

my ($account_name) = /Account Name:\s+(\S+)/;
my ($source_network_addr) = /Source Network Address:\s+(\S+)/;

That doesn't attempt to validate that anything else in the line is as you expect it to be, but if the application is only parsing lines that are generated by IIS or whatever, it may not need to be really precise.
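
One caveat, going by the examples in the question: on example 1 the Account Name pattern stops at the first occurrence, whose value is "-", and example 3 uses "User Name:" rather than "Account Name:". Here is a rough sketch of how the same naive idea might be stretched to cope with both, assuming that "-" always means "no value" and that the first real value is the one you want. The helper first_value is hypothetical, and the sample line is abbreviated from example 3.

```perl
use strict;
use warnings;

# Hypothetical helper: tries each label in order and returns the first
# value following it that is not the "-" placeholder (undef if none).
sub first_value {
    my ( $line, @labels ) = @_;
    for my $label (@labels) {
        # In list context, /g returns every captured value for this label.
        # \Q...\E quotes any regex metacharacters in the label.
        my @values = $line =~ /\Q$label\E:\s+(\S+)/g;
        my ($value) = grep { $_ ne '-' } @values;
        return $value if defined $value;
    }
    return;
}

# Abbreviated version of example 3 from the question.
my $old_style = 'Successful Network Logon:     User Name: Administrator        Domain:     HOST88      Caller User Name: -     Source Network Address: 10.10.10.10     Source Port: 43221';

my $account = first_value( $old_style, 'Account Name', 'User Name' );
my $address = first_value( $old_style, 'Source Network Address' );
print "$account $address\n";
```

Keeping the labels as plain strings means that covering another log format is a matter of adding a label to the call, not rewriting one long pattern.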

Upvotes: 0
