Harel Yacovian

Reputation: 137

Azure KQL parse operator - unable to use ? (zero or one time) in the regex

I'm trying to parse this line:

01/11/1011 11:11:11: LOG SERVER = 1 URL = /one/one.aspx/ AccountId = 1111 MainId = 1111 UserAgent = Browser = Chrome , Version = 11.0, IsMobile = False, IP = 1.1.1.1 MESSAGE = sample message TRACE = 1

using this parse statement:

parse-where kind=regex flags=i message with 
timestamp:datetime 
":.*LOG SERVER = " log_server:string 
".*URL = " url:string 
".*AccountId = " account_id:string 
".*MainId = " main_id:string 
".*?UserAgent = " user_agent:string  
",.*Version = " version:string 
",.*IsMobile = " is_mobile:string 
",.*IP = " ip:string 
".*MESSAGE = " event:string 
".*TRACE = " trace:string

The problem is that I sometimes get records with one "key = value" pair missing, while the order of the remaining columns stays the same. To match every kind of row, I wanted to make the relevant part optional with (<name_of_column>)?, for example: "(,.*Version = )?" version:string, but it fails every time.

Upvotes: 0

Views: 1324

Answers (1)

Nati Nimni

Reputation: 258

I think the parse/parse-where operators are most useful when the input is well formatted. The potentially missing values in this case make them tricky, if not impossible, to use.
If you control the formatting of the input strings, consider normalizing it so that every field is always present, and/or add delimiters and quotes where appropriate. Otherwise, you could use the extract function to parse it. The following expression would work even if some lines are missing some fields:

| extend 
    // each field is extracted independently, so a missing key only affects its own column
    timestamp = extract("(.*): .*", 1, message, typeof(datetime)),
    log_server = extract(".*LOG SERVER = ([^\\s]*).*", 1, message),
    url = extract(".*URL = ([^\\s]*).*", 1, message),
    account_id = extract(".*AccountId = ([^\\s]*).*", 1, message),
    main_id = extract(".*MainId = ([^\\s]*).*", 1, message),
    user_agent = extract(".*UserAgent = ([^,]*).*", 1, message),
    version = extract(".*Version = ([^,]*).*", 1, message),
    is_mobile = extract(".*IsMobile = ([^,]*).*", 1, message),
    ip = extract(".*IP = ([^\\s]*).*", 1, message),
    // MESSAGE runs up to TRACE when TRACE is present, otherwise to the end of the line
    event = iff(message has "TRACE", extract(".*MESSAGE = (.*) TRACE.*", 1, message), extract(".*MESSAGE = (.*)", 1, message)),
    trace = extract(".*TRACE = (.*)", 1, message)
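
To check how this behaves on a row with a missing field, here is a minimal sketch that can be pasted into a query window as-is. The two rows in the datatable are made-up samples based on the line in the question (the second one has no "Version = ..." pair), and the has_version column is only added for illustration. It assumes that extract comes back empty when the pattern does not match, which is what I would expect from its documented behavior:

```kusto
// Hypothetical sample rows: the second one has no "Version = ..." pair.
datatable(message:string)
[
    "01/11/1011 11:11:11: LOG SERVER = 1 URL = /one/one.aspx/ AccountId = 1111 MainId = 1111 UserAgent = Browser = Chrome , Version = 11.0, IsMobile = False, IP = 1.1.1.1 MESSAGE = sample message TRACE = 1",
    "01/11/1011 11:11:11: LOG SERVER = 1 URL = /one/one.aspx/ AccountId = 1111 MainId = 1111 UserAgent = Browser = Chrome , IsMobile = False, IP = 1.1.1.1 MESSAGE = sample message TRACE = 1"
]
| extend
    // each extract call searches the whole message on its own
    version = extract("Version = ([^,]*)", 1, message),
    is_mobile = extract("IsMobile = ([^,]*)", 1, message)
// illustration only: flag the rows where the Version pair was found
| extend has_version = isnotempty(version)
| project version, is_mobile, has_version
```

Since every column gets its own extract call, the missing Version pair in the second row should only leave version empty, while is_mobile is still filled in. That independence is what the single parse-where pattern in the question cannot give you.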

Upvotes: 1
