Creibold

Reputation: 23

Grok pattern issues with logstash and postfix

I'm having trouble parsing one particular kind of log line so that its fields become searchable in my Elasticsearch server.

What I'm attempting to do here is have postfix log the subject line of all messages that go through the system. I am aware this is a bit of a grey area for data logging, but it seems to work.

To achieve this I have edited main.cf in my postfix configuration to run a header check on the Subject line, which records it at the INFO level and writes it out to the maillog.

Thus, the subject line of the message comes from the postfix cleanup process, and looks something like this in Kibana once parsed:

@timestamp      January 8th 2016, 11:51:10.951
@version        1
_id             AVIiJeGaAHt2sxJKgJgY
_index          logstash-2016.01.08
_score          [empty]
_type           log
count           1
fields.type     postfix
from            [Incoming server]
helo            [Test computer]
input_type      [empty]
line            715
message         Jan  8 11:51:10 testserver postfix/cleanup[19150]: CFEBE81B5877: info: header Subject: Test Messages from unknown[10.21.2.166]; from=<[email protected]> to=<[email protected]> proto=ESMTP helo=<testcomputer>
offset          226,216
proto           ESMTP
shipper         Testserver
source          /var/log/maillog
tags            _grokparsefailure
to              [email protected]
type            log

Here is my grok patterns file I am using:

# Postfix stuff based on https://gist.github.com/jbrownsc/4694374:
# ORIGINAL POSTFIX PATTERNS #
QUEUEID (?:[A-F0-9]+|NOQUEUE)
EMAILADDRESSPART [a-zA-Z0-9_.+-=:]+
EMAILADDRESS %{EMAILADDRESSPART:local}@%{EMAILADDRESSPART:remote}
RELAY (?:%{HOSTNAME:relayhost}(?:\[%{IP:relayip}\](?::[0-9]+(.[0-9]+)?)?)?)
POSREAL [0-9]+(.[0-9]+)?
DELAYS (%{POSREAL}[/]*)+
DSN %{NONNEGINT}.%{NONNEGINT}.%{NONNEGINT}
STATUS sent|deferred|bounced|expired
PERMERROR 5[0-9]{2}
MESSAGELEVEL reject|warning|error|fatal|panic
POSTFIXACTION discard|dunno|filter|hold|ignore|info|prepend|redirect|replace|reject|warn

# postfix/smtp and postfix/lmtp and postfix/local
POSTFIXSMTPRELAY %{QUEUEID:qid}: to=<%{EMAILADDRESS:to}>,(?:\sorig_to=<%{EMAILADDRESS:orig_to}>,)? relay=%{RELAY}, delay=%{POSREAL:delay}, delays=%{DELAYS:delays}, dsn$
POSTFIXSMTPCONNECT connect to %{RELAY}: %{GREEDYDATA:reason}
POSTFIXSMTP4XX %{QUEUEID:qid}: host %{RELAY} said: %{GREEDYDATA:reason}
POSTFIXSMTP5XX %{QUEUEID:qid}: to=<%{EMAILADDRESS:to}>,(?:\sorig_to=<%{EMAILADDRESS:orig_to}>,)? relay=%{RELAY}, delay=%{POSREAL:delay}, delays=%{DELAYS:delays}, dsn=%$
POSTFIXSMTPREFUSAL %{QUEUEID:qid}: host %{RELAY} refused to talk to me: %{GREEDYDATA:reason}
POSTFIXSMTPLOSTCONNECTION %{QUEUEID:qid}: lost connection with %{RELAY} while %{GREEDYDATA:reason}
POSTFIXSMTPTIMEOUT %{QUEUEID:qid}: conversation with %{RELAY} timed out while %{GREEDYDATA:reason}

# postfix/smtpd
POSTFIXSMTPDCONNECTS (?:dis)?connect from %{RELAY}
POSTFIXSMTPDACTIONS %{QUEUEID:qid}: %{POSTFIXACTION}: %{DATA:command} from %{RELAY}: %{DATA:smtp_response}: %{DATA:reason}; from=<%{EMAILADDRESS:from}> to=<%{EMAILADDR$
POSTFIXSMTPDTIMEOUTS timeout after %{DATA:command} from %{RELAY}
POSTFIXSMTPDLOGIN %{QUEUEID:qid}: client=%{DATA:client}, sasl_method=%{DATA:saslmethod}, sasl_username=%{EMAILADDRESS:saslusername}
POSTFIXSMTPDCLIENT %{QUEUEID:qid}: client=%{DATA:client}

# postfix/cleanup
POSTFIXCLEANUP %{QUEUEID:qid}: %{DATA:type_alert}: %{GREEDYDATA:subject} from %{RELAY}; message-id=<%{EMAILADDRESS:messageid}>

# postfix/bounce
POSTFIXBOUNCE %{QUEUEID:qid}: sender non-delivery notification: %{QUEUEID:bouncequeueid}

# postfix/qmgr and postfix/pickup
POSTFIXQMGR %{QUEUEID:qid}: (?:removed|from=<(?:%{EMAILADDRESS:from})?>(?:, size=%{POSINT:size}, nrcpt=%{POSINT:nrcpt} \(%{GREEDYDATA:queuestatus}\))?)

# postfix/warm
POSTFIXINFO %{QUEUEID:qid}

As you can see, it does not want to parse out the subject line for me. I have tried to make the right changes under the postfix/cleanup pattern, but it does not seem to be working. I am new to grok pattern construction and any help would be appreciated.

Grok filter configuration for logstash:

input {
    file {
        type => "postfix"
        path => "/var/log/maillog"
    }
}

filter {
    grok {
        patterns_dir => [ "/etc/logstash/patterns.d" ]
        pattern => [
            "%{SYSLOGBASE} %{POSTFIXSMTPDCONNECTS}",
            "%{SYSLOGBASE} %{POSTFIXSMTPDACTIONS}",
            "%{SYSLOGBASE} %{POSTFIXSMTPDTIMEOUTS}",
            "%{SYSLOGBASE} %{POSTFIXSMTPDLOGIN}",
            "%{SYSLOGBASE} %{POSTFIXSMTPDCLIENT}",
            "%{SYSLOGBASE} %{POSTFIXSMTPRELAY}",
            "%{SYSLOGBASE} %{POSTFIXSMTPCONNECT}",
            "%{SYSLOGBASE} %{POSTFIXSMTP4XX}",
            "%{SYSLOGBASE} %{POSTFIXSMTP5XX}",
            "%{SYSLOGBASE} %{POSTFIXSMTPREFUSAL}",
            "%{SYSLOGBASE} %{POSTFIXSMTPLOSTCONNECTION}",
            "%{SYSLOGBASE} %{POSTFIXSMTPTIMEOUT}",
            "%{SYSLOGBASE} %{POSTFIXBOUNCE}",
            "%{SYSLOGBASE} %{POSTFIXQMGR}",
            "%{SYSLOGBASE} %{POSTFIXCLEANUP}",
            "%{SYSLOGBASE} %{POSTFIXINFO}"
        ]
        named_captures_only => true
    }
}

Upvotes: 1

Views: 2511

Answers (1)

Alain Collins

Reputation: 16362

Your pattern has to match your input, as stated in the previous comments. If you look at your input, it has several sections after the SYSLOGBASE (line numbers added for the sake of discussion):

1. CFEBE81B5877:
2. info:
3. header Subject: Test Messages from unknown[10.21.2.166];
4. from=<[email protected]>
5. to=<[email protected]>
6. proto=ESMTP
7. helo=<testcomputer>

So your pattern must account for all (or, under some circumstances, some) of this data.

Even your new pattern does not do this (again split up with numbers for conversational value):

1. %{QUEUEID:qid}:
2. %{DATA:type_alert}:
3. %{GREEDYDATA:subject} from %{RELAY};
4. to=<%{EMAILADDRESS:to}>
5. %{DATA:proto}
6. <%{IPORHOST}>

Line numbers 1-3 of your input are matched by 1-3 of your pattern, which you can test in the grok debugger.

But look at line 4 of the input. You don't have anything that matches it in your pattern. As such, the entire pattern doesn't match, and you get no fields.
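As a rough sketch of what a pattern covering all seven sections could look like: the name POSTFIXCLEANUPSUBJECT and the field names action, subject, from, to, proto and helo are my own choices for illustration, not something from your file or the gist, and I have only reasoned it through against the one sample line above. It reuses the QUEUEID, POSTFIXACTION and RELAY patterns you already have, and uses DATA for the addresses so that an empty sender (from=<>) would not break the match.

```
# postfix/cleanup lines written by a header check (hypothetical pattern name)
# sections: 1 queue id, 2 action, 3 header text and client, 4-7 from/to/proto/helo
POSTFIXCLEANUPSUBJECT %{QUEUEID:qid}: %{POSTFIXACTION:action}: header Subject: %{DATA:subject} from %{RELAY}; from=<%{DATA:from}> to=<%{DATA:to}> proto=%{WORD:proto} helo=<%{DATA:helo}>
```

You would then reference it as "%{SYSLOGBASE} %{POSTFIXCLEANUPSUBJECT}" in your grok filter. Since grok stops at the first pattern that matches by default, it should probably sit before the very general POSTFIXINFO entry.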

The universal advice with grok is to use the debugger, starting slowly and adding one field at a time from the left. This keeps you from missing fields and lets you confirm that you're happy with the parsing as you move along.
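For your sample line, working left to right might go something like the sequence below. Each line is a complete expression to paste into the debugger, and you only move on once the newest field shows up in the output. This works because grok expressions are not anchored to the end of the line, so a partial expression can still match the start of the message. The field names are illustrative, and QUEUEID, POSTFIXACTION and RELAY are the custom patterns from your file, so the debugger will need those definitions supplied as custom patterns.

```
%{SYSLOGBASE}
%{SYSLOGBASE} %{QUEUEID:qid}:
%{SYSLOGBASE} %{QUEUEID:qid}: %{POSTFIXACTION:action}:
%{SYSLOGBASE} %{QUEUEID:qid}: %{POSTFIXACTION:action}: header Subject: %{DATA:subject} from %{RELAY};
%{SYSLOGBASE} %{QUEUEID:qid}: %{POSTFIXACTION:action}: header Subject: %{DATA:subject} from %{RELAY}; from=<%{DATA:from}>
```

If a step stops producing output, the piece you just added is the one that does not match the input, which narrows the problem down to a single field.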

Upvotes: 0
