E.K.

Reputation: 321

Apache access log is formatted incorrectly

I'm trying to add a monitoring system to parse my Apache logs. I'm running on an AWS Elastic Beanstalk AMI (Amazon Linux, ami-655e8e0a).

My Apache conf file (/etc/httpd/conf/httpd.conf) contains the following snippet:

<IfModule log_config_module>
    #
    # The following directives define some format nicknames for use with
    # a CustomLog directive (see below).
    #
    LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
    LogFormat "%h %l %u %t \"%r\" %>s %b" common

    <IfModule logio_module>
      # You need to enable mod_logio.c to use %I and %O
      LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %I %O" combinedio
    </IfModule>

    #
    # The location and format of the access logfile (Common Logfile Format).
    # If you do not define any access logfiles within a <VirtualHost>
    # container, they will be logged here.  Contrariwise, if you *do*
    # define per-<VirtualHost> access logfiles, transactions will be
    # logged therein and *not* in this file.
    #
    #CustomLog "logs/access_log" common

    #
    # If you prefer a logfile with access, agent, and referer information
    # (Combined Logfile Format) you can use the following directive.
    #
    CustomLog "logs/access_log" combined
</IfModule>

A sample actual log line looks like:

1.2.3.4 (-) - - [11/Nov/2018:06:41:59 +0000] "GET /myproj/ HTTP/1.1" 200 1500 "-" "ELB-HealthChecker/2.0"

Looking at the definition of the 'combined' format in the conf file, it looks like there should be only two fields between the IP address (%h) and the timestamp (%t), but I count three (the "(-)" and the two "-"). This causes the monitoring system's default Apache log parser to fail.
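To make the mismatch concrete, here is a minimal Python sketch of how a typical parser might match the 'combined' format. The regular expression and the `parse_combined` name are illustrative assumptions, not the monitoring system's actual parser. It accepts a line with two fields between the IP address and the timestamp, and rejects the line that is actually being logged:

```python
import re

# Illustrative pattern for the 'combined' format (an assumption, not any
# particular tool's parser):
# %h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-Agent}i"
COMBINED_RE = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def parse_combined(line):
    """Return a dict of fields, or None if the line does not match."""
    match = COMBINED_RE.match(line.strip())
    return match.groupdict() if match else None


# The line as it is actually logged, with the extra "(-)" field.
actual = ('1.2.3.4 (-) - - [11/Nov/2018:06:41:59 +0000] '
          '"GET /myproj/ HTTP/1.1" 200 1500 "-" "ELB-HealthChecker/2.0"')

# The line as the 'combined' definition in httpd.conf would produce it.
expected = ('1.2.3.4 - - [11/Nov/2018:06:41:59 +0000] '
            '"GET /myproj/ HTTP/1.1" 200 1500 "-" "ELB-HealthChecker/2.0"')

print(parse_combined(actual))    # no match: three fields before the timestamp
print(parse_combined(expected))  # dict with host, ident, user, time, ...
```

The extra "(-)" lands in the slot meant for %l, which shifts every later field by one, so the pattern no longer finds the opening "[" of the timestamp where it expects it.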

First, the hyphen in parentheses is strange: why is it in parentheses? Second, why are there three fields instead of two? Third, when I edit the 'combined' LogFormat line in the conf file, the actual logs don't change.

The only workaround I found was to create a new LogFormat with a different name and change the CustomLog to work with it instead of with the 'combined' LogFormat. It looks just like the 'combined' LogFormat line, except it has a different name, yet the logs come out fine with it - without that extra '(-)' part, i.e.:

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" mytestformat
CustomLog "logs/access_log" mytestformat

How come the actual default 'combined' definition is adding this strange '(-)'? Where is it coming from? And why is it impossible to change it?

Thanks.

Upvotes: 1

Views: 1456

Answers (1)

E.K.

Reputation: 321

Got it! It turns out that the Elastic Beanstalk AMI ships with an /etc/httpd/conf.d/wsgi.conf file that redefines the 'combined' LogFormat, overriding the definition in httpd.conf. The last line in this file is:

LogFormat "%h (%{X-Forwarded-For}i) %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined

I changed it to:

LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined

(I removed the %h and the parentheses around the X-Forwarded-For field), and now everything is working well!
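For anyone hunting a similar override, a small script can list every file that defines a given LogFormat nickname. This is only a sketch: the function name `find_logformat_definitions` is hypothetical, it assumes the config files end in `.conf`, and it assumes each LogFormat directive sits on a single line.

```python
import re
from pathlib import Path

# Matches an uncommented directive such as:  LogFormat "..." combined
# The format string may contain backslash-escaped quotes.
LOGFORMAT_RE = re.compile(
    r'^\s*LogFormat\s+"(?P<fmt>(?:[^"\\]|\\.)*)"\s+(?P<name>\S+)\s*$',
    re.IGNORECASE,
)


def find_logformat_definitions(root, nickname):
    """Yield (path, line_number, format_string) for each LogFormat line
    under `root` that defines `nickname`. Commented-out lines are skipped
    because they do not start with the directive name."""
    for path in sorted(Path(root).rglob("*.conf")):
        with open(path, encoding="utf-8", errors="replace") as handle:
            for lineno, line in enumerate(handle, start=1):
                match = LOGFORMAT_RE.match(line)
                if match and match.group("name") == nickname:
                    yield str(path), lineno, match.group("fmt")
```

Calling `list(find_logformat_definitions("/etc/httpd", "combined"))` on a setup like the one above should report both the httpd.conf line and the wsgi.conf line. The script only lists the definitions. Which one Apache ends up using depends on the order in which the files are included, and this sketch does not try to resolve that.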

Upvotes: 2
