Reputation: 321
I'm trying to add a monitoring system to parse my Apache logs. I'm running on an AWS Elastic Beanstalk AMI (Amazon Linux, ami-655e8e0a).
Looking at my Apache conf file (/etc/httpd/conf/httpd.conf), there's the following snippet:
<IfModule log_config_module>
#
# The following directives define some format nicknames for use with
# a CustomLog directive (see below).
#
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogFormat "%h %l %u %t \"%r\" %>s %b" common
<IfModule logio_module>
# You need to enable mod_logio.c to use %I and %O
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %I %O" combinedio
</IfModule>
#
# The location and format of the access logfile (Common Logfile Format).
# If you do not define any access logfiles within a <VirtualHost>
# container, they will be logged here. Contrariwise, if you *do*
# define per-<VirtualHost> access logfiles, transactions will be
# logged therein and *not* in this file.
#
#CustomLog "logs/access_log" common
#
# If you prefer a logfile with access, agent, and referer information
# (Combined Logfile Format) you can use the following directive.
#
CustomLog "logs/access_log" combined
</IfModule>
A sample actual log line looks like:
1.2.3.4 (-) - - [11/Nov/2018:06:41:59 +0000] "GET /myproj/ HTTP/1.1" 200 1500 "-" "ELB-HealthChecker/2.0"
Looking at the definition of the 'combined' format in the conf file, it looks like there should be only two fields between the IP address (%h) and the timestamp (%t), but I count three (the "(-)" and the two "-"). This causes the monitoring system's default Apache log parser to fail.
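To make the mismatch concrete, here is a minimal Python sketch of why a parser written for the standard combined format rejects this line. The regular expressions and the names COMBINED, WITH_EXTRA and TAIL are my own illustration, not the monitoring system's actual parser.

```python
import re

# Fields shared by both patterns: ident, user, [time], "request",
# status, bytes, "referer", "user-agent".
TAIL = (
    r'(?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

# Standard combined format: the host is followed directly by ident and user.
COMBINED = re.compile(r'^(?P<host>\S+) ' + TAIL)

# Hypothetical variant that tolerates one parenthesized field after the host.
WITH_EXTRA = re.compile(r'^(?P<host>\S+) \((?P<extra>[^)]*)\) ' + TAIL)

line = (
    '1.2.3.4 (-) - - [11/Nov/2018:06:41:59 +0000] '
    '"GET /myproj/ HTTP/1.1" 200 1500 "-" "ELB-HealthChecker/2.0"'
)

# The standard pattern sees three tokens where it expects two, so it fails.
print(COMBINED.match(line))

# The tolerant pattern matches and exposes the extra field.
m = WITH_EXTRA.match(line)
print(m.group("host"), m.group("extra"), m.group("status"))
```

The first print shows None, since the strict pattern finds "-" where it expects the opening bracket of the timestamp.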
Firstly, this hyphen in parentheses is strange: why is it in parentheses? Secondly, why are there three fields instead of two? Thirdly, when I edit the line for the 'combined' LogFormat in the conf file, it doesn't change the actual logs.
The only workaround I found was to create a new LogFormat with a different name and change the CustomLog to work with it instead of with the 'combined' LogFormat. It looks just like the 'combined' LogFormat line, except it has a different name, yet the logs come out fine with it - without that extra '(-)' part, i.e.:
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" mytestformat
CustomLog "logs/access_log" mytestformat
How come the actual default 'combined' definition is adding this strange '(-)'? Where is it coming from? And why is it impossible to change it?
Thanks.
Upvotes: 1
Views: 1456
Reputation: 321
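Got it! It turns out that the Elastic Beanstalk AMI ships an /etc/httpd/conf.d/wsgi.conf file that redefines the 'combined' format, overriding the definition in httpd.conf. That is why editing httpd.conf had no effect. The last line in this file is: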
LogFormat "%h (%{X-Forwarded-For}i) %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
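The (%{X-Forwarded-For}i) part is what produced the extra "(-)" field: Apache logs the header as "-" when the request does not carry it. I changed it to: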
LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
(I removed the %h and the parentheses around the X-Forwarded-For field) and now everything is working well!
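For anyone hunting a similar override, here is a rough Python sketch that lists every LogFormat line defining a given nickname under the Apache config directory. The function names find_logformats and scan are hypothetical, and the /etc/httpd root is assumed from the paths mentioned above; adjust it for other layouts.

```python
import re
from pathlib import Path

# Matches an uncommented directive of the form: LogFormat "<format>" <nickname>
# The format string may contain backslash-escaped quotes.
LOGFORMAT = re.compile(
    r'^\s*LogFormat\s+"(?P<fmt>(?:[^"\\]|\\.)*)"\s+(?P<name>\S+)\s*$'
)

def find_logformats(text, nickname):
    """Return (line_number, format_string) for each definition of nickname."""
    hits = []
    for number, raw in enumerate(text.splitlines(), start=1):
        m = LOGFORMAT.match(raw)
        if m and m.group("name") == nickname:
            hits.append((number, m.group("fmt")))
    return hits

def scan(root="/etc/httpd", nickname="combined"):
    """Print every definition of nickname found in *.conf files under root."""
    for path in sorted(Path(root).rglob("*.conf")):
        text = path.read_text(errors="replace")
        for number, fmt in find_logformats(text, nickname):
            print(f"{path}:{number}: {fmt}")
```

Calling scan() on the instance should print one line per definition; more than one hit for 'combined' points to an override like the one in wsgi.conf.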
Upvotes: 2