F.M.

Reputation: 303

How to parse historic Tomcat access logs (txt) with Logstash into Elasticsearch, using a specific pattern and an index named after the historic date?

I have some historic Tomcat access logs in this basic format:

- - [19/Dec/2022:00:00:05 +0100] "POST HTTP/1.1" 200 1321

I want to ship these log entries to Elasticsearch.

My initial Logstash config looks like this:

input {
  file {
    path => "/path/localhost_access_log*.txt"
    start_position => "beginning"
  }
}


filter {

}


output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "tomcat_access"
    data_stream => "false"

  }
  stdout {codec => "rubydebug"}
}

The input is already working.

How should I set up the filter? I found a similar pattern in the post "logstash pattern to grep tomcat access logs", but I don't understand how to use it in the filter.

Is it easier to use mutate or grok, as in the example pattern from that post? My goal is to have the following fields in Elasticsearch (and therefore in Kibana):

IP
timestamp with historic date
Request Type
URL
Protocol
Response Code
Return
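
A minimal sketch of a filter that might produce these fields. It assumes the lines follow the common access-log layout seen in the sample (client, two "-" fields, bracketed timestamp, quoted request, status, byte count) and that the raw line arrives in the message field. The field names client_ip, method, url, http_version, response_code and bytes are illustrative choices, not names from the linked post. A line with no URL in the request, like the shortened sample above, would not match this pattern.

```
filter {
  grok {
    # Split one access-log line into named fields.
    # IPORHOST, USER, HTTPDATE, WORD, NOTSPACE, NUMBER and INT are stock grok patterns.
    match => {
      "message" => '%{IPORHOST:client_ip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] "%{WORD:method} %{NOTSPACE:url} HTTP/%{NUMBER:http_version}" %{INT:response_code:int} (?:%{INT:bytes:int}|-)'
    }
  }
  date {
    # Replace @timestamp (ingest time) with the time parsed from the log line.
    match  => ["timestamp", "dd/MMM/yyyy:HH:mm:ss Z"]
    locale => "en"   # month abbreviations such as "Dec" are English
    target => "@timestamp"
  }
}
```

grok does the actual parsing here; mutate only renames or converts fields that already exist. Lines that do not match are tagged _grokparsefailure, which is visible in the rubydebug output on stdout.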

Is it also possible to name the Elasticsearch index after the date in the file name? How can I achieve this?

For example, localhost_access_log.2022-12-19.txt would create an index like

tomcat_access-2022.12.19

The logs from the next day (in a new file) would then get a new index.
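
One way this could be sketched is to extract the date from the file path with a grok and reference it in the index setting. This assumes the file input stores the path in [log][file][path] (with ECS compatibility disabled it would be the path field instead). The [@metadata] field names are made up for illustration; @metadata fields are not sent to Elasticsearch.

```
filter {
  grok {
    # e.g. .../localhost_access_log.2022-12-19.txt -> year=2022, month=12, day=19
    match => {
      "[log][file][path]" => 'localhost_access_log\.%{YEAR:[@metadata][year]}-%{MONTHNUM:[@metadata][month]}-%{MONTHDAY:[@metadata][day]}\.txt$'
    }
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    # Index name built from the date in the file name, e.g. tomcat_access-2022.12.19
    index => "tomcat_access-%{[@metadata][year]}.%{[@metadata][month]}.%{[@metadata][day]}"
    data_stream => "false"
  }
}
```

If the path does not match, the placeholders in the index name stay unresolved, so such events should be checked for the _grokparsefailure tag. A simpler alternative is index => "tomcat_access-%{+YYYY.MM.dd}", but that is formatted from @timestamp in UTC, so it only reflects the log date once @timestamp has been set from the log line, and an entry at 00:00:05 +0100 would land in the previous day's index.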

Many thanks in advance!

======== Update ==============

Via Filebeat I get a similar result.

My Filebeat config looks like this:

filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml
  # Set to true to enable config reloading
  reload.enabled: false
 
output.logstash:
  # The Logstash hosts
  hosts: ["localhost:5044"]
 
  # Optional SSL. By default is off.
  # List of root certificates for HTTPS server verifications
  #ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]
 
  # Certificate for SSL client authentication
  #ssl.certificate: "/etc/pki/client/cert.pem"
 
  # Client Certificate Key
  #ssl.key: "/etc/pki/client/cert.key"

with the following tomcat module setup:

# Module: tomcat
# Docs: https://www.elastic.co/guide/en/beats/filebeat/8.5/filebeat-module-tomcat.html
 
- module: tomcat
  log:
    enabled: true
 
    # Set which input to use between udp (default), tcp or file.
    var.input: file
    # var.syslog_host: localhost
    # var.syslog_port: 9501
 
    # Set paths for the log files when file input is used.
    var.paths:
       - /path/localhost_access_log.2022-12-26.log
 
    # Toggle output of non-ECS fields (default true).
    # var.rsa_fields: true
 
    # Set custom timezone offset.
    # "local" (default) for system timezone.
    # "+02:00" for GMT+02:00
    # var.tz_offset: local

My Logstash config looks like this:

input {
    beats {
        port => "5044"
    }
}


filter {

}



output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "beats"
    data_stream => "false"
  }
  stdout {codec => "rubydebug"}
}

and the result in Elasticsearch is again an unparsed event:

{
  "@timestamp": [
    "2023-02-14T15:13:50.460Z"
  ],
  "@version": [
    "1"
  ],
  "@version.keyword": [
    "1"
  ],
  "agent.ephemeral_id": [
    "8d7d3d49-9f58-4ec0-991f-3f01db1da900"
  ],
  "agent.ephemeral_id.keyword": [
    "8d7d3d49-9f58-4ec0-991f-3f01db1da900"
  ],
  "agent.id": [
    "f36cfaa9-bbac-40b5-aaf1-169250563066"
  ],
  "agent.id.keyword": [
    "f36cfaa9-bbac-40b5-aaf1-169250563066"
  ],
  "agent.name": [
    "xxxx"
  ],
  "agent.name.keyword": [
    "xxxx"
  ],
  "agent.type": [
    "filebeat"
  ],
  "agent.type.keyword": [
    "filebeat"
  ],
  "agent.version": [
    "8.5.3"
  ],
  "agent.version.keyword": [
    "8.5.3"
  ],
  "ecs.version": [
    "1.12.0"
  ],
  "ecs.version.keyword": [
    "1.12.0"
  ],
  "event.dataset": [
    "tomcat.log"
  ],
  "event.dataset.keyword": [
    "tomcat.log"
  ],
  "event.module": [
    "tomcat"
  ],
  "event.module.keyword": [
    "tomcat"
  ],
  "event.original": [
    "xxxx - - [01/Jan/2023:09:32:27 +0100] \"GET xxx/health HTTP/1.0\" 200 18"
  ],
  "event.original.keyword": [
    "xxxx - - [01/Jan/2023:09:32:27 +0100] \"GET xxx/health HTTP/1.0\" 200 18"
  ],
  "fileset.name": [
    "log"
  ],
  "fileset.name.keyword": [
    "log"
  ],
  "input.type": [
    "log"
  ],
  "input.type.keyword": [
    "log"
  ],
  "log.file.path": [
    "path/localhost_access_log.2023-01-01.txt"
  ],
  "log.file.path.keyword": [
    "path/localhost_access_log.2023-01-01.txt"
  ],
  "log.flags": [
    "dissect_parsing_error"
  ],
  "log.flags.keyword": [
    "dissect_parsing_error"
  ],
  "log.offset": [
    6525120
  ],
  "observer.product": [
    "TomCat"
  ],
  "observer.product.keyword": [
    "TomCat"
  ],
  "observer.type": [
    "Web"
  ],
  "observer.type.keyword": [
    "Web"
  ],
  "observer.vendor": [
    "Apache"
  ],
  "observer.vendor.keyword": [
    "Apache"
  ],
  "service.type": [
    "tomcat"
  ],
  "service.type.keyword": [
    "tomcat"
  ],
  "tags": [
    "tomcat.log",
    "forwarded",
    "beats_input_raw_event"
  ],
  "tags.keyword": [
    "tomcat.log",
    "forwarded",
    "beats_input_raw_event"
  ],
  "_id": "iVV9UIYB4iUevYI3Slps",
  "_index": "beats",
  "_score": null
}

Upvotes: 0

Views: 332

Answers (0)
