djt

Reputation: 7535

Centralized Logging - Write to local file first?

I have a server API with a few application instances and a worker instance. Currently, the applications send some data to Loggly (a SaaS centralized logging service). This was good to get started, but I'm starting to look into building a setup with some open source software.

Besides the current cost of using Loggly, my biggest concern is that connecting to Loggly at the end of each request to log the data adds time to the request.

I've been reading a bit about Logstash, Graphite, Elasticsearch, etc., in conjunction with logrotate, and some sources seem to suggest writing to a local file on each server and then, on rotation, shipping the files off to Logstash.

I'm curious what practices people find most efficient in centralized logging scenarios. Should I be writing to local files on each server first? Or does that make each box too "stateful", and should I instead be sending the data directly off to Logstash, or to SQS, for processing by a centralized server?

Upvotes: 0

Views: 510

Answers (1)

sysadmin1138

Reputation: 1303

When it comes to centralized logging, there is an implementation difference between tightly coupling your log producers to Logstash and coupling them loosely. Tight coupling means opening a socket between your producer and your receiver and transmitting events over it, which can add lag on the producer side whenever the receiver is slow. At very large scale, tight coupling should be avoided, especially in the middle (centralization) tiers.

Loose coupling can come in a variety of methods:

  • Queue-mediation, through SQS, Redis, Kafka, AMQP, etc.
  • File-mediation, through local log files that a shipper picks up and forwards.

The very large centralized logging systems I know all use some form of queue mediation in the centralization tier.

That said, at the edges the use cases are different. If you need to avoid writing to files in order to reduce I/O, using TCP or UDP sockets to transmit to a locally installed Logstash (which then ships the events to a central queue) can be quite fast.
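
As a concrete sketch of that edge pattern, here is roughly what the application side could look like in Python, firing JSON events at a local Logstash listener over UDP. The port number, and the assumption that the local Logstash has a matching UDP input with a JSON codec, are illustrative rather than part of your setup:

    import json
    import socket
    import time

    # Hypothetical port; assumes the local Logstash has a UDP input
    # with a JSON codec listening here.
    LOGSTASH_ADDR = ("127.0.0.1", 5140)

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def log_event(message, **fields):
        """Fire-and-forget one JSON event to the local Logstash shipper."""
        event = {
            "@timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
            "message": message,
            **fields,
        }
        # UDP doesn't wait on the receiver; a slow or absent local Logstash
        # means dropped events rather than added request latency.
        sock.sendto(json.dumps(event).encode("utf-8"), LOGSTASH_ADDR)

    log_event("request finished", status=200, duration_ms=42)

Because there is no file write and no TCP handshake involved, the cost to the request is essentially a single sendto() call.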

Centralized logging with logstash can take many forms. If you can install logstash on your log-producing nodes, here is one architecture that is quite valid:

  • Logstash installed on the producing node.
    • That instance is configured to listen on a TCP port for application logging and on a few files for system logging.
  • Instance-logstash ships events to SQS.
  • A fleet of parser-logstash instances pulls jobs off of SQS and processes them, outputting to wherever.

In this architecture, all of the filtering logic is housed in the parser-logstashes, leaving instance-logstash to be nothing but a shipper. The best part is that the parser-logstash tier can be scaled up and down as load warrants. This keeps the instance-logstash with a minimal memory footprint, so it won't compete with the application for resources.
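
The parser tier in that picture is Logstash itself, but to make the data flow concrete, here is a rough Python equivalent of what each parser node does: long-poll SQS, parse, output, acknowledge. The queue name and the print stand-in for the output stage are assumptions for illustration:

    import json

    import boto3  # assumes AWS credentials are configured in the environment

    # Hypothetical queue name; the instance-logstash shippers write here.
    sqs = boto3.resource("sqs")
    queue = sqs.get_queue_by_name(QueueName="logstash-events")

    def process(event):
        # Stand-in for the Logstash filter stage (grok, date parsing, ...)
        # and the output stage (Elasticsearch, S3, wherever).
        print(event.get("message"))

    while True:
        # Long-poll so an idle parser fleet isn't hammering the queue.
        for msg in queue.receive_messages(WaitTimeSeconds=20,
                                          MaxNumberOfMessages=10):
            process(json.loads(msg.body))
            # Delete only after processing, so a crashed parser simply leaves
            # the message for another node to pick up.
            msg.delete()

Scaling the tier up or down is then just a matter of running more or fewer of these consumers against the same queue.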

Since Logstash has a Loggly output plugin, you can still feed data there if you wish while also keeping a copy locally.

Files vs. direct connections

Deciding between these two is best done by answering a few questions:

  • If I'm connecting directly through something like a TCP socket, what happens to my application if the log-receiver is not available for 5 minutes?
  • How sensitive is my instance to storage I/O?

Files are a method of loose coupling on the instance. If your answer to the first question is "the app pauses until the log receiver is back," you may not want that sort of tight coupling. In that case, log files are a way to provide a buffer, and one that will survive instance restarts if that's important to you.

It does keep state on the instance, but it should be very short-lived state: the log shipper should drain it to the central queue system fast enough that you're never holding more than a few seconds' worth of events.
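
A minimal sketch of that file buffer on the application side, assuming Python's standard logging module and an arbitrary file path that the shipper (a Logstash file input, for example) would be pointed at:

    import json
    import logging
    import logging.handlers

    # Hypothetical path; in practice something like /var/log/myapp/events.log,
    # matching whatever the local shipper is configured to tail.
    LOG_PATH = "events.log"

    handler = logging.handlers.RotatingFileHandler(
        LOG_PATH, maxBytes=50 * 1024 * 1024, backupCount=3)
    logger = logging.getLogger("myapp.events")
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)

    def log_event(message, **fields):
        """Append one JSON line; the file is the buffer if the shipper is down."""
        logger.info(json.dumps({"message": message, **fields}))

    log_event("request finished", status=200, duration_ms=42)

Writing the event is just an append to a local file, so the request never waits on the network, and anything the shipper hasn't forwarded yet survives a restart on disk.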

If you are very sensitive to storage I/O and also very sensitive to TCP state, you can still queue-mediate up to a point. Install a local Redis instance, have your app ship to that, and have Logstash pull from there and ship to the central queue. This insulates the app from hiccups in the central queue, though in some cases it's still better to ship directly to the central queue if the app can be configured to do so.
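
A sketch of that local-Redis buffering from the application's point of view, assuming the redis-py client and a list key that the local Logstash Redis input would also be configured to read:

    import json
    import time

    import redis  # redis-py client

    # Local Redis acting as a small on-box buffer. The key name is arbitrary;
    # it just has to match what the local Logstash Redis input reads from.
    r = redis.Redis(host="127.0.0.1", port=6379)

    def log_event(message, **fields):
        event = {
            "@timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
            "message": message,
            **fields,
        }
        # Push onto one end of a list; the local Logstash pops from the other
        # end and forwards the event to the central queue.
        r.rpush("logstash", json.dumps(event))

    log_event("request finished", status=200, duration_ms=42)

The app only ever talks to loopback, so a wobble in the central queue shows up as a longer Redis list rather than as latency in the request.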

Upvotes: 2
