Rips
Rips

Reputation: 2004

Asynchronous Logging

Right now in my application, at certain points, we are logging some heavy stuff into the log files.

Basically only for logging, we are creating JSON of the data available and then logging into Log files. This is a business requirement to log data in JSON format.

Now creating JSON from the data available and then logging to FILE takes a lot of time and impacts the original request return time. Now idea is to improve the situation.

One of the things that we have discussed is creating a thread pool using

Executors.newSingleThreadExecutor() 

in our code and then submitting the task to it which does the conversion of data into JSON and subsequent logging.

Is it a good approach to do this? As we are managing the thread pool itself, is it going to create some issues?

I would appreciate it if someone can share better solutions. Someway to use Log4j for this. I tried to use AsyncAppender but didn't achieve any desired result.

We are using EJB 3, JBoss 5.0, Log4j, and Java6.

Upvotes: 17

Views: 31882

Answers (6)

q3769
q3769

Reputation: 212

Disclaimer: I am the developer of elf4j-engine/elf4j-provider.

Take a look at elf4j-engine. Asynchronous is the only logging mode. Also supports JSON pattern format out of the box. Hope it addresses most of the "cautions" and usual use cases.

Good word of caution from Juned Ahsan already. Just to add a few...

  1. If you are using Executors.newSingleThreadExecutor(), the FIFO part is already taken care of by that single-thread pool (with a built-in FIFO task queue). Just make sure no more multiple-thread "fan out" after the thread pool takes over the work (or if you do further multi-threading, then all the caution applies).

  2. By using the async thread pool, the real expensive part of the logging is the operations you have to perform before handing over the log entry data to the thread-pool (for output like file writing). e.g. if you need the caller thread info, or the caller frame details such as calling method, line number, and calling file name - those are difficult to be done by the async thread from the pool, and the caller/application thread would need to do majority of the gathering work before the log data is handed over to the thread-pool. The data gathering may have to include walking the calling stack trace frames to extract the details, so it's a bit expensive but hard to get around. Asynchrony does help the output performance once all the data is gathered.

Upvotes: 0

rdalmeida
rdalmeida

Reputation: 1853

You can also try CoralLog to asynchronously log data using the disruptor pattern. That way you spend minimum time in the logger thread and all the hard work is passed to the thread doing the actual file I/O. It also provides Memory Mapped Files to speed up the consumer thread and reduce queue contention.

Disclaimer: I am one of the developers of CoralLog

Upvotes: 0

Karussell
Karussell

Reputation: 17375

There is also log4j 2: http://logging.apache.org/log4j/2.x/manual/async.html Additionally read this article about why it is so fast: http://www.grobmeier.de/log4j-2-performance-close-to-insane-20072013.html#.UzwywI9Bow4

Upvotes: 1

Alankar Srivastava
Alankar Srivastava

Reputation: 864

Have a look at Logback,AsyncAppender it already provide separate threadpool, queue etc and is easily configurable, it almost do the same as you are doing, but saves you from re-inventing the wheel.

Upvotes: 8

Juned Ahsan
Juned Ahsan

Reputation: 68715

I believe you are on right track in terms of using a separate thread pool for logging. In lot of products you will see the asynchronous logging feature. Logs are accumulated and pushed to log files using a separate thread than the request thread. Especially in prodcution environments, where are millions of incoming request and your response time need to be less than few seconds. You cannot afford anything such as logging to slow down the system. So the approach used is to add logs in a memory buffer and push them asynchronously in reasonably sized chunks.

A word of caution while using thread pool for logging As multiple threads will be working on the log file(s) and on a memory log buffer, you need to be careful about the logging. You need to add logs in a FIFO kind of a buffer to be sure that logs are printed in the log files sorted by time stamp. Also make sure the file access is synchronized and you don't run into situation where log file is all upside down or messed up.

Upvotes: 11

rajesh
rajesh

Reputation: 3407

Is using MongoDB for logging considered?

  1. MongoDB inserts can be done asynchronously. One wouldn’t want a user’s experience to grind to a halt if logging were slow, stalled or down. MongoDB provides the ability to fire off an insert into a log collection and not wait for a response code. (If one wants a response, one calls getLastError() — we would skip that here.)
  2. Old log data automatically LRU’s out. By using capped collections, we preallocate space for logs, and once it is full, the log wraps and reuses the space specified. No risk of filling up a disk with excessive log information, and no need to write log archival / deletion scripts.
  3. It’s fast enough for the problem. First, MongoDB is very fast in general, fast enough for problems like this. Second, when using a capped collection, insertion order is automatically preserved: we don’t need to create an index on timestamp. This makes things even faster, and is important given that the logging use case has a very high number of writes compared to reads (opposite of most database problems).
  4. Document-oriented / JSON is a great format for log information. Very flexible and “schemaless” in the sense we can throw in an extra field any time we want.

Upvotes: 3

Related Questions