yaobin

Reputation: 2526

What is your strategy to write logs in your software to deal with possible HUGE amount of log messages?

Thanks for your time and sorry for this long message!

My work environment

Linux C/C++(but I'm new to Linux platform)

My question in brief

In the software I'm working on, we write a LOT of log messages to local files, which makes the files grow fast and eventually use up all the disk space (ouch!). We want these log messages for troubleshooting purposes, especially after the software is released to the customer site. It is of course unacceptable to take up all the disk space on the customer's computer, but I have no good idea how to handle this, so I'm wondering if somebody here does. More info below.

What I am NOT asking

1). I'm NOT asking for a recommended C++ log library. We wrote a logger ourselves.

2). I'm NOT asking about which details (such as time stamp, thread ID, function name, etc.) should be written in a log message. Some suggestions can be found here.

What I have done in my software

I separate the log messages into 3 categories: SYSTEM, ERROR, and INFO (INFO being the verbose level used for debugging).

My tries and thoughts

1). I tried not recording any INFO log messages. This resolves the disk space issue, but I also lose a lot of information for debugging. Consider: my customer is in a different city, and it's expensive to go there often. Besides, they use an intranet that is 100% inaccessible from outside. Therefore we can't always send engineers on-site as soon as a problem appears, and we can't start a remote debugging session. Log files, I think, are thus the only means we have to figure out the root of a problem.

2). Maybe I could make the logging strategy configurable at run time (currently it's fixed before the software runs). That is: at normal run time the software records only SYSTEM and ERROR logs; when a problem arises, somebody changes the logging configuration so that INFO messages are logged too. But still: who would change the configuration at run time? Maybe we should train the software admin?

3). Maybe I could always leave INFO message logging on, but pack the log files into a compressed archive periodically? Hmm...

Finally...

What is your experience in your projects/work? Any thoughts/ideas/comments are welcome!

EDIT

THANKS for all your effort!!! Here is a summary of the key points from the replies below (and I'll give them a try):

1). Do not use large log files. Use relatively small ones.

2). Deal with the oldest ones periodically (either delete them, or zip them and move them to larger storage).

3). Implement a run-time-configurable logging strategy.
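Combining those three points, a minimal sketch of such a logger might look like the following (C++17; the class name, file-name scheme, and limits are my own illustration, not from any answer): it starts a new file once the current one passes a size limit, keeps only the newest few files, and exposes an atomic level so INFO verbosity can be toggled while the program runs.

```cpp
#include <algorithm>
#include <atomic>
#include <cstdio>
#include <filesystem>
#include <string>
#include <vector>

namespace fs = std::filesystem;

enum Level { SYSTEM = 0, ERROR = 1, INFO = 2 };

class RotatingLogger {
public:
    RotatingLogger(fs::path dir, std::size_t max_bytes, std::size_t max_files)
        : dir_(std::move(dir)), max_bytes_(max_bytes), max_files_(max_files) {
        fs::create_directories(dir_);
        open_new_file();
    }
    ~RotatingLogger() { if (out_) std::fclose(out_); }

    // Point 3: another thread (or a config-reload handler) can raise or
    // lower the verbosity at run time, e.g. logger.max_level = INFO;
    std::atomic<int> max_level{ERROR};

    void log(Level lvl, const std::string& msg) {
        if (lvl > max_level.load()) return;       // INFO skipped by default
        int n = std::fprintf(out_, "%s\n", msg.c_str());
        if (n > 0) written_ += static_cast<std::size_t>(n);
        if (written_ >= max_bytes_) {             // point 1: small files only
            std::fclose(out_);
            open_new_file();
            purge_old_files();                    // point 2: drop the oldest
        }
    }

private:
    void open_new_file() {
        char name[32];
        std::snprintf(name, sizeof name, "info.%04d.log", seq_++);
        out_ = std::fopen((dir_ / name).string().c_str(), "w");
        written_ = 0;
    }
    void purge_old_files() {
        std::vector<fs::path> logs;
        for (const auto& e : fs::directory_iterator(dir_))
            logs.push_back(e.path());
        std::sort(logs.begin(), logs.end());      // zero-padded seq = age order
        while (logs.size() > max_files_) {
            fs::remove(logs.front());             // delete the oldest first
            logs.erase(logs.begin());
        }
    }

    fs::path dir_;
    std::size_t max_bytes_, max_files_, written_ = 0;
    int seq_ = 0;
    std::FILE* out_ = nullptr;
};
```

In real use the file name would carry the creation date and time, and the purge would run off the hot path; this only shows the mechanics.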

Upvotes: 8

Views: 3196

Answers (6)

cja

Reputation: 10036

I like to log a lot. In some programs I've kept the last n lines in memory and written them to disk in case of an error or the user requesting support.

In one program it would keep the last 400 lines in memory and save them to a logging database upon an error. A separate service monitored this database and sent an HTTP request containing summary information to a service at our office, which added it to a database there.
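That keep-the-last-n-lines idea can be sketched roughly like so (class and method names are mine, and the database/HTTP machinery is deliberately left out):

```cpp
#include <deque>
#include <fstream>
#include <string>

// Keeps the most recent `capacity` log lines in memory and only writes
// them out when an error occurs (or on an explicit support request).
class TailBuffer {
public:
    explicit TailBuffer(std::size_t capacity) : cap_(capacity) {}

    void add(const std::string& line) {
        if (lines_.size() == cap_) lines_.pop_front();  // drop the oldest line
        lines_.push_back(line);
    }

    // On error: dump the buffered context to disk. In the described system
    // this went to a logging database watched by a separate service.
    void flush_to(const std::string& path) const {
        std::ofstream out(path);
        for (const auto& l : lines_) out << l << '\n';
    }

private:
    std::size_t cap_;
    std::deque<std::string> lines_;
};
```

Nothing touches the disk until an error happens, so the steady-state disk cost is zero while the crash report still carries the 400 lines of context that led up to it.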

We had a program on each of our desktop machines that showed a list (updated by F5) of issues, which we could assign to ourselves and mark as processed. But now I'm getting carried away :)

This worked very well to help us support many users at several customers. If an error occurred on a PDA somewhere running our software then within a minute or so we'd get a new item on our screens. We'd often phone a user before they realised they had a problem.

We had a filtering mechanism to automatically process or assign issues that we knew we'd fixed or didn't care much about.

In other programs I've had hourly or daily files which are deleted after n days either by the program itself or by a dedicated log cleaning service.

Upvotes: 0

Ed Heal

Reputation: 60037

My answer is to write long logs and then tease out the info you want.

Compress them on a daily basis, but keep them for a week.

Upvotes: 0

Matthieu M.

Reputation: 300349

There are two important things to take note of:

  • Extremely large files are unwieldy. They are hard to transmit, hard to investigate, ...
  • Log files are mostly text, and text is compressible

In my experience, a simple way to deal with this is:

  • Only write small files: start a new file for a new session or when the current file grows past a preset limit (I have found 50 MB to be quite effective). To help locate the file in which the logs have been written, make the date and time of creation part of the file name.
  • Compress the logs, either offline (once the file is finished) or online (on the fly).
  • Put a cleaning routine in place: delete all files older than X days, or, whenever you have more than 10, 20 or 50 files, delete the oldest.

If you wish to keep the SYSTEM and ERROR logs longer, you might duplicate them in a specific rotating file that tracks only those.

Put together, this gives the following log folder:

 Log/
   info.120229.081643.log.gz // <-- older file (to be purged soon)
   info.120306.080423.log // <-- complete (50 MB) file started at log in
                                 (to be compressed soon)
   info.120306.131743.log // <-- current file

   mon.120102.080417.log.gz // <-- older mon file
   mon.120229.081643.log.gz // <-- older mon file
   mon.120306.080423.log // <-- current mon file (System + Error only)

If you cannot schedule the cleanup task (e.g. with cron), you may simply spin up a cleanup thread within your application. Whether you go with a purge date or a limit on the number of files is a choice you have to make; either is effective.

Note: from experience, a 50 MB file ends up weighing around 10 MB when compressed on the fly and less than 5 MB when compressed offline (on-the-fly compression is less efficient).

Upvotes: 5

stefanB

Reputation: 79920

One way to deal with it is to rotate log files.

Start logging into a new file once you reach a certain size, and keep the last couple of log files before you start overwriting the first one.

You will not have all possible info, but you will at least have the events leading up to the issue.

The logging strategy sounds unusual but you have your reasons.

Upvotes: 4

blueshift

Reputation: 6891

Your (3) is standard practice in the world of UNIX system logging.

  1. When the log file reaches a certain age or maximum size, start a new one
  2. Zip or otherwise compress the old one
  3. Throw away the nth-oldest compressed log
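On Linux these three steps are exactly what logrotate(8) automates; a fragment along these lines (paths and numbers are placeholders) would go in /etc/logrotate.d/:

```
/var/log/myapp/*.log {
    size 50M        # step 1: rotate when the file reaches 50 MB
    rotate 10       # step 3: keep 10 rotated files, discard the 11th
    compress        # step 2: gzip the rotated file
    delaycompress   # keep the most recent rotated file uncompressed
    missingok
    notifempty
}
```

With this, the application only ever appends to the current file and the system handles rotation, compression, and expiry.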

Upvotes: 5

Ed Heal

Reputation: 60037

I would

a) Make the level of detail in the log messages configurable at run time.

b) Create a new log file for each day. You can then get cron to compress them and/or delete them, or perhaps transfer them to off-line storage.
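The cron side could be as small as this sketch (the script name and directory are placeholders): compress finished logs older than a day, then delete compressed logs older than a week.

```shell
#!/bin/sh
# Daily log cleanup, scheduled from cron, e.g.:
#   0 3 * * * /usr/local/bin/clean_logs.sh
clean_logs() {
    logdir=$1
    # Compress .log files not touched for more than a day...
    find "$logdir" -name '*.log'    -mtime +1 -exec gzip {} \;
    # ...and delete compressed logs older than seven days.
    find "$logdir" -name '*.log.gz' -mtime +7 -delete
}
```

GNU gzip preserves the original file's timestamp on the .gz, so the age-based delete still sees how old the log really is.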

Upvotes: 3
