gpol

Reputation: 964

Performance of sort command in unix

I am writing a custom Apache log parser for my company and noticed a performance difference that I can't explain. I have a 1.2 GB text file, log.txt.

The command sort log.txt is up to 3 seconds slower than cat log.txt | sort

Does anybody know why this is happening?

Upvotes: 3

Views: 1663

Answers (2)

dogbane

Reputation: 274758

cat file | sort is a Useless Use of Cat.

The purpose of cat is to concatenate (or "catenate") files. If it's only one file, concatenating it with nothing at all is a waste of time, and costs you a process.
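To make the point concrete, here is a small sketch on a throwaway file (the temporary file and the variable names are made up for illustration). All three forms hand the same bytes to sort; only the last one starts an extra cat process and a pipe.

```shell
#!/usr/bin/env bash
# Illustration only: a tiny temporary file stands in for the real log.
tmp=$(mktemp)
printf 'c\na\nb\n' > "$tmp"

direct=$(sort "$tmp")          # sort opens the file itself
redirected=$(sort < "$tmp")    # the shell opens the file; still a single process
piped=$(cat "$tmp" | sort)     # extra cat process plus a pipe

rm -f "$tmp"

# All three variables now hold the same sorted text.
printf '%s\n' "$direct"
```

Both sort file and sort < file avoid the extra process, so either is a reasonable replacement for the cat pipeline.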

It shouldn't take longer. Are you sure your timings are right?

Please post the output of:

time sort file

and

time cat file | sort

You need to run each command several times and take the average; a single run can be skewed by whether the file is already in the disk cache.
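One way to do the repeated timing is sketched below. It assumes bash (it relies on the time keyword and the TIMEFORMAT variable), and avg_time is a hypothetical helper name, not a standard tool. Treat it as a rough measurement aid rather than a proper benchmark.

```shell
#!/usr/bin/env bash
# avg_time RUNS CMD [ARGS...]
# Hypothetical helper: prints the mean wall-clock time in seconds of CMD over RUNS runs.
avg_time() {
    local runs=$1; shift
    local total=0 t i
    local LC_ALL=C          # keep "." as the decimal point in the output of time
    local TIMEFORMAT=%R     # make bash's time keyword print only elapsed seconds
    for ((i = 0; i < runs; i++)); do
        # time reports on the group's stderr; the command's own output is discarded
        t=$( { time "$@" >/dev/null 2>&1; } 2>&1 )
        total=$(LC_ALL=C awk -v a="$total" -v b="$t" 'BEGIN { print a + b }')
    done
    LC_ALL=C awk -v s="$total" -v n="$runs" 'BEGIN { printf "%.3f\n", s / n }'
}

# Example calls (a pipeline has to be wrapped in its own shell):
#   avg_time 5 sort file
#   avg_time 5 bash -c 'cat file | sort'
```

Sorted output goes to /dev/null so that terminal speed does not affect the numbers. The bash -c wrapper adds a small, constant startup cost to the pipeline case.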

Upvotes: 4

Michael

Reputation: 7438

Instead of worrying about the performance of sort, you should change your logging:

  • Eliminate unnecessarily verbose output to your log.
  • Periodically roll the log (based on either date or size).
  • ...fix the errors that are being written to the log. ;)

Also, are you sure cat is reading the entire file? It may have a read buffer etc.

Upvotes: 1
