Reputation: 14685
I tried this (to observe the behaviour of unix sort):
yes | sort & top
What I see is the system's memory usage growing, as you would expect, but the memory of the sort process itself does not appear to be growing:
Mem: 1689540k total, 1455384k used, 234156k free, 147248k buffers
Swap: 1718268k total, 804k used, 1717464k free, 956216k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
32248 mgregory 20 0 29844 25m 692 R 95.0 1.6 0:32.98 sort
32247 mgregory 20 0 4036 504 444 S 4.0 0.0 0:01.52 yes
The number 1455384 is growing rapidly.
The number 29844 is not growing.
What is happening there?
Upvotes: 2
Views: 138
Reputation: 175
Unix sort uses an external R-way merge sort. It divides the input into smaller portions of similar size (each small enough to fit into memory), sorts each portion, and then merges the sorted portions together at the end.
Except while a portion is actually being sorted, those portions are stored in temporary files on disk (usually in /tmp) and not in memory. Therefore the sort command's memory usage does not keep increasing during the sorting process.
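The divide, sort and merge steps described above can be reproduced by hand with standard tools. This is only a rough sketch, not how sort is implemented internally: the function name external_sort, the run size and the file names are all made up for illustration.

```shell
# Hypothetical helper: external_sort INPUT OUTPUT [LINES_PER_RUN]
# Mimics by hand the split / sort / merge steps of an external merge sort.
external_sort() {
    input=$1
    output=$2
    lines_per_run=${3:-2}
    workdir=$(mktemp -d) || return 1

    # 1. Divide the input into runs small enough to sort "in memory".
    split -l "$lines_per_run" "$input" "$workdir/run."

    # 2. Sort each run on its own and leave the result on disk.
    for run in "$workdir"/run.*; do
        LC_ALL=C sort -o "$run" "$run"
    done

    # 3. Merge the already-sorted runs; -m merges without re-sorting.
    LC_ALL=C sort -m "$workdir"/run.* > "$output"
    rm -rf "$workdir"
}

# Usage: sort a six-line file in runs of two lines each.
demo=$(mktemp -d)
printf '%s\n' pear apple fig cherry banana date > "$demo/input.txt"
external_sort "$demo/input.txt" "$demo/sorted.txt" 2
cat "$demo/sorted.txt"
```

At any moment only one small run (or one line per run during the merge) has to be held in memory, which is why the process size stays roughly flat.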
But why is the system's memory usage growing? Simply because "unused memory is wasted memory". The Linux kernel keeps file metadata and the contents of recently accessed files cached in memory until something that looks more important pushes that data out.
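To see how much of the "used" figure is really cache, the numbers from the top output in the question can be subtracted directly. In that output, used equals total minus free (1689540 - 234156 = 1455384), so it includes buffers and cache. The helper name below is hypothetical.

```shell
# Hypothetical helper: memory used excluding buffers and cache.
# Arguments are the used, buffers and cached figures (in kB) from top's header.
used_without_cache_kb() {
    echo $(( $1 - $2 - $3 ))
}

# Figures taken from the top output in the question.
used_without_cache_kb 1455384 147248 956216   # prints 351920
```

So only about 344 MB of the 1.4 GB reported as used is not buffers or cache.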
Upvotes: 1
Reputation: 393547
sort doesn't necessarily need to have all the data in memory.
sort is able to do a merge sort if the input is too big to fit in memory. IIRC some of this is described in the man/info pages. Edit: e.g.
--batch-size=NMERGE
merge at most NMERGE inputs at once; for more use temp files
-S, --buffer-size=SIZE
use SIZE for main memory buffer
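A quick way to try these options, assuming GNU coreutils sort. The sizes and the temporary directory are arbitrary choices for illustration, and -T (which selects the directory for temporary files) is not part of the excerpt above.

```shell
# Assumes GNU coreutils sort; sizes and paths here are illustrative only.
tmpdir=$(mktemp -d)
outfile=$(mktemp)

# 200000 numbers in descending order, sorted numerically with:
#   -S 1M            cap the main memory buffer at about 1 MiB
#   -T "$tmpdir"     put any temporary files in $tmpdir
#   --batch-size=4   merge at most 4 inputs at once
seq 200000 -1 1 | sort -n -S 1M -T "$tmpdir" --batch-size=4 > "$outfile"

head -n 3 "$outfile"
```

Listing the temporary directory from another terminal while a much larger sort is running is one way to check whether sort is spilling to disk.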
The 1455384k number is likely growing if either (a) sort mmaps in more pages than are actually 'reserved' (i.e. locked into the process address space), or (b) buffers are counted (as files and data are read, dentries, blocks and inodes are cached).
Check this by doing (as root)
echo 3 > /proc/sys/vm/drop_caches
and seeing how much memory becomes available again.
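To put a number on "how much memory becomes available again", the Buffers and Cached lines of /proc/meminfo can be summed before and after dropping the caches. This is a minimal sketch for Linux; the function name page_cache_kb is made up for illustration.

```shell
# Hypothetical helper: print Buffers + Cached (in kB) from a meminfo-style file.
# Defaults to /proc/meminfo, which is Linux-specific.
page_cache_kb() {
    awk '$1 == "Buffers:" || $1 == "Cached:" { total += $2 }
         END { print total + 0 }' "${1:-/proc/meminfo}"
}

# Show the current figure when running on Linux.
if [ -r /proc/meminfo ]; then
    page_cache_kb
fi

# As root, compare the figure before and after dropping the caches:
#   page_cache_kb; sync; echo 3 > /proc/sys/vm/drop_caches; page_cache_kb
```

If the second figure is much smaller than the first, most of the growth was cache rather than memory held by sort.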
Upvotes: 3