DimG

Reputation: 1781

Strange pipe buffering

I have a file of consecutive numbers (starting from 0), one per line:

$ cat in.del
0
1
2
....

Could anybody explain what happens here, and where buffering takes place other than in the pipe? As I understand it, fileno(stdin) in both head processes should read directly from the pipe's read end:

$ cat in.del | ( head -n1 ; head -n1 )
0
60

How does the following differ from the example above?

$ cat in.del | ( head -n10 ; head -n10 )
0
1
...
8
9
60
1861 # O_o
1862
1863
...
1868
1869

This works as expected and shows that head itself doesn't read more bytes than it actually writes to its stdout:

$ ( head -n10 ; head -n10 ) < ./in.del
0
1
...
9
10
11
...
18
19

Obviously, something pipe-related is going on.

Update

OS: Ubuntu 18.04.1 LTS

Bash: version 4.4.19(1)-release (x86_64-pc-linux-gnu)

Update 2

As an addition to the fantastic answer by @Barmar, more on stdio buffering

Upvotes: 4

Views: 150

Answers (1)

Barmar

Reputation: 781059

What's happening is that stdio reads an entire buffer at a time from the pipe, and the buffer size on Linux is 8K (8192 bytes).

Then the first head takes the first 10 lines out of that buffer, prints them, and exits. The rest of the 8K it already consumed from the pipe is lost.

The second head starts reading from the pipe where the first one left off, 8K bytes into the file, which falls in the middle of a line. It prints the remainder of that line and the following 9 lines. The 60 that you see is the tail end of 1860.
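The 8K figure can be checked with a little arithmetic. The sketch below assumes in.del is byte-for-byte what `seq` would produce (one decimal number per line, newline-terminated); `prefix_bytes`, `offset` and `rest` are names made up for this illustration.

```shell
# Bytes occupied by the lines "0" .. "1859", newlines included.
prefix_bytes=$(seq 0 1859 | wc -c)

# The "18" of "1860" takes two more bytes, so "60" starts at this
# zero-based offset in the stream.
offset=$((prefix_bytes + 2))
echo "$offset"    # 8192

# What a reader positioned at that offset sees first
# (tail -c +N counts from 1, hence the + 1).
rest=$(seq 0 1869 | tail -c +"$((offset + 1))" | head -n 3)
echo "$rest"      # 60, 1861, 1862 on separate lines
```

So the first 8192 bytes end exactly between "18" and "60", which matches the output in the question.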

The last case works as expected because, before exiting, head seeks back to the end of the last line it printed. Seeking doesn't work on a pipe, so there the seek has no effect. But when stdin is an ordinary file, the seek succeeds, and the next process starts reading from the position it set.
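A small way to see the difference, as a sketch that assumes GNU head (other implementations may not seek back) and uses a hypothetical scratch file `tmpfile`:

```shell
# Assumption: GNU coreutils head. tmpfile is a throwaway file for the demo.
tmpfile=$(mktemp)
seq 0 99 > "$tmpfile"

# Regular file: both heads share one file offset. The first head reads
# the whole (small) file but seeks back to just after line "9".
from_file=$( { head -n10 >/dev/null; head -n1; } < "$tmpfile" )
echo "file: $from_file"    # expected with GNU head: 10

# Pipe: the seek fails, so whatever the first head consumed is gone.
# The input here is smaller than one buffer, so nothing should be left
# for the second head.
from_pipe=$( seq 0 99 | { head -n10 >/dev/null; head -n1; } )
echo "pipe: $from_pipe"

rm -f "$tmpfile"
```

The input is kept tiny on purpose: with only a few hundred bytes, the first head swallows everything from the pipe in a single read, so the contrast with the file case does not depend on exactly where a buffer boundary falls.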

I see slightly different results on my Mac. Its buffer size is 64K, so the second head starts much later in the file. It also doesn't seek back to the end of the last printed line before exiting, so the version with file redirection works the same as piping.

Upvotes: 5
