Reputation: 322
Using bash, I often want to get the headings of a large CSV file and search the rest for a particular entry. I do this as follows:
$ (head -1; grep mike) < tmp.csv
name,age,favourite colour
mike,38,blue
But taking the input from cat, or any other command, doesn't work: it seems grep never gets passed the remainder of the file.
$ cat tmp.csv | (head -1; grep mike)
name,age,favourite colour
Why is there different behaviour in these two cases?
Upvotes: 8
Views: 449
Reputation: 10582
The difference between reading from a pipe and reading from a file is that you can lseek on a file, but not on a pipe.
The behavior here looks (as seen through strace) like it's coming from head, not bash. head will read a buffer and find the appropriate number of lines, then lseek backwards to the point where the last-output line ended, leaving the file handle open at that place. As above, this works if it's reading a file, but not if it's reading from a pipe.
I can't think of any case other than what you're doing where this behavior in head makes sense, but there it is. Learn something new every day, I tell ya ...
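If the header-then-search pattern has to work on piped input too, one option is to take the header with the shell's read builtin instead of head, since read consumes no more than one line even from a pipe. A minimal sketch, assuming bash or another POSIX shell; the function name header_and_grep is made up for illustration:

```shell
# header_and_grep PATTERN: print the first line of stdin, then matching lines.
# The function name is hypothetical, not a standard utility.
header_and_grep() {
    # read stops after one line, so the rest of stdin is left for grep
    IFS= read -r header || return 1
    printf '%s\n' "$header"
    grep -- "$1"
}

# Piped input, which is the case that fails with head
printf 'name,age,favourite colour\nmike,38,blue\nanna,41,red\n' | header_and_grep mike
```

The same function can be fed with a redirection (header_and_grep mike < tmp.csv), so it does not matter whether the input is seekable.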
Upvotes: 7
Reputation: 242343
Very strange. You should not rely on this undocumented behaviour; use something like this instead:
sed -n '1p;/mike/p' tmp.csv
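Because a single process reads the whole stream here, the result should not depend on whether the input is a file or a pipe. A short sketch of the same command on piped input, with an awk equivalent alongside; the sample variable is invented to mirror the question's tmp.csv:

```shell
# Sample data mirroring the question's tmp.csv (made up for this example)
sample='name,age,favourite colour
mike,38,blue
anna,41,red'

# sed reads the entire stream itself, so a pipe works as well as a file
printf '%s\n' "$sample" | sed -n '1p;/mike/p'

# awk equivalent: print line 1, or any line matching the pattern
printf '%s\n' "$sample" | awk 'NR == 1 || /mike/'
```

One difference worth knowing: if the header line itself matches the pattern, the sed version prints it twice, while the awk version prints it once.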
Upvotes: 3
Reputation: 299633
I can't reproduce this reliably with bash 3.2.48. Either both succeed or both fail. But the underlying reason for the failures is how large the file is.
cat reads one buffer (4k-64k depending on the system) and hands it down the pipe. head consumes the entire buffer and then exits. grep then only has access to the part of the file after that buffer. On my system, I can use your pipe only to grep things further than one buffer into the file (so I can grep things at the end of a long file, but not at the beginning after using head).
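A rough way to see how much head swallows from a pipe is to count the lines still available afterwards. This is only a sketch: the function name remaining_after_head is made up, and the number printed depends on the head implementation and the buffer sizes on the system, so no particular value is claimed.

```shell
# Count how many of 100000 numbered lines are still readable after head.
# If head consumed only the first line, the result would be 99999; on a pipe
# it typically takes a whole buffer, so the count is usually lower.
remaining_after_head() {
    seq 1 100000 | { head -n 1 >/dev/null; wc -l; }
}

remaining_after_head
```

The gap between the printed count and 99999 is the number of lines that head read but could not hand back.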
It is possible that later versions of bash optimize the < operator (but not cat) to allow your trick to work, but I don't believe this is supported behavior.
Upvotes: 0