Reputation: 265
I have a bunch of files that contain many blank lines, and I want to remove any repeated blank lines to make the files easier to read. I wrote the following script:
#!/bin/bash
for file in * ; do cat "$file" | sed 's/^ \+//' | cat -s > "$file" ; done
However, this had very unreliable results: most files became completely empty, and only a few had the intended result. What's more, which files were edited correctly seemed to change randomly, as different files came out right on every retry. What's going on?
Note: This is more of a theoretical question, because I realize I could use a workaround like:
#!/bin/bash
for file in * ; do
cat "$file" | sed 's/^ \+//' | cat -s > "$file"-tmp
rm "$file"
mv "$file"-tmp "$file"
done
But that seems unnecessarily convoluted. So why is the "direct" method so unreliable?
Upvotes: 2
Views: 871
Reputation: 123490
The unpredictability happens because there's a race condition between two stages in the pipeline, cat "$file" and cat -s > "$file".
The first tries to open the file and read from it, while the other tries to empty the file.
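Here's a minimal sketch that makes the race visible (demo.txt is a hypothetical file name); run it several times and the result varies, because the first cat races against the truncation done for the last stage's redirection:
#!/bin/bash
printf 'a\n\n\n\nb\n' > demo.txt
cat demo.txt | cat -s > demo.txt
wc -c demo.txt    # often 0; occasionally the squeezed content survives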
If you have GNU sed, you can simply do sed -i 'expression' * to edit each file in place.
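For example (a sketch, not the only possible expression: the /^$/N;/\n$/D part is the classic sed equivalent of cat -s, combined here with the question's leading-space strip):
#!/bin/bash
# -i makes GNU sed write to a temporary file and rename it over the
# original, so nothing reads and truncates the same file at once.
sed -i -e 's/^ \+//' -e '/^$/N;/\n$/D' *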
Upvotes: 2
Reputation: 241908
You cannot read from a file if you are writing to it at the same time. The > redirection first clears the file, so there is nothing more to read.
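You can see the truncation on its own, without any command reading the file (demo.txt is a hypothetical example file):
printf 'hello\n' > demo.txt
: > demo.txt      # the redirection alone empties the file
wc -c demo.txt    # prints: 0 demo.txt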
You can use sed -i -e '/^$/d' to remove empty lines (if your sed supports -i), which creates the temporary file under the hood.
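If your sed lacks -i, here is a sketch of the same idea done by hand (the mktemp usage is illustrative):
#!/bin/bash
for file in * ; do
    tmp=$(mktemp) || exit 1
    # write to a temp file first, replace the original only on success
    sed -e '/^$/d' "$file" > "$tmp" && mv "$tmp" "$file"
done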
Upvotes: 1