Reputation: 13
I am trying to replace file contents using sed. The replacement works, but for some reason I get extra whitespace at the end of the resulting output file, which makes the file unreadable in the application that opens it.
My command is as follows:
for file in *.example ; do LANG=C sed -i "" "s|https://foo.bar|http://foo.bar|g" "$file" ; done
Upvotes: 1
Views: 781
Reputation: 437953
With the benefit of hindsight:
BSD/macOS sed is fundamentally unsuitable for making substitutions in binary files, because it invariably outputs a trailing \n (newline) with every output command.
By contrast, GNU sed doesn't have this problem, because it - commendably - only appends a \n if the input "line" had one too.
Note that the concept of newline-separated lines doesn't really apply to binary input: newlines may or may not be present, and potentially with large chunks of data in between. In the worst case scenario, the entire input will be read at once.[1]
You can test this behavior with the following command:
sed -n 'p' <(printf 'x') | cat -et # input printf 'x' has no trailing \n
Output x$ indicates that a newline (symbolized as $ by cat -et) was appended (BSD Sed), whereas just x indicates that it was not (GNU Sed).
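The same check can be done by counting output bytes instead of reading cat -et output. This is only a minimal sketch, assuming a POSIX shell with wc and tr available; the variable name count is just for illustration:

```shell
# Feed a single byte with no trailing newline through sed and count the output bytes.
# tr strips the padding that some wc implementations (e.g. on macOS) add.
count=$(printf 'x' | sed -n 'p' | wc -c | tr -d ' ')

case "$count" in
  1) echo "no newline appended (GNU sed behavior)" ;;
  2) echo "newline appended (BSD/macOS sed behavior)" ;;
  *) echo "unexpected byte count: $count" ;;
esac
```

A count of 1 means the input byte passed through unchanged; a count of 2 means sed added a newline of its own.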
Thus, given that you're on macOS, you could use Homebrew to install GNU Sed with brew install gnu-sed and then use the following command:
LANG=C gsed -i 's|https://foo.bar|http://foo.bar|g' *.example
Homebrew installs GNU Sed as gsed, so that it can exist alongside macOS's stock (BSD) sed.
LANG=C (slightly more robustly: LC_ALL=C) is needed to pass all bytes of the binary input through as-is, without problems caused by bytes that are not recognized as valid characters.
Note that this approach limits you to ASCII-only characters in the substitution (unless you explicitly add byte values as escape sequences).
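To illustrate the escape-sequence route, here is a hedged sketch that swaps one non-ASCII byte for another using GNU Sed's \xHH escapes. It assumes GNU Sed is available (as gsed, or as sed on Linux); the sample input and the SED and result variables are made up for the example, and BSD sed does not support these escapes:

```shell
# Prefer gsed (Homebrew's name for GNU sed); fall back to sed, assumed to be GNU in that case.
SED=$(command -v gsed || command -v sed)

# Input is the bytes 'c' 'a' 'f' 0xE9 with no trailing newline.
# Replace the byte 0xE9 with 0xC9, then print the resulting bytes as hex.
result=$(printf 'caf\351' | LC_ALL=C "$SED" 's/\xe9/\xc9/g' | od -An -tx1 | tr -d ' \n')
echo "$result"
```

With GNU Sed the hex dump should end in c9 rather than e9, and the output should still be four bytes long, since no newline is appended.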
Note the different, incompatible -i syntax for in-place updating without backup: no (separate) option-argument here; see this answer of mine for background.
Note how '...' (single-quoting) is used around the Sed script, which is generally preferable, as it avoids confusion between shell expansions that happen up front and what Sed ends up seeing.
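If a script has to run against either implementation, the -i difference can be hidden behind a small wrapper. The following is a sketch only: sed_inplace is a hypothetical helper name, and the detection relies on the assumption that GNU Sed accepts --version while BSD sed rejects it. It addresses the syntax difference only; BSD sed will still append a trailing newline, so it does not make BSD sed safe for binary files.

```shell
# Hypothetical helper: run an in-place sed script with whichever -i syntax the local sed expects.
sed_inplace() {
  script=$1
  shift
  if sed --version >/dev/null 2>&1; then
    # GNU sed: -i takes no separate option-argument.
    LC_ALL=C sed -i "$script" "$@"
  else
    # BSD/macOS sed: -i requires an option-argument; '' means "no backup file".
    LC_ALL=C sed -i '' "$script" "$@"
  fi
}

# Usage: sed_inplace 's|https://foo.bar|http://foo.bar|g' *.example
```

The script is passed single-quoted, so the shell hands it to sed unchanged.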
[1] Aside from memory use, it is fine to use Sed's default line-parsing behavior here, given that your substitution command doesn't match newlines. If you want to break the input into "lines" by NULs (and also use NULs on output), however, you can use GNU Sed's -z option.
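A small sketch of what -z changes, assuming GNU Sed (as gsed, or as sed on Linux); the two-record sample input is invented for the demonstration. The s command deliberately omits the g flag, so it replaces only the first match per record:

```shell
# Prefer gsed (Homebrew's name for GNU sed); fall back to sed, assumed to be GNU in that case.
SED=$(command -v gsed || command -v sed)

# Two NUL-terminated records. With -z, sed treats NUL (not newline) as the record separator,
# so the substitution is applied once to each record.
# The final tr turns NULs into newlines only so the result can be displayed.
result=$(printf 'https://foo.bar\0https://foo.bar\0' |
  LC_ALL=C "$SED" -z 's|https://foo.bar|http://foo.bar|' |
  tr '\0' '\n')
echo "$result"
```

Both records come out rewritten. Without -z the whole input would be a single "line", and only the first occurrence would be replaced.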
Upvotes: 0