Adam Spiers

Reputation: 17916

How to trim consecutive whitespace-only lines from the beginning/end of a file via sed

Using sed, how can I trim one or more consecutive whitespace-only lines from the beginning and/or end of a file? (By "whitespace-only", I mean lines which do not contain any non-whitespace characters, i.e. lines which are either blank or only include whitespace characters.)

For example if my file is:

<blank line>
<line only containing some space/tab characters>
<blank line>
foo
bar
<tab character>
baz
<space character>
<space character><tab character>
qux
<tab character>

then the desired output would be:

foo
bar
<tab character>
baz
<space character>
<space character><tab character>
qux

If trimming from the beginning and the end of the file has to be done in separate sed invocations, that's OK, although I'd also be interested in solutions which manage it all within one invocation.

P.S. This is easy in Perl / Ruby etc., but I'd specifically like to know if it's possible in sed. Thanks!

Upvotes: 1

Views: 282

Answers (1)

Ed Morton

Reputation: 204310

I don't see any real sed experts popping up with a solution yet, so here's my attempt (GNU sed specific due to \S and \s; for POSIX, replace them with [^[:space:]] and [[:space:]] respectively):

$ sed -e '/\S/,$!d' -e :a -e '/^\s*$/{$d;N;ba' -e '}' file
foo
bar

baz


qux
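For reference, the POSIX-class substitution mentioned above might be spelled out roughly as follows. This is only a sketch: the function name trim_blank_edges and the printf sample input are made up for illustration, and the logic has only been reasoned through against GNU sed behaviour, so portability to other seds is an assumption.

```shell
# Hypothetical helper name; wraps the command above with \S and \s
# swapped for the POSIX bracket classes.
trim_blank_edges() {
    # 1st -e: delete everything before the first line containing a non-space char.
    # Rest:   collect runs of whitespace-only lines; drop the run if it reaches
    #         the last line, otherwise print it along with the line that ended it.
    sed -e '/[^[:space:]]/,$!d' \
        -e :a \
        -e '/^[[:space:]]*$/{$d;N;ba' \
        -e '}' "$@"
}

# Sample input modelled on the question: whitespace-only lines at both ends
# and in the middle (the middle ones should survive).
printf '\n \t\n\nfoo\nbar\n\t\nbaz\n \n \t\nqux\n\t\n\n' | trim_blank_edges
```

Run on its own, the first -e expression should trim only the leading whitespace-only lines, and the remaining three expressions only the trailing ones, which covers the separate-invocation case from the question.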

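And in case anyone wants to see a sensible approach to compare to whatever arcane sed incantation IS eventually invoked, here's one way using GNU awk for multi-char RS and the \s abbreviation for [[:space:]] (note that, like the other gsub() variants below, it also strips any leading whitespace on the first non-blank line and any trailing whitespace on the last one):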

$ awk -v RS='^$' '{gsub(/^\s+|\s+$/,"")}1' file
foo
bar

baz


qux

POSIX equivalent if you're happy picking some control char you know can't be in your input (e.g. using ^C = a literal control-C char):

awk -v RS='^C' '{gsub(/^[[:space:]]+|[[:space:]]+$/,"")}1' file

otherwise:

awk '{rec=rec $0 RS} END{gsub(/^[[:space:]]+|[[:space:]]+$/,"",rec); print rec}' file

or, if you are limited in memory and can't read the whole file at once, you need 2 passes to identify where the first and last non-blank lines are, e.g.:

awk 'NR==FNR{if(NF){if(!beg)beg=NR; end=NR}; next} (FNR>=beg)&&(FNR<=end)' file file

or you need to buffer the blank lines (after the initial set of them) until you hit a non-blank line and then print that buffer before the current line:

awk 'NF{printf "%s%s\n",buf,$0; buf=""; f=1; next} f{buf = buf $0 RS}' file
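As a quick check of the buffering variant, here is a small sketch that wraps it in a shell function and feeds it a pipe. The function name trim_blank_edges_awk and the sample input are hypothetical, chosen only for illustration; the awk program itself is the one-liner above, just spread over two lines with comments.

```shell
# Hypothetical wrapper name; the awk program is the buffering one-liner above.
trim_blank_edges_awk() {
    awk '
        # Non-blank line: flush any held whitespace-only lines, then print it.
        NF { printf "%s%s\n", buf, $0; buf = ""; f = 1; next }
        # Whitespace-only line after the first non-blank one: hold it for now.
        f  { buf = buf $0 RS }
    ' "$@"
}

# Leading and trailing whitespace-only lines go; the tab-only line between
# foo and bar is kept.
printf '\n \t\n\nfoo\n\t\nbar\n \n\n' | trim_blank_edges_awk
```

Because it reads its input only once, this form also works on a pipe, which the 2-pass version cannot do since it needs to open the file twice.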

Upvotes: 2
