Reputation: 6738
I have a file containing saved output from an ASCII stream, so it's all text. I'm using a bash script with a sequence of sed statements on a Mac to clean it up. There is one aspect of this that I'm stumped on.
In places, I need to delete the trailing part of a line (from a marker through the end of the line) along with the line that follows it.
Example section of the file (the line after abcdefg000 is blank):
abcdefg000

hijk
Should come out to:
abcdefg
hijk
Tried:
sed '/000/{N;d;}' FILE
That DOES delete the next blank line, but also deletes the first line. I end up with:
hijk
Since it's a Mac, I can't get sed to insert a newline (I've tried), but I have successfully substituted a placeholder character and then used tr to swap it for a newline. Since tr should take a string, I thought I could have it emit the newline followed by a marker character, and then run the two-line delete in sed against that marker.
sed 's/000/|/' FILE | tr '|' '\n|' | sed '/|/{N;d;}'
However, when I do this, I get only the newline and tr chops off the pipe. sed then doesn't find it and so doesn't delete any lines. I get:
abcdefg
hijk
man tr says it accepts a string, so I'm not sure why it won't take \n| as a string.
I could redo this as a script in some other language, but I've spent long enough on it now, and looked through enough other questions and answers, that I want to get this to work. Either I'm missing something about sed or tr, or there's some other simple way to do this.
Upvotes: 0
Views: 933
Reputation: 785058
You may use this sed on OSX:
sed '/000$/{s///;n;d;}' file
abcdefg
hijk
foo
bar
Where original file is:
cat file
abcdefg000

hijk
foo
bar
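To unpack that one-liner: /000$/ selects lines ending in 000, the empty pattern in s/// reuses that same regex and replaces the match with nothing, n prints the trimmed line and loads the next one, and d discards it. Below is a minimal sketch of the same command wrapped in a shell function and run on sample input modeled on the question. The function name strip_marker is hypothetical, not part of the answer.

```shell
# strip_marker: hypothetical wrapper around the sed command above.
# Reads stdin and writes the cleaned text to stdout.
strip_marker() {
  sed '/000$/{s///;n;d;}'
}

# Sample data: a line ending in 000, a blank line, then ordinary lines.
printf 'abcdefg000\n\nhijk\nfoo\nbar\n' | strip_marker
```

Note that d removes whatever line follows the match, blank or not, so this fits only when the line after 000 is always one you want gone.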
Upvotes: 1
Reputation: 6994
awk should do a pretty good job of modifying the text file in the way that you want. Conditionally removing a single newline after a line ending with 000 is straightforward. We use a temporary variable w to control how many "lines ahead" we're able to delete blank lines from.
awk '/000$/ { sub(/000$/,""); w = NR + 1; }
NF == 0 && NR <= w { next; }
{ print; }'
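As a runnable sketch of the look-ahead idea, the version below drops a blank line only when its line number falls inside the window recorded in w (written here as NR <= w), so blank lines elsewhere in the file survive. The function name strip_one_blank and the sample data are illustrative assumptions.

```shell
# strip_one_blank: hypothetical helper; reads stdin, writes stdout.
strip_one_blank() {
  awk '/000$/ { sub(/000$/, ""); w = NR + 1 }  # trim 000 and remember the next line number
       NF == 0 && NR <= w { next }             # skip a blank line only inside that window
       { print }'
}

# The blank line right after abcdefg000 is removed; the later one is kept.
printf 'abcdefg000\n\nhijk\n\nfoo\n' | strip_one_blank
```

Because the test is NF == 0, a line containing only spaces or tabs counts as blank here too.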
And here's a way to do it if the script needs to handle the possibility of multiple newlines after a 000. The variable d records whether we're currently in a state where we're dropping blank lines.
awk '/000$/ {d=1;sub(/000$/,"");print;next;}
NF == 0 && d { next; }
{ d = 0; print}'
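A small sketch of that state-flag version applied to a run of several blank lines. The function name strip_blank_run and the sample input are assumptions made for illustration.

```shell
# strip_blank_run: hypothetical helper; reads stdin, writes stdout.
strip_blank_run() {
  awk '/000$/ { d = 1; sub(/000$/, ""); print; next }  # trim 000 and start dropping
       NF == 0 && d { next }                           # still inside the blank run
       { d = 0; print }'                               # anything else ends the run
}

# Three blank lines follow the 000 line; all are dropped.
# The blank line after hijk is kept because d was reset to 0.
printf 'abcdefg000\n\n\n\nhijk\n\nfoo\n' | strip_blank_run
```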
You can coax sed into cleaning up newlines by swapping newlines with another character and then swapping back. Note that sed will, at least on OS X, add a trailing newline to the stream anyway, so you have to get rid of the stray @ or | or whatever at the very end of the stream.
cat /tmp/data.txt | tr '\n@' '@\n' | sed 's/000@//' | \
tr '\n@' '@\n' | sed '/^@$/d'
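The swap works because tr translates characters one-to-one by position: the first character of the first set becomes the first character of the second set, and so on. That is also why a one-character first set such as '|' can only ever produce one output character, so extra characters in the second set go unused. A small sketch with made-up sample strings:

```shell
# Swap in one pass: each newline becomes @ and each @ becomes a newline.
swapped=$(printf 'a@b\nc\n' | tr '\n@' '@\n')
printf '%s\n' "$swapped"

# Applying the same swap twice restores the original text.
restored=$(printf 'a@b\nc\n' | tr '\n@' '@\n' | tr '\n@' '@\n')
printf '%s\n' "$restored"

# One character in the first set: only the first character of the second
# set is used, so each | becomes a newline and no | is ever emitted.
piped=$(printf 'x|y\n' | tr '|' '\n|')
printf '%s\n' "$piped"
```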
Here's one way to compact multiple newlines after 000 into a single newline.
cat /tmp/data.txt | tr '\n@' '@\n' | sed 's/000@*/@/' | \
tr '\n@' '@\n' | sed '/^@$/d'
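Here is a sketch of the same pipeline as a reusable function, under two assumptions that go beyond the commands above: the g flag is added so that every 000 in the stream is handled rather than only the first, and the pattern requires at least one newline after 000 (written as @@*) so that a 000 in the middle of a line is left alone. The function name compact_after_000 is hypothetical. The input must not contain a line consisting solely of @, since the final sed would delete it.

```shell
# compact_after_000: hypothetical helper; reads stdin, writes stdout.
compact_after_000() {
  tr '\n@' '@\n' |        # newlines become @, so the file is one long line
    sed 's/000@@*/@/g' |  # drop each 000 and the run of newlines after it, keep one
    tr '\n@' '@\n' |      # swap back to real newlines
    sed '/^@$/d'          # remove the stray @ left when sed appends a newline
}

# Two markers, each followed by one or more blank lines.
printf 'abcdefg000\n\n\nhijk\nxyz000\n\nfoo\n' | compact_after_000
```

The last sed stage is a no-op when the sed in the middle does not append a trailing newline, so the function should behave the same either way.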
Upvotes: 1