Reputation: 911
I am working on a project that takes a delimited set of data of the form:
field1~field2~field3~.....~fieldn
Having empty fields is a possibility, so
field1~~~field4~~field6
is perfectly acceptable.
This file gets translated using an in-house translator program that leaves a little to be desired. Specifically, it doesn't deal with empty fields well. My solution was to stick some dummy value in there, like a space or an @ sign. I've tried:
sed -r 's/~~/~ ~/g'
and
awk '{gsub(/\~\~/,"~ ~")}; 1' file > file.SPACE
but both of these fall short when several empty fields are adjacent. So if I input
field1~field2~~~field3
it'll output:
field1~field2~ ~~field3
I'd like to just script this if I could, as I can't change the code of the translator. I can change the code in the program that creates the delimited file, but I'd rather not. Is there some workaround, or is coming up with an expression for this just one of the inherent limitations in a regular language?
EDIT: Wow thanks for the quick response everyone, all your solutions worked so I upvoted all of them. I think I'm going to accept Janito's because of the explanation.
Also why the downvote?
Upvotes: 1
Views: 202
Reputation: 212684
awk '{for( i=1; i<=NF; i++ ) if( $i ~ /^$/ ) $i = " " } 1' FS='~' OFS='~' input
or:
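awk '/^$/{ $0 = " " } 1' ORS='~' RS='~' input
(Note that with RS='~' the newline is no longer a record separator, so this variant prints a stray ~ after the last line and can miss an empty field at the start or end of a line.)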
or:
awk '{ while( gsub( "~~", "~ ~" )); }1' input
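A minimal sketch of the field-based approach, wrapped in a shell function so it is easy to try. The name fill_fields is hypothetical, not something from the question. Because it works on fields rather than on pairs of tildes, it also fills an empty field at the very start or end of a line, which the "~~" substitutions cannot see:

```shell
# fill_fields is a hypothetical helper name used only for this illustration.
# It replaces every empty ~-delimited field with a single space.
fill_fields() {
  awk 'BEGIN { FS = OFS = "~" }
       {
         # Fields are numbered from 1; $0 is the whole record.
         for (i = 1; i <= NF; i++)
           if ($i == "") $i = " "
       }
       1' "$@"
}

# Sample line with empty fields at the start, in the middle, and at the end.
printf '%s\n' '~field2~~~field5~' | fill_fields
# prints " ~field2~ ~ ~field5~ " (with a leading and a trailing space)
```

Called with a file name, as in fill_fields input > file.SPACE, it reads that file instead of standard input.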
Upvotes: 3
Reputation: 43703
You can use Perl
perl -pe 's/~(?=~)/~ /g'
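...which says: replace each "~" that is followed by another "~" with "~ ". The lookahead (?=~) only checks for the second tilde without consuming it, so that tilde is still available for the next match.
To store the result in file.SPACE, use: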
perl -pe 's/~(?=~)/~ /g' file >file.SPACE
Upvotes: 1
Reputation: 5072
You could try:
sed -e ':a;s/~~/~ ~/;ta'
This creates a label "a" with the ":" command, then replaces one occurrence of ~~ with ~ ~, and then uses the "t" test command to jump back to the "a" label if the previous substitute command succeeded.
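Here is the same loop as a small sketch, run on the sample line from the question. The function name fill_sed is hypothetical. The script is split into separate -e expressions on the assumption that this is more portable, since some non-GNU sed implementations read everything after the colon as part of the label name:

```shell
# fill_sed is a hypothetical helper name used only for this illustration.
# :a         defines the label "a"
# s/~~/~ ~/  replaces the first remaining "~~" on the line
# ta         jumps back to "a" if that substitution changed anything
fill_sed() {
  sed -e ':a' -e 's/~~/~ ~/' -e 'ta' "$@"
}

printf '%s\n' 'field1~field2~~~field3' | fill_sed
# prints: field1~field2~ ~ ~field3
```

On this input the substitution succeeds twice. The third attempt finds no "~~" left, so "t" does not jump and sed prints the line.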
Hope this helps =)
Upvotes: 4