Reputation: 379
I have a file which is space separated, I need to convert it to pipe delimited but when it encounters specific patterns in each line, it needs to exclude those patterns while doing sed.
a char(30) NOT NULL
b LARGEINT NOT NULL
c TIMESTAMP
d numeric(10, 3)
Expected output
a|char(30)|NOT NULL
b|LARGEINT|NOT NULL
c|TIMESTAMP|
d|numeric(10, 3)|
While replacing the space, need to ignore space in (10, 3) and NOT NULL. I tried the below but it didn't work
sed -ri '|, |!s|\ /\|/g' abc.txt
Any help in this regard is appreciated.
Upvotes: 1
Views: 13207
Reputation: 1517
awk '/^[cd]/{$NF=$NF"|"}{sub(/ /,"|")sub(/ N/,"|N")}1' file
a|char(30)|NOT NULL
b|LARGEINT|NOT NULL
c|TIMESTAMP|
d|numeric(10, 3)|
What happens? It starts with adding pipes at the end of the line beginning with c or d.
The first sub replace pipe in very first empty space area.
And the second sub put pipe in front of NOT.
Upvotes: 0
Reputation: 23667
If I understood the question correctly, these are requirements:
|
as delimiter()
which may contain spacesFollowing would work for given sample
$ cat ip.txt
a char(30) NOT NULL
b LARGEINT NOT NULL
c TIMESTAMP
d numeric(10, 3)
$ sed -E 's/ +/|/; s/\) */)|/; /\)/!s/ +|$/|/' ip.txt
a|char(30)|NOT NULL
b|LARGEINT|NOT NULL
c|TIMESTAMP|
d|numeric(10, 3)|
s/ +/|/
change first occurrence of one or more spaces to |
s/\) */)|/
handle the troublesome second field first. Change )
and optional spaces to )|
()
/\)/!s/ +|$/|/
for remaining lines, if it doesn't contain )
, change first occurrence of one or more spaces or end of line to |
Upvotes: 2
Reputation: 37394
In GNU awk using FPAT
:
$ awk '
BEGIN {
FPAT="([^ ]+)|([^ ]*NOT NULL[^ ]*)|([^ ]*numeric([^)]*)[^ ]*)" # set FPAT
OFS="|" # set OFS
}
{ NF=3; $1=$1 } # rebuild record (1)
1' file # and output
a|char(30)|NOT NULL
b|LARGEINT|NOT NULL
c|TIMESTAMP|
d|numeric(10, 3)|
(1) NF=3
sets field count statically to 3 and causes the pipes in the ends of the c and d records.
Upvotes: 3
Reputation: 84531
It's not particularly pretty, but a two expression sed
expression will work,
$ sed -e 's/\([^,T]\)[ ]/\1\|/g' file.txt | sed -e 's/\([^O]T\)[ ]/\1\|/'
a|char(30)|NOT NULL
b|LARGEINT|NOT NULL
c|TIMESTAMP
d|numeric(10, 3)
If you really want the pipe after TIMESTAMP
and )
, you can add a third to the end, e.g.
$ sed -e 's/\([^,T]\)[ ]/\1\|/g' file.txt | sed -e 's/\([^O]T\)[ ]/\1\|/' \
-e 's/\([^L]\)$/\1\|/'
a|char(30)|NOT NULL
b|LARGEINT|NOT NULL
c|TIMESTAMP|
d|numeric(10, 3)|
But I took the final pipes to be trailing spaces in your input file not present in mine. Either way, it's another way to skin-this-cat.
Upvotes: 1