Reputation: 51
I am trying to modify a file: (file.txt)
abc~123~xyz~123456~12~0.12~14~1.1~
omn~124~xdz~923231~13~0.0~13~1.1~14~0.45~19~80.1~
to (new_file.txt)
abc~123~xyz~123456~"12~0.12~14~1.1~"
omn~124~xdz~923231~"13~0.0~13~1.1~14~0.45~19~80.1~"
I tried using awk command:
awk -F'~' '{for(i=1;i<=4;i++){printf "%s~", $i}; printf"\""} {for(i=5;i<=NF;i+=1){printf "%s~", $i}; printf "\"\n"}' file.txt > new_file.txt
but I am getting output as:
abc~123~xyz~123456~"12~0.12~14~1.1~~"
omn~124~xdz~923231~"13~0.0~13~1.1~14~0.45~19~80.1~~"
Can anyone help me with this? I am getting an extra "~" at the end of each line. Any reference would also be helpful, as I get confused when dealing with sed and awk commands.
Upvotes: 3
Views: 302
Reputation: 1126
And also with GNU awk:
awk '{sub(/~$/,"~\"");print gensub(/~/, "~\"", 4)}' file
abc~123~xyz~123456~"12~0.12~14~1.1~"
omn~124~xdz~923231~"13~0.0~13~1.1~14~0.45~19~80.1~"
This replaces ~ with ~" twice: once at the end of the line with sub(), and once at the 4th occurrence with gensub().
Upvotes: 1
Reputation: 785196
This sed command is a fairly simple solution for this:
sed -E 's/^(([^~]*~){4})(.*)/\1"\3"/' file
abc~123~xyz~123456~"12~0.12~14~1.1~"
omn~124~xdz~923231~"13~0.0~13~1.1~14~0.45~19~80.1~"
Similarly, using GNU awk:
awk '{print gensub(/^(([^~]*~){4})(.*)/, "\\1\"\\3\"", "1")}' file
abc~123~xyz~123456~"12~0.12~14~1.1~"
omn~124~xdz~923231~"13~0.0~13~1.1~14~0.45~19~80.1~"
Upvotes: 1
Reputation: 29157
You have a ~ separator at the end of your lines, so there is an extra empty field after this last field separator. You can check this with:
$ awk -F'~' '{print NF "|" $NF "|"}' file.txt
9||
13||
See? When this last empty field is printed followed by a ~, it is simply concatenated to the previous one, hence the ~~. Try:
$ awk -F '~' -vOFS='~' '{$5 = "\"" $5; $NF = $NF "\""; print}' file.txt
abc~123~xyz~123456~"12~0.12~14~1.1~"
omn~124~xdz~923231~"13~0.0~13~1.1~14~0.45~19~80.1~"
We just declare ~ as the input and output (with variable OFS) field separator, prepend a " to the fifth field, append one to the last field, and print.
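For reference, the loop from the question can also be kept and fixed by stopping before the trailing empty field. This is only a minimal sketch: it assumes every line ends with ~ (as in the samples), so that $NF is always empty and can be skipped. The shell variable out is just for illustration.

```shell
# Sample input taken from the question; "out" is an illustrative variable name.
out=$(printf '%s\n' \
  'abc~123~xyz~123456~12~0.12~14~1.1~' \
  'omn~124~xdz~923231~13~0.0~13~1.1~14~0.45~19~80.1~' |
awk -F'~' '{
  # Fields 1 to 4 are printed unchanged, each followed by the separator.
  for (i = 1; i <= 4; i++) printf "%s~", $i
  printf "\""
  # Stop at NF-1: $NF is the empty field after the trailing "~"
  # (assumption: every line ends with "~").
  for (i = 5; i < NF; i++) printf "%s~", $i
  printf "\"\n"
}')
printf '%s\n' "$out"
```

If some lines do not end with ~, this loop would drop their last field, so the OFS-based command above is the safer choice in that case.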
Upvotes: 4
Reputation: 133528
With your shown samples, please try the following awk program. Written and tested in GNU awk; it should work in any awk.
awk -v s1="\"" '
match($0,/^[^~]*~([^~]*~){3}/){
print substr($0,RSTART,RLENGTH) s1 substr($0,RSTART+RLENGTH) s1
}
' Input_file
Explanation: First, an awk variable named s1 is created with " as its value. Then, in the main program, awk's match function is used with the regex ^[^~]*~([^~]*~){3}, which matches from the start of the line up to and including the 4th ~. Once it has matched, I print the matched substring (from the start of the line through the 4th occurrence of ~), then s1, then the rest of the line followed by s1 again (as per OP's requirement).
Additional solution: In case you have lines that don't match the ~ condition (meaning they don't contain ~ 4 times) and you don't want to add " to them but still want them printed, use the following:
awk -v s1="\"" '
match($0,/^[^~]*~([^~]*~){3}/){
$0=substr($0,RSTART,RLENGTH) s1 substr($0,RSTART+RLENGTH) s1
}
1
' Input_file
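The same guard can also be sketched with field splitting instead of match(). This is not the program above, only an illustrative variant: it avoids the {3} interval expression, which some older awk builds do not support, and it assumes ~ is the only separator. The first input line below is a made-up example with fewer than 4 tildes, and out is an illustrative variable name.

```shell
# Hypothetical input: one short line (2 tildes) and one line in the question's format.
out=$(printf '%s\n' 'a~b~c' 'abc~123~xyz~123456~12~0.12~' |
awk -F'~' -v OFS='~' '
# At least 4 separators means at least 5 fields: quote from field 5 to the end.
NF > 4 { $5 = "\"" $5; $NF = $NF "\"" }
# Print every line, modified or not.
1')
printf '%s\n' "$out"
```

The short line passes through unchanged, while the other line gets its quotes after the 4th ~ and at the end.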
Upvotes: 1