Reputation: 370
I have had a similar question like this earlier later i've to add more scope to that question but had no idea how to edit it and make it live again. that's why i'm posting as a new Question.
My file is a pipe delimited file.
NAME | NUM | WEB | LOCATION | CURRENCY | PLACE
ABCD | 04 | GO|OGLE | EUROPE | EURO | PARIS
XYZE | 12 | Y|A|HOO | USA | DOLLAR | SEATTLE
LMNO | 17 | |FACE|B|O|O|K | ASIA | ASIAN DOLLAR | HONGKONG
EDDE | 98 | A||M|AZ|ON| | AFRICA | AF DOLLAR | CAPETOWN
My file is as complicated as this is. Our need is to remove the "|" symbol from the WEB field and replace it with a junk value like #,$,& or anything.
The Output has to be:
NAME | NUM | WEB | LOCATION | CURRENCY | PLACE
ABCD | 04 | GO#OGLE | EUROPE | EURO | PARIS
XYZE | 12 | Y#A#HOO | USA | DOLLAR | SEATTLE
LMNO | 17 | #FACE#B#O#O#K | ASIA | ASIAN DOLLAR | HONGKONG
EDDE | 98 | A##M#AZ#ON# | AFRICA | AF DOLLAR | CAPETOWN
I've tried awk'ing few filters to clear this mess up.nothing seems to find a happy ending. Thank you! I would like to thank few names who answered my prev question : RomanPerekhrest, Ed Morton,shellter , val rog.
Upvotes: 1
Views: 2141
Reputation: 203493
$ cat tst.awk
BEGIN { FS=OFS="|" }
NR==1 { outNf=NF; print; next }
{
end = beg + (NF - outNf) - 1
for (i=1; i<=NF; i++) {
sep = (i>=beg && i<=end ? "#" : OFS)
printf "%s%s", $i, (i<NF ? sep : ORS)
}
}
$ awk -v beg=3 -f tst.awk file
NAME | NUM | WEB | LOCATION | CURRENCY | PLACE
ABCD | 04 | GO#OGLE | EUROPE | EURO | PARIS
XYZE | 12 | Y#A#HOO | USA | DOLLAR | SEATTLE
LMNO | 17 | #FACE#B#O#O#K | ASIA | ASIAN DOLLAR | HONGKONG
EDDE | 98 | A##M#AZ#ON# | AFRICA | AF DOLLAR | CAPETOWN
How it works: On the first line the number of fields to be output is the same as the number of fields on that line so it saves that number as outNF. From then on any subsequent line with more than outNF fields has outNF-NF fields starting at beg
to be combined. So inside the loop it uses OFS between fields from 1 to beg, then from beg+1 to beg+(outNF-NF) it uses # between fields to create one merged output field from the input fields in that range, then it goes back to using OFS between fields.
Upvotes: 2
Reputation: 1099
I didn't try to put this in one line, but rather made it a little easier to read. Those who play perl golf will be able to reduce it considerably. The idea is to anchor the first two fields and the last three.
#!/usr/bin/perl
while(<DATA>) {
chomp;
if(($name, $num, $web, $location, $currency, $place) = $_ =~
/^([^\|]+)\|([^\|]+)\|(.+)\|([^\|]+)\|([^\|]+)\|([^\|]+)$/) {
$web =~ tr/\|/\_/;
printf "%s\n", join('|', ($name, $num, $web, $location, $currency, $place));
}
}
__DATA__
NAME | NUM | WEB | LOCATION | CURRENCY | PLACE
ABCD | 04 | GO|OGLE | EUROPE | EURO | PARIS
XYZE | 12 | Y|A|HOO | USA | DOLLAR | SEATTLE
LMNO | 17 | |FACE|B|O|O|K | ASIA | ASIAN DOLLAR | HONGKONG
EDDE | 98 | A||M|AZ|ON| | AFRICA | AF DOLLAR | CAPETOWN
Output:
NAME | NUM | WEB | LOCATION | CURRENCY | PLACE
ABCD | 04 | GO_OGLE | EUROPE | EURO | PARIS
XYZE | 12 | Y_A_HOO | USA | DOLLAR | SEATTLE
LMNO | 17 | _FACE_B_O_O_K | ASIA | ASIAN DOLLAR | HONGKONG
EDDE | 98 | A__M_AZ_ON_ | AFRICA | AF DOLLAR | CAPETOWN
Upvotes: 0
Reputation: 3089
A simple awk solution :
awk -F "|" '{printf $1}
{for(i=2; i<=NF; i++) { if(i>3 && i<NF-2)printf "#"$i; else printf "|"$i } printf "\n"} ' file
NAME|NUM|WEB|LOCATION|CURRENCY|PLACE
ABCD|04|GO#OGLE|EUROPE|EURO|PARIS
XYZE|12|Y#A#HOO|USA|DOLLAR|SEATTLE
LMNO|17|#FACE#B#O#O#K|ASIA|ASIANDOLLAR|HONGKONG
EDDE|98|A##M#AZ#ON#|AFRICA|AFDOLLAR|CAPETOWN
if(i>3 && i<NF-2)
: this condition is for extra unwanted fields after 3rd field and before NF-2nd field. If it satisfies, prefix "#" before printing these extra fields.
Upvotes: 1
Reputation: 2691
Another awk
solution can be:-
awk -F'[[:space:]][|][[:space:]]' '{gsub(/\|/,"#",$3);print $1,"|",$2,"|",$3,"|",$4,"|",$5,"|",$6}' file.txt
Explanation:-
-F - for field separator here it is space|space
gsub - global substitution in field 3. i.e. every occurance of | will be replaced by #.
print - just print all the columns separated by "|"
output will be:-
NAME | NUM | WEB | LOCATION | CURRENCY | PLACE
ABCD | 04 | GO#OGLE | EUROPE | EURO | PARIS
XYZE | 12 | Y#A#HOO | USA | DOLLAR | SEATTLE
LMNO | 17 | #FACE#B#O#O#K | ASIA | ASIAN DOLLAR | HONGKONG
EDDE | 98 | A##M#AZ#ON# | AFRICA | AF DOLLAR | CAPETOWN
Upvotes: 1
Reputation: 23794
easy if you do not mind with Perl
If it has space; then we can print it by:
stackoverflow ❱ perl -F'\s+|\s+' -a -le 'print $F[5]' file
WEB
GO|OGLE
Y|A|HOO
|FACE|B|O|O|K
A||M|AZ|ON|
stackoverflow ❱
Since we can modify the @F
array in Perl; thus we can:
$F[5] =~ s/\|/#/g;
It modifies only this column not others.
And eventually we can print it:
stackoverflow ❱ perl -F'\s+|\s+' -lae '$F[5] =~ s/\|/#/g;print "@F"' file
NAME | NUM | WEB | LOCATION | CURRENCY | PLACE
ABCD | 04 | GO#OGLE | EUROPE | EURO | PARIS
XYZE | 12 | Y#A#HOO | USA | DOLLAR | SEATTLE
LMNO | 17 | #FACE#B#O#O#K | ASIA | ASIAN DOLLAR | HONGKONG
EDDE | 98 | A##M#AZ#ON# | AFRICA | AF DOLLAR | CAPETOWN
stackoverflow ❱
If your file has no space, as someone commented me; then you can spread others columns; modify only that one and join them all together:
stackoverflow ❱ cat file2
NAME|NUM|WEB|LOCATION|CURRENCY|PLACE
ABCD|04|GO|OGLE|EUROPE|EURO|PARIS
XYZE|12|Y|A|HOO|USA|DOLLAR|SEATTLE
LMNO|17||FACE|B|O|O|K|ASIA|ASIANDOLLAR|HONGKONG
EDDE|98|A||M|AZ|ON||AFRICA|AFDOLLAR|CAPETOWN
stackoverflow ❱ perl -F'\|' -le '$s=$#F;$e="@F[2..$s-3]";$e=~s/ +/#/g;print join "|", @F[0..1],$e,join "|",@F[$s-2,$s-1,$s]' file2
NAME|NUM|WEB|LOCATION|CURRENCY|PLACE
ABCD|04|GO#OGLE|EUROPE|EURO|PARIS
XYZE|12|Y#A#HOO|USA|DOLLAR|SEATTLE
LMNO|17|#FACE#B#O#O#K|ASIA|ASIANDOLLAR|HONGKONG
EDDE|98|A#M#AZ#ON#|AFRICA|AFDOLLAR|CAPETOWN
Upvotes: 1
Reputation: 785128
You can use this awk
command:
awk 'BEGIN{FS=OFS="|"} NR==1{n=NF} NF > n {
s=$3; for (i=4; i<=NF-3; i++) {s = s "#" $i; $i=""} $3=s; gsub(/\|{2,}/, "|")} 1' file
NAME | NUM | WEB | LOCATION | CURRENCY | PLACE
ABCD | 04 | GO#OGLE | EUROPE | EURO | PARIS
XYZE | 12 | Y#A#HOO | USA | DOLLAR | SEATTLE
LMNO | 17 | #FACE#B#O#O#K | ASIA | ASIAN DOLLAR | HONGKONG
EDDE | 98 | A##M#AZ#ON# | AFRICA | AF DOLLAR | CAPETOWN
Upvotes: 2