Reputation: 523
I'm trying to remove leading and trailing space in 3rd column of the below text.txt, and append it at the end.:
text.txt ==> Field1 ~ Field2 ~ Field3 ~ Field4 ~ Field5
1 ABC XYZ ~400 1 ~ 12 A1 ~ TN ~ USA
2 ABC XYZ ~400 1 ~ 12 A1 ~ TN ~ USA
3 ABC XYZ ~400 1 ~ 12 A1 ~ TN ~ USA
4 ABC XYZ ~400 1 ~ 12 A1 B333 ~ TN ~ USA
Expected output:
1 ABC XYZ ~400 1 ~ 12 A1 ~ TN ~ USA~12 A1
2 ABC XYZ ~400 1 ~ 12 A1 ~ TN ~ USA~12 A1
3 ABC XYZ ~400 1 ~ 12 A1 ~ TN ~ USA~12 A1
4 ABC XYZ ~400 1 ~ 12 A1 B333 ~ TN ~ USA~12 A1 B333
However I am getting result as, where the last field all white space got removed including the space which is a word separator.
1 ABC XYZ ~400 1 ~ 12 A1 ~ TN ~ USA~12A1
2 ABC XYZ ~400 1 ~ 12 A1 ~ TN ~ USA~12A1
3 ABC XYZ ~400 1 ~ 12 A1 ~ TN ~ USA~12A1
4 ABC XYZ ~400 1 ~ 12 A1 B333 ~ TN ~ USA~12A1B333
I have used the below awk to remove leading and trailing space in 3rd column but it is not working. What am I missing? Code i am using, and all these are removing the white space which are in-between the string. Here I just need to remove the leading and training space, and not white space between the value.
Code 1:
awk 'BEGIN { FS = "~" }; {fullrow=$0}; {gsub(/ /, "", $3) } ; { print fullrow"~"$3 }' text.txt
Code 2:
awk 'BEGIN { FS = "~" }; {fullrow=$0}; {gsub(/^[[:space:]]+|[[:space:]]+/,"",$3) }; { print fullrow"~"$3 } ' text.txt
Code 3:
awk 'BEGIN { FS = "~" }; {fullrow=$0}; {gsub(/^[ \t]+|[ \t]+/,"",$3) }; { print fullrow"~"$3 } ' text.txt
Please help.
Upvotes: 1
Views: 466
Reputation: 785156
You may use:
awk -F '[[:blank:]]*~[[:blank:]]*' -v OFS='~' '{s=$3; gsub(/[[:blank:]]{2,}/, " ", s); sub(/[[:blank:]]+$/, ""); print $0, s}' file
1 ABC XYZ ~400 1 ~ 12 A1 ~ TN ~ USA~12 A1
2 ABC XYZ ~400 1 ~ 12 A1 ~ TN ~ USA~12 A1
3 ABC XYZ ~400 1 ~ 12 A1 ~ TN ~ USA~12 A1
4 ABC XYZ ~400 1 ~ 12 A1 B333 ~ TN ~ USA~12 A1 B333
To make it more readable:
awk -F '[[:blank:]]*~[[:blank:]]*' -v OFS='~' '{
s=$3
gsub(/[[:blank:]]{2,}/, " ", s)
sub(/[[:blank:]]+$/, "");
print $0, s
}' file
Upvotes: 3
Reputation: 133518
With your shown samples, please try following. Written and tested in GNU awk
.
awk '
BEGIN{
FS=OFS="~"
}
{
val=$3
gsub(/^ +| +$/,"",val)
gsub(/[[:space:]]+/," ",val)
sub(/ +$/,"")
print $0,val
}
' Input_file
Explanation: Adding detailed explanation for above.
awk ' ##Starting awk program from here.
BEGIN{ ##Starting BEGIN section of this program from here.
FS=OFS="~" ##Setting fs and ofs as ~ from here.
}
{
val=$3 ##Setting 3rd field value to val here.
gsub(/^ +| +$/,"",val) ##Globally substituting starting and ending spaces in val with NULL.
gsub(/[[:space:]]+/," ",val)
sub(/ +$/,"") ##Substituting trailing spaces with null in current line.
print $0,val ##Printing current line and val here.
}
' Input_file ##Mentioning Input_file name here.
Output will be as follows.
1 ABC XYZ ~400 1 ~ 12 A1 ~ TN ~ USA~12 A1
2 ABC XYZ ~400 1 ~ 12 A1 ~ TN ~ USA~12 A1
3 ABC XYZ ~400 1 ~ 12 A1 ~ TN ~ USA~12 A1
4 ABC XYZ ~400 1 ~ 12 A1 B333 ~ TN ~ USA~12 A1 B333
Upvotes: 2