shantanuo

Reputation: 32296

awk changes the text unexpectedly

I am using the following awk statement in my shell script.

#!/bin/sh
# read file line by line
file="/pdump/country.000000.txt"
while read line
do
mycol=`echo $line | awk -F"," '{print $2}'`
mycol_new=`echo $mycol | tr "[:lower:]" [:upper:]`
echo $line | awk -v var="$mycol_new" -F"," '{print $1 "," var "," $3 "," $4 "," $5 "," $6 "," $7 "," $8}'
done < $file

It is working as expected.

The only problem is that if the original text is \N (backslash N) in any other column, e.g. $4 or $7, it changes to N (without the backslash). How do I preserve the original values while replacing only the second column?

Upvotes: 1

Views: 95

Answers (3)

Hai Vu

Reputation: 40688

If I read your code correctly, you are trying to:

  1. Read input from a comma-separated-values (CSV) file
  2. Change the second field to uppercase
  3. Print the result.

If that is the case, use AWK directly. Save the following to toupper_second_field.awk:

BEGIN { FS = ","; OFS="," }
{ $2 = toupper($2); print }

The first line sets the field separators for both input (FS) and output (OFS) to a comma. The second converts field 2 to uppercase, then prints the record. To invoke it:

awk -f toupper_second_field.awk /pdump/country.000000.txt

The logic is much simpler and you don't have to worry about backslashes.
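As a quick check, here is a minimal sketch that runs the same logic inline on a small sample. The sample rows and the temporary file are made up for illustration; the real input is /pdump/country.000000.txt.

```shell
#!/bin/sh
# Hypothetical sample data; printf '%s\n' writes the backslashes literally.
sample=$(mktemp)
printf '%s\n' '1,india,x,\N,y,z,\N,w' '2,nepal,a,b,c,d,e,f' > "$sample"

# Same logic as toupper_second_field.awk, written as a one-liner.
result=$(awk 'BEGIN { FS = ","; OFS = "," } { $2 = toupper($2); print }' "$sample")

printf '%s\n' "$result"
# 1,INDIA,x,\N,y,z,\N,w
# 2,NEPAL,a,b,c,d,e,f

rm -f "$sample"
```

Because awk reads the file itself, the lines never pass through read or echo, so the \N values in the other columns come out unchanged.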

Upvotes: 0

Ruchi

Reputation: 679

awk strips out the backslash if it's not part of a recognized escape sequence, but it does this in string literals and in values assigned with -v, not in the input fields it reads. So \n would be recognized as a newline, while \N is simply interpreted as N. More details here
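A small sketch of where awk does and does not interpret escape sequences. It uses \t rather than \N, since the handling of unrecognized sequences can differ between awk implementations; the variable names are made up for illustration.

```shell
#!/bin/sh
# A literal backslash followed by t (not a tab character).
text='a\tb'

# Assigned with -v, awk processes the escape: \t becomes a real tab.
via_v=$(awk -v var="$text" 'BEGIN { print var }')

# Read as input data, the backslash is left alone.
via_input=$(printf '%s\n' "$text" | awk '{ print $0 }')

# Passed through the environment, the value is also taken literally.
via_env=$(var="$text" awk 'BEGIN { print ENVIRON["var"] }')

printf '%s\n' "$via_v" "$via_input" "$via_env"
```

If a value that may contain backslashes has to reach awk from the shell, ENVIRON is one way to avoid the escape processing that -v applies.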

Upvotes: 0

Dennis Williamson

Reputation: 359895

You need to use the -r option for read in your while loop:

while read -r line

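Without -r, read treats a backslash as an escape character and removes it, which is what turns \N into N. With -r, backslashes in the input are preserved. That option should almost always be used. Make it a habit.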
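The difference is easy to see with a minimal sketch. The sample row and temporary file are hypothetical stand-ins for the real data, and printf '%s\n' is used in place of echo because some sh implementations let echo interpret backslash sequences as well.

```shell
#!/bin/sh
# Hypothetical sample file standing in for /pdump/country.000000.txt.
file=$(mktemp)
printf '%s\n' '1,india,x,\N,y,z,\N,w' > "$file"

# Without -r, read treats the backslash as an escape character and drops it.
without_r=$(while read line; do printf '%s\n' "$line"; done < "$file")

# With -r (and IFS= to keep leading and trailing whitespace), the line is unchanged.
with_r=$(while IFS= read -r line; do printf '%s\n' "$line"; done < "$file")

printf 'read:    %s\n' "$without_r"   # read:    1,india,x,N,y,z,N,w
printf 'read -r: %s\n' "$with_r"      # read -r: 1,india,x,\N,y,z,\N,w

rm -f "$file"
```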

Upvotes: 2
