mkdir

Reputation: 33

Re-order fields from nth to NF-1 with awk

My problem: I have a pipe-delimited input file and I need to put the last column at first, drop the 2nd, and print from the third to the last-1.

Currently, this works with my 7-field file:

awk 'BEGIN { FS="|"; OFS="|"; } {print $NF,$2,$3,$4,$5,$6}'

But I am looking for something more automatic that works with any number of columns.

I have tried a loop, but it prints each field on a separate line:

awk 'BEGIN { FS="|"; OFS="|"; } {for(i=2;i<=NF-1;++i)print $i}'

It also does not print the first field.

I have tried many other solutions, but no luck so far.

Is there an option I'm missing?

Input :

"PRILYYYTVENIZKEB@XXXX"|2017-09-08T09:46:40.000|"AUDIOTEL"|"Virement +"|25|"50747071"|6440bc7a8f41a96f89ee123159b7eb819a99767c9107b24e9d346eb3835f74a7
"CSRBQDVXJEFPACTKOO@AAA"|2020-02-11T10:02:20.000|"WEB"|"Virement +"|25|"51254683"|cd558b1319595aa63929d8cf3d8213ccc004aac089e6dd3bbad1d595ad010335
"WOGMKZLBHDFPACTKHG@ZZZZ"|2019-07-03T12:00:00.000|"WEB"|"Virement +"|195|"51080106"|f128a559267df0f9a6352fb40f65594aa8f5d01d5c3b90f471ffa0be07739c4d

Expected :

6440bc7a8f41a96f89ee123159b7eb819a99767c9107b24e9d346eb3835f74a7|2017-09-08T09:46:40.000|"AUDIOTEL"|"Virement +"|25|"50747071"
cd558b1319595aa63929d8cf3d8213ccc004aac089e6dd3bbad1d595ad010335|2020-02-11T10:02:20.000|"WEB"|"Virement +"|25|"51254683"
f128a559267df0f9a6352fb40f65594aa8f5d01d5c3b90f471ffa0be07739c4d|2019-07-03T12:00:00.000|"WEB"|"Virement +"|195|"51080106"

(The email in the 2nd field is deleted, and the hash in the last field is moved to the first.)


Global context (maybe a more direct solution is possible):

My goal is to replace the first field with a hash-calculated value of this field.

I use a temporary file to append my calculated field to the end of each line:

while read line
do
        echo -n "$line|"
        echo -n  $line | cut -d'|' -f1 | sed "s/\"//g" | tr -d '\n' | sha256sum | cut -d' ' -f1
done < $f_x_file_name.$f_x_file_extension > $f_x_file_name.hash.$f_x_file_extension ;

Thanks !

Regards

Upvotes: 1

Views: 208

Answers (4)

kvantour

Reputation: 26501

While this is easily implemented in the current situation, I have always wondered why there is no concat function that does the reverse operation of split:

  • split(s, a[, fs ]): Split the string s into array elements a[1], a[2], ..., a[n], and return n. All elements of the array shall be deleted before the split is performed. The separation shall be done with the ERE fs or with the field separator FS if fs is not given. Each array element shall have a string value when created and, if appropriate, the array element shall be considered a numeric string (see Expressions in awk). The effect of a null string as the value of fs is unspecified.

  • concat(a[, ofs ]): Concatenate the array elements a[1], a[2], ..., a[n] with ofs as field separator or OFS if ofs is not given. Numeric string values are converted to strings using CONVFMT. The first n array elements are concatenated, where n is the smallest index such that n+1 in a returns 0.

The implementation of concat would read:

function concat(a,  ofs,  s,i) {
     ofs=(ofs=="" && ofs==0 ? OFS : ofs)
     i=1; while(i in a) { s = s (i==1?"":ofs) a[i]; i++ }
     return s
}

Using this function, you could then easily create an array with elements and assemble it as a string of fields:

BEGIN{FS=OFS="|"}
{ n=split($0,a) }
{ a[2]=a[1]; a[1]=a[n]; delete a[n] }
{ print concat(a) }

See comments below for more information about this.
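As a quick check, the function and the program above can be combined into a single command and run on a short record. This is only a sketch: the sample input a|b|c|d and the shell variable name result are assumptions made for illustration, not part of the original answer.

```shell
# Hypothetical sample record; the awk program is the one from the answer, inlined.
result=$(printf 'a|b|c|d\n' | awk '
function concat(a,  ofs,  s, i) {
    # An unset ofs compares equal to both "" and 0, so fall back to OFS.
    ofs = (ofs == "" && ofs == 0 ? OFS : ofs)
    i = 1
    while (i in a) { s = s (i == 1 ? "" : ofs) a[i]; i++ }
    return s
}
BEGIN { FS = OFS = "|" }
{ n = split($0, a); a[2] = a[1]; a[1] = a[n]; delete a[n]; print concat(a) }
')
echo "$result"
```

With this input the command prints d|a|c: the first field overwrites the second, the last field moves to the front, and the last array element is deleted before joining.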

Upvotes: 0

Walter A

Reputation: 20022

Modify the script where you calculate the hash.

while read -r line
do
   # hash from your command:
   # hash=$(echo -n  $line | cut -d'|' -f1 | sed "s/\"//g" | tr -d '\n' | 
   #        sha256sum | cut -d' ' -f1)
   # Slightly changed
   hash=$(cut -d'|' -f1 <<<"${line}"| tr -d '\n"' | sha256sum | cut -d' ' -f1)
   echo "${hash}|$(cut -d '|' -f2- <<< "${line}")"
done < "$f_x_file_name"."$f_x_file_extension" > "$f_x_file_name".hash."$f_x_file_extension" 

or even easier:

while IFS='|' read -r firstfield otherfields
do
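   # Strip the quotes and the newline added by <<< so the hash matches the original command
   hash=$(tr -d '\n"' <<< "${firstfield}" | sha256sum | cut -d' ' -f1)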
   echo "${hash}|${otherfields}"
done < "$f_x_file_name"."$f_x_file_extension" > "$f_x_file_name".hash."$f_x_file_extension" 

Upvotes: 0

karakfa

Reputation: 67507

Based on the script, not your description, you want:

awk 'BEGIN{FS=OFS="|"} {$1=$NF; NF--}1' file

example:

$ seq 5 | paste -sd'|' | awk 'BEGIN{FS=OFS="|"} {$1=$NF; NF--}1'
5|2|3|4
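The loop attempted in the question can also be made to work: print writes the output record separator after every call, whereas printf writes only what the format asks for, so the fields stay on one line. The following is a sketch of that approach under the same interpretation (last field first, original first field dropped); the variable name result is only for illustration, and records with fewer than two fields are not handled specially.

```shell
# Same input as the seq/paste example, written out literally.
result=$(printf '1|2|3|4|5\n' | awk 'BEGIN { FS = OFS = "|" }
{
    printf "%s", $NF              # last field goes first, no newline yet
    for (i = 2; i < NF; i++)      # fields 2 .. NF-1; the original first field is skipped
        printf "%s%s", OFS, $i
    print ""                      # finish the record with a newline
}')
echo "$result"
```

This prints 5|2|3|4, the same as the NF-- version, without modifying NF.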

Upvotes: 0

Ed Morton

Reputation: 204174

If I understand correctly what you mean by:

put the last column at first, drop the 2nd, and print from the third to the last-1

then a more concise way of saying that would be:

move the first column to the 2nd and move the last column to the first

which would be:

awk 'BEGIN{FS=OFS="|"} {$2=$1; $1=$NF; NF--} 1' file

for example:

$ echo 'a|b|c|d' | awk 'BEGIN{FS=OFS="|"} {$2=$1; $1=$NF; NF--} 1'
d|a|c

Using NF-- to delete the last column is undefined behavior per POSIX. If your awk doesn't support it, just change NF-- to sub(/\|[^|]*$/,"").
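For reference, here is the sub() variant written out in full on the same a|b|c|d sample. It is a sketch rather than a tested drop-in for every awk; the variable name result is only for illustration.

```shell
result=$(echo 'a|b|c|d' | awk 'BEGIN { FS = OFS = "|" }
{
    $2 = $1; $1 = $NF          # assigning to a field rebuilds $0 using OFS
    sub(/\|[^|]*$/, "")        # remove the final separator and the last field from $0
}
1')
echo "$result"
```

This prints d|a|c. The sub() call operates on $0, which already reflects the two field assignments, so only the trailing copy of the last field is removed.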

If I misunderstood what you're trying to do then edit your question to provide concise, testable sample input and expected output.

Upvotes: 2
