Jeni

Reputation: 958

Sort a file preserving the header as first position with bash

When sorting a file, I am not preserving the header in its position:

file_1.tsv

Gene   Number  
a       3
u       7
b       9

sort -k1,1 file_1.tsv

Result:

a     3
b     9
Gene  Number
u     7

So I am trying this code:

sed '1d' file_1.tsv | sort -k1,1 > file_1_sorted.tsv 
first='head -1 file_1.tsv' 
sed '1 "$first"' file_1_sorted.tsv

I removed the header, sorted the rest of the file, and then tried to add the header back. I can't get this last part to work, so I would like to know how to copy the header of the original file and insert it as the first row of the new file without overwriting its actual first row.

Upvotes: 3

Views: 2238

Answers (5)

Ed Morton

Reputation: 203635

This will work using any awk, sort, and cut in any shell on every UNIX box. It works whether the input comes from a pipe (which you can't read twice) or from a file (which you can), and it doesn't involve awk spawning a subshell:

awk -v OFS='\t' '{print (NR>1), $0}' file | sort -k1,1n -k2,2 | cut -f2-

The above uses awk to stick a 0 at the front of the header line and a 1 in front of every other line (the value of NR>1), so you can sort by that number first, then by whatever other field(s) you want, and finally remove the added field again with cut. Here it is in stages:

$ awk -v OFS='\t' '{print (NR>1), $0}' file
0   Gene   Number
1   a       3
1   u       7
1   b       9

$ awk -v OFS='\t' '{print (NR>1), $0}' file | sort -k1,1n -k2,2
0   Gene   Number
1   a       3
1   b       9
1   u       7

$ awk -v OFS='\t' '{print (NR>1), $0}' file | sort -k1,1n -k2,2 | cut -f2-
Gene   Number
a       3
b       9
u       7
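
Because the pipeline reads its input only once, it can also be wrapped in a function that takes the data on stdin. This is a minimal sketch: the function name sort_keep_header is hypothetical, and the sample data is assumed to be tab-separated like file_1.tsv in the question.

```shell
# Hypothetical helper: sort stdin on its first column, keeping line 1 in place.
# Step 1: awk prefixes 0 to the header and 1 to every other line.
# Step 2: sort orders by that prefix first, then by the original first column.
# Step 3: cut drops the prefix again.
sort_keep_header() {
  awk -v OFS='\t' '{print (NR>1), $0}' |
    sort -k1,1n -k2,2 |
    cut -f2-
}

# Sample input taken from the question, fed through a pipe.
printf 'Gene\tNumber\na\t3\nu\t7\nb\t9\n' | sort_keep_header
```

The printf call stands in for any command that produces the table, which is the case where reading the file twice is not an option.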

Upvotes: 2

Philippe

Reputation: 26592

You can also do this:

{ head -1; sort; } < file_1.tsv

** Update **

For macOS:

{ IFS= read -r header; printf '%s\n' "$header" ; sort; } < file_1.tsv

Upvotes: 5

karakfa

Reputation: 67507

A simpler awk:

$ awk 'NR==1{print; next} {print | "sort"}' file

Upvotes: 4

RavinderSingh13

Reputation: 133528

Could you please try the following:

awk '
FNR==1{
  first=$0
  next
}
{
  val=(val?val ORS:"")$0
}
END{
  print first
  print val | "sort"
}
'  Input_file

Logical explanation:

  • The condition FNR==1 checks whether this is the first line; if so, the line is saved in the variable first and next skips to the following line.
  • Every other line is appended to the variable val, separated by newlines (ORS), until the last line is reached.
  • The END block runs once Input_file has been read completely; it prints the saved first line and then pipes the accumulated lines in val to sort.

Upvotes: 2

James Brown

Reputation: 37404

$ head -1 file; tail -n +2 file | sort

Output:

Gene   Number  
a       3
b       9
u       7
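
To save the result in a new file, as the question asks, both commands need to share one redirection; otherwise only the output of sort would be captured. A minimal sketch, assuming tab-separated sample data like file_1.tsv and a temporary directory created just for the illustration:

```shell
# Assumed sample data matching file_1.tsv from the question.
tmp=$(mktemp -d)
printf 'Gene\tNumber\na\t3\nu\t7\nb\t9\n' > "$tmp/file_1.tsv"

# Group both commands in braces so a single redirection captures
# the header and the sorted body together.
{ head -n 1 "$tmp/file_1.tsv"; tail -n +2 "$tmp/file_1.tsv" | sort -k1,1; } > "$tmp/file_1_sorted.tsv"

cat "$tmp/file_1_sorted.tsv"
```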

Upvotes: 3
