Vicky

Reputation: 1328

Sort command strange behaviour

Input file: salary.txt

1 rob   hr 10000
2 charls it 20000
4 kk  Fin 30000
5 km  it 30000
6 kl  it 30000
7 mark  hr 10000
8 kc  it 30000
9 dc  fin 40000
10 mn  hr  40000
3 abi  it 20000

Objective: find all records with the second-highest salary, where the 4th column is the salary (space-separated records).

I ran two similar commands, but the output is entirely different. What am I missing?

Command1 :

sort -nr -k4,4 salary.txt | awk '!a[$4]{a[$4]=$4;t++}t==2'

output:

8 kc  it 30000
6 kl  it 30000
5 km  it 30000
4 kk  Fin 30000

command2:

 cat salary.txt | sort -nr -k4,4 | awk '!a[$4]{a[$4]=$4;t++}t==2'  salary.txt

output:

2 charls it 20000

The only difference between the two commands is the way salary.txt is read, so why is the output entirely different?

Upvotes: 1

Views: 60

Answers (2)

Danilo Fiorenzano

Reputation: 159

Because in the second form awk reads directly from salary.txt, which you are passing as the name of the input file, and ignores the output from sort that arrives on stdin. Leave out the final salary.txt in command2 and you'll see that the output matches that of command1. In fact, sort behaves the same way, and the forms:

  • cat salary.txt | sort
  • echo "string that will be ignored" | sort salary.txt

will both yield the exact same output.
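To see this for yourself, here is a minimal sketch that recreates the data from the question in a temporary file (the variable names `tmp`, `with_operand` and `from_stdin` are only illustrative) and runs the awk program both with and without the trailing file operand:

```shell
# Recreate the sample data from the question in a temporary file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
1 rob   hr 10000
2 charls it 20000
4 kk  Fin 30000
5 km  it 30000
6 kl  it 30000
7 mark  hr 10000
8 kc  it 30000
9 dc  fin 40000
10 mn  hr  40000
3 abi  it 20000
EOF

# awk is given a file operand, so it reads the unsorted file
# and never looks at the sorted data arriving on stdin.
with_operand=$(sort -nr -k4,4 "$tmp" | awk '!a[$4]{a[$4]=$4;t++}t==2' "$tmp")

# No file operand: awk reads the sorted stream from the pipe.
from_stdin=$(sort -nr -k4,4 "$tmp" | awk '!a[$4]{a[$4]=$4;t++}t==2')

echo "with file operand:"
echo "$with_operand"
echo "from stdin:"
echo "$from_stdin"

rm -f "$tmp"
```

With the file operand, awk walks the file in its original order, where the second distinct salary it meets is 20000 on the "2 charls" line, and the very next line introduces a third distinct value, so only that one record is printed.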

Upvotes: 2

twin

Reputation: 1669

In your second command, awk does not read from stdin, because you also pass salary.txt as a file argument. If you change it to

cat salary.txt | sort -nr -k4,4 | awk '!a[$4]{a[$4]=$4;t++}t==2'

you get the same result as the first command.
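If you would rather make the stdin read explicit, POSIX awk also accepts a lone `-` operand as a name for standard input. A small sketch, using a four-record sample that is already in descending salary order (the `data`, `implicit` and `explicit` names are just for illustration):

```shell
# Illustrative sample, already sorted by salary in descending order.
data='9 dc fin 40000
4 kk Fin 30000
5 km it 30000
2 charls it 20000'

# No operand: awk reads the pipe.
implicit=$(printf '%s\n' "$data" | awk '!a[$4]{a[$4]=$4;t++}t==2')

# A lone "-" operand names standard input explicitly.
explicit=$(printf '%s\n' "$data" | awk '!a[$4]{a[$4]=$4;t++}t==2' -)

echo "$implicit"
echo "$explicit"
```

Both forms print the two 30000 records, since in each case awk consumes the piped data rather than a named file.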

Upvotes: 1
