Reputation: 1328
Input file: salary.txt
1 rob hr 10000
2 charls it 20000
4 kk Fin 30000
5 km it 30000
6 kl it 30000
7 mark hr 10000
8 kc it 30000
9 dc fin 40000
10 mn hr 40000
3 abi it 20000
Objective: find all records with the second-highest salary, where the fourth column is the salary (records are space-separated).
I ran two similar commands but the output is entirely different. What am I missing?
Command 1:
sort -nr -k4,4 salary.txt | awk '!a[$4]{a[$4]=$4;t++}t==2'
output:
8 kc it 30000
6 kl it 30000
5 km it 30000
4 kk Fin 30000
Command 2:
cat salary.txt | sort -nr -k4,4 | awk '!a[$4]{a[$4]=$4;t++}t==2' salary.txt
output:
2 charls it 20000
The only difference between the two commands is how salary.txt is read, so why is the output entirely different?
Upvotes: 1
Views: 60
Reputation: 159
Because in the second form awk reads directly from salary.txt, which you are passing as the name of the input file, and ignores the output from sort that you are piping to its stdin. Leave out the final salary.txt in command 2 and you'll see that the output matches that of command 1. In fact, sort behaves the same way, and the forms:
cat salary.txt | sort
echo "string that will be ignored" | sort salary.txt
will both yield the exact same output.
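The same behaviour can be checked for awk in isolation. This is a minimal sketch, assuming a POSIX shell with mktemp available; the scratch file and the strings "from file" and "from pipe" are made up for the demonstration and are not part of the question.

```shell
# Hypothetical scratch file, used only for this demonstration.
tmp=$(mktemp)
printf '%s\n' 'from file' > "$tmp"

# A file operand is given, so awk never reads the piped text.
with_file=$(echo 'from pipe' | awk '{print}' "$tmp")

# No file operand: awk falls back to standard input.
without_file=$(echo 'from pipe' | awk '{print}')

echo "$with_file"      # from file
echo "$without_file"   # from pipe

rm -f "$tmp"
```

If you ever want both, a file operand of - tells awk to read standard input at that position in the argument list.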
Upvotes: 2
Reputation: 1669
In your second command, awk does not read from stdin, because salary.txt is given to it as an input file. If you change it to
cat salary.txt | sort -nr -k4,4 | awk '!a[$4]{a[$4]=$4;t++}t==2'
you get the same result as command 1.
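That also explains why command 2 printed the charls row: awk walked the unsorted file, where the second distinct salary it meets is 20000, and by the time it reaches the abi row the counter t is already 3. Here is a rough sketch that reproduces both results, assuming a POSIX shell with mktemp; the scratch file stands in for salary.txt and the variable names are only for illustration.

```shell
# Recreate the question's input in a scratch file (hypothetical path).
salary=$(mktemp)
cat > "$salary" <<'EOF'
1 rob hr 10000
2 charls it 20000
4 kk Fin 30000
5 km it 30000
6 kl it 30000
7 mark hr 10000
8 kc it 30000
9 dc fin 40000
10 mn hr 40000
3 abi it 20000
EOF

# The awk program from the question, unchanged.
prog='!a[$4]{a[$4]=$4;t++}t==2'

# Command 2 as posted: awk reads the unsorted file; sort's output is discarded.
buggy=$(sort -nr -k4,4 "$salary" | awk "$prog" "$salary")

# Corrected: no file operand, so awk consumes the sorted stream.
fixed=$(sort -nr -k4,4 "$salary" | awk "$prog")

echo "$buggy"   # 2 charls it 20000
echo "$fixed"   # the four rows with salary 30000

rm -f "$salary"
```

The order of the four 30000 rows depends on how sort breaks ties, so don't rely on it.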
Upvotes: 1