Reputation: 958
Imagine I have this file in bash:
1 3 6 name1
1 2 7 name2
3 4 2 name1
2 2 2 name3
7 8 2 name2
1 2 9 name4
How can I extract only the lines whose "name" field (the 4th column) appears more than once, and sort them by that field?
My expected output would be:
1 3 6 name1
3 4 2 name1
1 2 7 name2
7 8 2 name2
I was trying to use sort -k4,4 myfile | uniq -D, but I can't find how to tell uniq to work with the 4th column.
Thanks!
Upvotes: 2
Views: 175
Reputation: 50750
You were close. uniq has to skip the three fields preceding the last one, which is what -f3 does (note that -D is a GNU extension).
$ sort -k4 file | uniq -f3 -D
1 3 6 name1
3 4 2 name1
1 2 7 name2
7 8 2 name2
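For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the sample data from the question in a temporary file (the variable names tmp and result are just placeholders) and runs the same pipeline. It uses -k4,4 as in the question so that the sort key is limited to field 4; with this data, plain -k4 gives the same result. It assumes GNU uniq, since -D is not in POSIX.

```shell
# Recreate the sample data from the question in a temporary file.
tmp=$(mktemp)
printf '%s\n' '1 3 6 name1' '1 2 7 name2' '3 4 2 name1' \
              '2 2 2 name3' '7 8 2 name2' '1 2 9 name4' > "$tmp"

# sort -k4,4 : group lines by the 4th field only
# uniq -f3   : ignore the first three fields when comparing adjacent lines
# uniq -D    : print every line of each duplicated group (GNU extension)
result=$(sort -k4,4 "$tmp" | uniq -f3 -D)
printf '%s\n' "$result"

rm -f "$tmp"
```

Keep in mind that -f3 compares everything after the third field, so on input with more than four columns the remaining columns would take part in the comparison too.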
Upvotes: 3
Reputation: 133518
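Could you please try the following.
awk '
{
    a[$NF]++                                  # count occurrences of the last field
    b[$NF] = (b[$NF] ? b[$NF] ORS : "") $0    # collect the lines of each group
}
END {
    for (i in a) {                            # iteration order is unspecified
        if (a[i] > 1) {
            print b[i]
        }
    }
}
' Input_file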
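Or, in case you want the output sorted, try the following:
awk '
{
    a[$NF]++                                  # count occurrences of the last field
    b[$NF] = (b[$NF] ? b[$NF] ORS : "") $0    # collect the lines of each group
}
END {
    for (i in a) {
        if (a[i] > 1) {                       # keep only repeated names
            print b[i]
        }
    }
}
' Input_file | sort -k4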
Upvotes: 2
Reputation: 785146
You may use this awk + sort:
awk 'FNR==NR{freq[$NF]++; next} freq[$NF] > 1' file{,} | sort -k4
1 3 6 name1
3 4 2 name1
1 2 7 name2
7 8 2 name2
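In the command above, file{,} is bash brace expansion for "file file", so awk reads the same file twice: the first pass (where FNR==NR holds) only counts names, and the second pass prints the lines whose name was counted more than once. A minimal sketch of the same idea with the file name written out twice, so that it does not depend on brace expansion, might look like this (tmp and result are placeholder names, and the sample data is the one from the question):

```shell
# Recreate the sample data from the question in a temporary file.
tmp=$(mktemp)
printf '%s\n' '1 3 6 name1' '1 2 7 name2' '3 4 2 name1' \
              '2 2 2 name3' '7 8 2 name2' '1 2 9 name4' > "$tmp"

# Pass 1 (FNR==NR is true only while reading the first file argument):
#   count how often each last field occurs, then skip to the next line.
# Pass 2 (second copy of the same file):
#   print only the lines whose last field was counted more than once.
result=$(awk 'FNR==NR { freq[$NF]++; next } freq[$NF] > 1' "$tmp" "$tmp" | sort -k4,4)
printf '%s\n' "$result"

rm -f "$tmp"
```

Because the file is read twice, this approach needs a regular file; it would not work on input coming from a pipe.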
Upvotes: 1