Reputation: 9496
I have a dataset like this:
[email protected],2009-11-27
[email protected],2009-11-27
[email protected],2009-11-27
When I run this command to remove the duplicate entries in column 2:
sort -t ',' -k2 stars.txt -u
it removes duplicates based on column 1 instead, and to remove the duplicate entries of column 2 I have to pass the -k3 flag:
sort -t ',' -k3 stars.txt -u
Can anyone explain why this is happening? Why do I have to add 1 to the column number to deduplicate on that column?
Upvotes: 1
Views: 585
Reputation: 195229
This is a typical awk job; no sorting is needed. Here is a short one-liner, in case you want to give it a try.
awk -F, '!a[$2]++' file
That will do the job.
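To spell out how the one-liner works: the array is keyed on the second field, the post-increment returns the previous count, and the negation is therefore true only the first time a value is seen, so awk's default action prints just that line. Here is a minimal sketch, assuming made-up sample data (the addresses, the second date, and the function name `dedup_by_col2` are placeholders, not taken from the question):

```shell
# Hypothetical sample data; addresses and dates are placeholders.
sample='alice@example.com,2009-11-27
bob@example.com,2009-11-27
carol@example.com,2009-11-28'

# Print a line only the first time its second comma-separated field is seen.
# seen[$2]++ evaluates to the old count (0 on first sight), so !0 is true.
dedup_by_col2() {
  awk -F, '!seen[$2]++'
}

printf '%s\n' "$sample" | dedup_by_col2
```

Unlike `sort -u`, this keeps the input order and always retains the first line for each distinct value.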
Upvotes: 1
Reputation: 64613
On my system everything works correctly:
$ sort -t, -k1 -u 1.txt
[email protected],2009-11-27
[email protected],2009-11-27
$ sort -t, -k2 -u 1.txt
[email protected],2009-11-27
It may be due to your locale. Can you please repeat the commands with LANG=C?
$ LANG=C sort -t, -k1 -u 1.txt
$ LANG=C sort -t, -k2 -u 1.txt
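While testing, it may also help to make the key explicit. A bare `-k2` defines a key that runs from field 2 to the end of the line, whereas `-k2,2` restricts it to field 2 alone; and `LC_ALL`, when set, overrides `LANG`. The following is only a sketch under those assumptions, using placeholder data and a hypothetical helper name `unique_dates`, not a diagnosis of the original problem:

```shell
# Hypothetical sample data; addresses and dates are placeholders.
sample='alice@example.com,2009-11-27
bob@example.com,2009-11-27
carol@example.com,2009-11-28'

# -t,     use a comma as the field separator
# -k2,2   compare on field 2 only (start and end of the key are both field 2)
# -u      keep one line per distinct key
# LC_ALL=C forces byte-wise comparison regardless of LANG
unique_dates() {
  LC_ALL=C sort -t, -k2,2 -u
}

printf '%s\n' "$sample" | unique_dates
```

This should print one line per distinct date. Which of the lines sharing a date survives is up to the sort implementation, so use the awk approach if the first occurrence must be kept.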
Upvotes: 2