Reputation: 1970
Using p.txt:
$cat p.txt
R 3
R 4
S 1
S 2
R 1
T 1
R 3
The following command sorts based on the second column:
$cat p.txt | sort -k2
R 1
S 1
T 1
S 2
R 3
R 3
R 4
The following command removes repeated values in the second column:
$cat p.txt | sort -k2 | awk '!x[$2]++'
R 1
S 2
R 3
R 4
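A self-contained sketch of how the awk deduplication works, recreating p.txt inline:

```shell
# Recreate the sample space-separated file
printf 'R 3\nR 4\nS 1\nS 2\nR 1\nT 1\nR 3\n' > p.txt

# x[$2]++ evaluates to the number of times this second field has
# been seen so far (0 on first sight), so !x[$2]++ is true only
# for the first line carrying each second-field value.
sort -k2 p.txt | awk '!x[$2]++'
# prints:
# R 1
# S 2
# R 3
# R 4
```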
Now, replacing the space with a comma, we have the following file:
$cat p1.csv
R,3
R,4
S,1
S,2
R,1
T,1
R,3
The following command still sorts based on the second column:
$cat p1.csv | sort -t "," -k2
R,1
S,1
T,1
S,2
R,3
R,3
R,4
The following command does NOT produce the correct output:
$cat p1.csv | sort -t "," -k2 | awk '!x[$2]++'
R,1
The correct output should be:
R,1
S,2
R,3
R,4
Any suggestions?
Upvotes: 1
Views: 10876
Reputation: 195039
Well, since you have already used sort, you don't need the awk at all: sort has a -u option for unique output. The cat
is not needed either:
sort -t, -k2 -u p1.csv
should give you the expected output.
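A runnable sketch, recreating p1.csv inline. One caveat: when several lines share the same second field, sort -u may keep a different representative line than the sorted-then-awk pipeline, because it outputs one line per run of key-equal lines:

```shell
# Recreate the sample comma-separated file
printf 'R,3\nR,4\nS,1\nS,2\nR,1\nT,1\nR,3\n' > p1.csv

# -t, sets the field separator, -k2 sorts on the second field,
# and -u keeps only one line per distinct sort key.
sort -t, -k2 -u p1.csv
```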
Upvotes: 5
Reputation: 2524
Well, you don't need all of that; sort
and uniq
are enough:
sort -t "," -k2 p1.csv | uniq -s 2
uniq -s 2
tells uniq to skip the first 2 characters (i.e. everything up to and including the comma) when comparing adjacent lines. This works here because the first field is always exactly one character wide.
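A runnable sketch, recreating p1.csv inline (it relies on the comma always being the second character of each line):

```shell
# Recreate the sample comma-separated file
printf 'R,3\nR,4\nS,1\nS,2\nR,1\nT,1\nR,3\n' > p1.csv

# uniq -s 2 ignores the first two characters ("R," etc.) when
# comparing adjacent lines, so duplicates are detected on the
# second field only.
sort -t, -k2 p1.csv | uniq -s 2
# prints:
# R,1
# S,2
# R,3
# R,4
```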
Upvotes: 4
Reputation: 77095
You need to provide the field separator to awk:
cat p1.csv | sort -t "," -k2 | awk -F, '!x[$2]++'
Upvotes: 1
Reputation: 1723
Try awk -F,
in your last command. So:
cat p1.csv | sort -t "," -k2 | awk -F, '!x[$2]++'
Since your fields are separated by commas, you need to tell awk that the field separator is no longer whitespace but the comma. The -F
option to awk does that.
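A runnable version of the fixed pipeline, recreating p1.csv inline (the cat is not strictly needed, since sort can read the file directly):

```shell
# Recreate the sample comma-separated file
printf 'R,3\nR,4\nS,1\nS,2\nR,1\nT,1\nR,3\n' > p1.csv

# With -F, awk splits fields on commas, so $2 is again the
# number and the dedup idiom works as on the space-separated file.
sort -t, -k2 p1.csv | awk -F, '!x[$2]++'
# prints:
# R,1
# S,2
# R,3
# R,4
```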
Upvotes: 4