PPK

Reputation: 1427

Awk command to print lines that occur only once in a CSV file

I have a CSV file that may contain duplicate lines. I need an awk command that prints only the lines that occur exactly once in the file.

Example input file:

a,b
a,c
a,d
a,b
a,c
b,e
b,f
b,d
b,f
b,e

Output:

a,d
b,d

Thank you for your help.

Upvotes: 1

Views: 917

Answers (3)

Naibin Duan

Reputation: 3

Three methods to print the BLAST contigs that occur only once:

awk 'NF>4' valsidate_1k_vs_gdd13 | grep Chr | awk '{arr[$1]++} END {for (i in arr) if (arr[i] == 1) print i}'

awk 'NF>4' valsidate_1k_vs_gdd13 | grep Chr | cut -f 1 | sort | uniq -u

awk 'NF>4' valsidate_1k_vs_gdd13 | grep Chr | cut -f 1 | sort | uniq -c | grep ' 1 Chr'

Upvotes: 0

Akshay Hegde

Reputation: 16997

Using awk:

awk '{arr[$0]++}END{for(i in arr)if(arr[i]==1)print i}' file

Using sort and uniq:

$ sort file | uniq -u # -u prints only lines that are not repeated; -d prints only the repeated ones
a,d
b,d

Test Results:

$ cat file
a,b
a,c
a,d
a,b
a,c
b,e
b,f
b,d
b,f
b,e

$ awk '{arr[$0]++}END{for(i in arr)if(arr[i]==1)print i}' file
a,d
b,d

Explanation:

  • arr[$0]++: $0 is the current line (record) and serves as the key of the array arr. arr[$0]++ increments the count stored under that key, so each time awk reads a line it has already seen, that line's count goes up by one.

  • In the END block, loop over the array and print every key whose count equals one.
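
One caveat with the one-pass command: awk visits the keys of for (i in arr) in an unspecified order, so the output may not follow the input order. If order matters, a two-pass variant is one option. This is only a sketch, and print_once is a hypothetical helper name used for illustration. It reads the same file twice, so it needs a regular file rather than piped input.

```shell
# print_once is a hypothetical helper name, not part of awk.
# Pass 1 (NR == FNR is true only while reading the first copy): count each line.
# Pass 2: print each line whose count is exactly 1, in input order.
print_once() {
    awk 'NR == FNR { count[$0]++; next } count[$0] == 1' "$1" "$1"
}
```

Calling print_once file on the sample input above should print a,d followed by b,d, in the same order as they appear in the file.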

Upvotes: 2

RomanPerekhrest

Reputation: 92884

The shortest approach, using the uniq command (the <(...) process substitution requires bash, ksh, or zsh):

uniq -u <(sort file)
  • -u - print only the lines that are not repeated in the input

The output:

a,d
b,d
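
The sort step matters because uniq compares only adjacent lines. The sketch below contrasts the two cases; once_unsorted and once_sorted are hypothetical helper names introduced for illustration.

```shell
# Both function names are hypothetical, used only for this comparison.

# uniq -u alone: only adjacent lines are compared, so duplicates that are
# not next to each other are NOT detected.
once_unsorted() {
    uniq -u "$1"
}

# sort first so identical lines become adjacent, then keep the unrepeated ones.
once_sorted() {
    sort "$1" | uniq -u
}
```

On the question's input no two identical lines are adjacent, so once_unsorted file would print all ten lines, while once_sorted file prints only a,d and b,d.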

Upvotes: 1
