dmeu

Reputation: 4052

Count unique elements in a file per line

Let's say I have a file with 5 elements on each line.

$ cat myfile.txt

e1 e2 e3 e4 e5
e1 e1 e2 e2 e1
e1 e1 e4 e4 e4

For each line I want to run the following command to count the unique elements on that line:

tr \\t \\n | sort -u | wc 

I can't figure out the first part of the command - can somebody help me?

Disclaimer: The file actually looks like the listing below, but I run it through xargs -L 5 to get the output shown in the first part (see the sketch after the listing).

e1
e2
e3
e4
e5 
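
A minimal sketch of that regrouping step, assuming the rest of the one-element-per-line file follows the same pattern: xargs -L 5 (with its default echo) joins every five input lines into one output line, reproducing the layout of the first listing.

$ xargs -L 5 < myfile.txt
e1 e2 e3 e4 e5
e1 e1 e2 e2 e1
e1 e1 e4 e4 e4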

Upvotes: 0

Views: 924

Answers (3)

Faiz

Reputation: 16245

Here's a perl version if you fancy one:

perl -F'\s' -pane '%H=map{$_=>1}@F; $_=keys(%H)."\n"' myfile.txt
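
Assuming the one-liner behaves as written (autosplit each line on whitespace into @F, deduplicate the fields through the hash %H, then replace the line with the key count), the expected output for the sample myfile.txt above is one count per line:

$ perl -F'\s' -pane '%H=map{$_=>1}@F; $_=keys(%H)."\n"' myfile.txt
5
2
2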

Upvotes: 1

Vijay

Reputation: 67211

You can use this:

perl -F -lane '$count{$_}++ for (@F);print scalar values %count;undef %count' your_file

Tested below:

> cat temp
e1 e2 e3 e4 e5
e1 e1 e2 e2 e1
e1 e1 e4 e4 e4
> perl -F -lane '$count{$_}++ for (@F);print scalar values %count;undef %count' temp
5
2
2
>

Upvotes: 1

Chris Seymour

Reputation: 85775

Given your input file:

$ cat file
e1 e2 e3 e4 e5
e1 e1 e2 e2 e1
e1 e1 e4 e4 e4

Unique elements in the file using awk:

$ awk '{for(i=1;i<=NF;i++) a[$i]} END{for (keys in a) print keys}' file
e1
e2
e3
e4
e5

Unique elements in the file using grep instead of tr:

$ grep -Eo '\w+' file | sort -u
e1
e2
e3
e4
e5
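
As a small follow-up, the same grep output can feed the sort -u | wc -l idea from the question to get just the file-wide count of unique elements (5 for the sample file):

$ grep -Eo '\w+' file | sort -u | wc -l
5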

Unique elements per line in the file:

Using awk:

$ awk '{for(i=1;i<=NF;i++) a[$i]; print length(a); delete a}' file
5
2
2

The awk solutions really are the way to go here, but since you tagged bash, here is a bash loop:

#!/bin/bash

while read -r line; do
  echo "$line" | grep -Eo '\w+' | sort -u | wc -l
done < file

Output:

5
2
2

Upvotes: 2
