Reputation: 1197
I have a file | delimited, File has 4 columns
123|456|789|234
456|456|789|2345
124|456|789|2345
125|456|789|2345
126|456|789|2345
128|451|782|2345
Output --> 6
How can find the count of distinct records in column1 where col2=456 and col3=789.
I know how to find the distinct of col1 for whole file, but how do we find after applying some filter on other column
Upvotes: 0
Views: 425
Reputation: 26491
A slightly different logic than the previous awk solutions. But than also slightly:
awk -F '|' '($2!=246 || $3!=789){next}!a[$1]++{c++}END{print c}' file
Upvotes: 0
Reputation: 8711
You can try Perl
perl -lne ' /^(.+?)\|(.+?)\|(.+?)\|/ and $2==456 and $3==789 and $kv{$1}++ ; END { print scalar keys %kv } '
with the given inputs
$ cat vijiy.txt
123|456|789|234
456|456|789|2345
124|456|789|2345
125|456|789|2345
126|456|789|2345
128|451|782|2345
$ perl -lne ' /^(.+?)\|(.+?)\|(.+?)\|/ and $2==456 and $3==789 and $kv{$1}++ ; END { print scalar keys %kv } ' vijiy.txt
5
$
Upvotes: 0
Reputation: 37424
Using grep
'n cut
'n sort
'n wc
'n a bunch of pipes:
$ grep ^[^\|]*\|456\|789\| file | cut -d \| -f 1-3 | sort -u | wc -l
5
Demonstrated:
$ grep ^[^\|]*\|456\|789\| file # search for ^...|456|789|
123|456|789|234
456|456|789|2345
124|456|789|2345
125|456|789|2345
126|456|789|2345
and pipe it to
cut -d \| -f 1-3 # to lose the last field
123|456|789
456|456|789
124|456|789
125|456|789
126|456|789
and on to
sort -u # sort and get unique records
123|456|789
124|456|789
125|456|789
126|456|789
456|456|789
and finally
wc -l # count them
5
Super Mario's daily bread:
|
Upvotes: 1
Reputation: 189648
Here is a simple Awk script.
awk -F '|' '$2=="456" && $3=="789" { if (!a[$1]++) count++ }
END { print count }' file
Your sample data contains 5 distinct values in column 1 with these constraints.
If you want to pass in the values as parameters,
c2="456"
c3="789"
awk -F '|' -v col2="$c2" -v col3="$c3" '$2==col2 && $3==col3 {
if (!a[$1]++) count++ }
END { print count }' file
The array a
collects the keys from column 1 which we have already seen. If the array doesn't contain the value of the first column already, we add 1 to count
.
Upvotes: 1
Reputation: 133610
Could you please try following.
awk -F'|' '$2==456 && $3==789 && !a[$1,$2,$3]++{count++} END{print count}' Input_file
Upvotes: 1