theo4786

Reputation: 159

count number of matching fields in each line

I have a file in which each line contains some number of fields equal to "./.". A line looks something like this:

chrM    57  .   T   C   4848.99 GT:AD:DP:GQ:PL  ./. 1/1:1,149:150:99:4903,439,0 0/0:202,0:202:99:0,541,6030 0/0:249,1:250:99:0,646,7558 0/0:249,1:250:99:0,647,7484 0/0:111,0:111:99:0,304,3346 0/0:171,0:172:99:0,397,4599 0/0:118,0:118:99:0,340,3827 0/0:247,0:247:99:0,650,7312 0/0:218,0:219:99:0,611,6728 0/0:242,0:242:99:0,686,7589 0/0:250,0:250:99:0,689,7599 0/0:144,0:144:99:0,409,4608 0/0:250,0:250:99:0,680,7585 0/0:141,3:144:99:0,321,4233 0/0:71,0:71:99:0,205,2260   0/0:204,0:205:99:0,568,6312 ./. 0/0:191,0:191:99:0,523,5874 0/0:249,0:250:99:0,665,7443 0/0:142,0:143:99:0,340,3991 0/0:218,0:218:99:0,575,6612 0/0:247,0:247:99:0,665,7412 0/0:250,0:250:99:0,692,7768 0/0:250,0:250:99:0,689,7749 0/0:247,2:249:99:0,674,7574

I would like to count the number of fields exactly matching "./." in each line and print that count for each line. I believe I could do something like the code below, but the code doesn't work (I am new to Perl). I think there should be an easier solution in awk.

#! perl -w  

my$F=shift@ARGV;
open IN, "$F";
while(<IN>){
    $num1++ while ($string1 =~ m/\.\/\./g);
    print "The first line has $num1\n";
    next;
}

Upvotes: 0

Views: 160

Answers (4)

Kenosis

Reputation: 6204

Here's another option:

use strict;
use warnings;

while (<>) {
    print "Line $. has ", ( split m|\./\.| ) - 1, "\n";
}

Usage: perl script.pl dataFile [>outFile]

The brackets indicate an optional parameter you can use to send output to a file.

The script splits each line on the pattern that you want to match. A line containing n matches splits into n+1 pieces, so the number of pieces minus one is the number of "./." occurrences. On your sample line, it prints:

Line 1 has 2

Hope this helps!

Upvotes: 2

hwnd

Reputation: 70732

You could do:

perl -nE '$count = () = m{\./\.}g; say "Line $. has $count";' file


Upvotes: 4

ate50eggs

Reputation: 454

You need to assign each line to a variable as you read it (your loop matches against $string1, which is never set). The match-count syntax in Perl is a little strange too:

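use strict;
use warnings;

my $F = shift @ARGV;
open my $in, '<', $F or die "Cannot open $F: $!";
# qr{} keeps the escaped dots; in double quotes "\.\/\." collapses to "./.",
# where each dot would match any character.
my $s = qr{\./\.};
while (my $string1 = <$in>) {
    # Assigning the match list to () in scalar context yields the match count.
    my $num1 = () = $string1 =~ m/$s/g;
    print "foo: $num1 $string1\n";
}
close $in;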

Upvotes: 1

John1024

Reputation: 113924

In awk:

$ awk '{c=0; for (i=1;i<=NF;i++) c+=($i=="./."); printf "Line %s has %s\n",NR,c+0;}' file
Line 1 has 2

How it works

By default, awk splits each record (line) into fields on whitespace. We loop through all the fields and test each one for equality with "./.".

  • c=0

    Set the count to zero.

  • for (i=1;i<=NF;i++) c+=($i=="./.")

    Increment the count c by one every time a field exactly matches "./.".

    $i is the content of the i-th field. $i=="./." evaluates to 1 if the field exactly matches "./." and to 0 otherwise. Thus, c+=($i=="./.") increments c by one for each matching field.

  • printf "Line %s has %s\n",NR,c+0

    Print the results for this line.
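
The steps above can be tried on a small made-up input. This is only a sketch: it assumes whitespace-separated fields (awk's default), and count_missing is a hypothetical function name that wraps the same command so it can read a file or standard input:

```shell
# count_missing is a hypothetical helper name, not part of awk.
count_missing() {
    awk '{
        c = 0                          # reset the count for each line
        for (i = 1; i <= NF; i++)
            c += ($i == "./.")         # the comparison yields 1 or 0
        printf "Line %s has %s\n", NR, c
    }' "$@"
}

# Made-up input: "x./.y" contains ./. but is not an exact field match.
printf '%s\n' './. 0/0:1 ./.' 'x./.y ./.' '0/0:2 0/0:3' | count_missing
```

This prints "Line 1 has 2", "Line 2 has 1" and "Line 3 has 0": the field x./.y is skipped because the comparison is on the whole field rather than a substring.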

Upvotes: 1
