user1228191

Reputation: 681

Remove lines that contain a repeated regular expression in Perl

I have an array that contains elements like:

@array = qw/ john jim rocky hosanna/;

INPUT FILE:

john wears blue shirt 

hosanna knows drawing

george and jim went to europe

john went to swimming

jim wears yellow shirt

rocky went to swimming

rocky learns painting

hosanna learns painting

REQUIRED OUTPUT:

john wears blue shirt 

hosanna knows drawing

george and jim went to europe

rocky went to swimming

So I need to keep only the lines in which a name from the array occurs for the first time.

Upvotes: 1

Views: 401

Answers (4)

Sidharth C. Nadhan

Reputation: 2253

perl -ane 'print unless $a{$F[0]}++ ' inputfile

Hope this works.
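Note that the one-liner keys on the first field of each line only, so on the sample input it would still print "jim wears yellow shirt" (jim first appears mid-line, after george). A minimal sketch that looks for the listed names anywhere on the line follows. It assumes whole-word matches are wanted and that lines containing none of the names are dropped; the helper name first_occurrences is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: keep a line only if it contains at least one
# listed name that has not appeared on an earlier line.
sub first_occurrences {
    my ($names, @lines) = @_;

    # Build one alternation from the names, escaping any metacharacters.
    my $alt = join '|', map { quotemeta } @$names;
    my $re  = qr/\b($alt)\b/;

    my %seen;
    my @out;
    for my $line (@lines) {
        # All listed names that occur on this line.
        my @hits = $line =~ /$re/g;

        # Count the names that are new, marking every hit as seen.
        my $new = grep { !$seen{$_}++ } @hits;

        push @out, $line if $new;
    }
    return @out;
}
```

Called as print first_occurrences(\@array, <>); with the file name on the command line, this should keep the four lines listed under REQUIRED OUTPUT for the sample input.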

Upvotes: 1

Birei

Reputation: 36262

One way: I save the array data to a hash and delete each entry once it is found in the input file.

Content of script.pl:

use warnings;
use strict;

## Input names to search.
my @array = qw/ john jim rocky hosanna/;

## Save names to a hash so they are easy to look up.
my %names = map { $_ => 1 } @array;

## Read file line by line.
while ( <> ) { 

    ## Avoid blank lines.
    next if m/\A\s*\Z/;

    ## Split line in fields.
    my @f = split;

    ## Count number of names in hash.
    my $num_entries = scalar keys %names;

    ## Remove words of hash found in line.
    for ( @f ) { 
        delete $names{ $_ };
    }   

    ## If there are now fewer names, the line contained at least one
    ## of them, so print it.
    if ( scalar keys %names < $num_entries ) { 
        printf qq[%s\n], $_; 
    }   

    ## If the hash is empty, there are no lines left to print, so exit
    ## the loop without checking more lines.
    last if scalar keys %names == 0;
}

Command:

perl script.pl infile

Output:

john wears blue shirt 

hosanna knows drawing

george and jim went to europe

rocky went to swimming

Upvotes: 1

DVK

Reputation: 129393

@seen{@array} = ();
@out = grep { my ($w) = split; !$seen{$w}++ } @in;

Upvotes: 4

Perlnika

Reputation: 5066

What about making another array that records whether each name has already been used? The first time you read a line containing jim, mark its entry in this array as used and write the line to the output. If the name was already used, do nothing.

@array = qw(john jim rocky hosanna);
@used  = (0, 0, 0, 0);
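The idea could be sketched roughly like this. It is only one possible reading of the suggestion: the helper name has_unused_name is hypothetical, names are matched as whitespace-separated words, and every name found on a line is marked as used, including names on lines that end up being skipped.

```perl
use strict;
use warnings;

my @array = qw(john jim rocky hosanna);
my @used  = (0) x @array;    # one flag per name, all initially unused

# Hypothetical helper: returns true if the line mentions a name whose
# flag in @used is still 0, and sets the flag of every name it mentions.
sub has_unused_name {
    my ($line) = @_;

    # Words on this line, for quick lookup.
    my %on_line = map { $_ => 1 } split ' ', $line;

    my $found = 0;
    for my $i (0 .. $#array) {
        next unless $on_line{ $array[$i] };
        $found = 1 unless $used[$i];
        $used[$i] = 1;
    }
    return $found;
}
```

A driver loop such as while (<>) { print if has_unused_name($_) } would then write only the lines that introduce a name for the first time.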

Upvotes: 1
