Gaurav Pant

Reputation: 4209

Remove a specific array element from another array

Problem - I have two arrays as below.

my @arr1 = qw( jon won don pon );
my @arr2 = qw( son kon bon won kon don pon won pon don won);

I need to remove the first matching element of @arr1 from @arr2, i.e. in the above example I need to remove won from @arr2.

Currently my logic is as below.

#!/usr/bin/perl
my @arr1 = qw( jon won don pon );
my @arr2 = qw( son kon bon won kon don pon won pon don won);
my @remove_indices = ();
my $remove_element;
my $first_remove_index;
OUTER_FOR: for my $i (0..@arr2) {
    $outer_element = $arr2[$i];
    foreach my $innr_element ( @arr1 ) {
        if($innr_element eq $outer_element) {
            push(@remove_indices, $i);
            $first_remove_index = $i;
            $remove_element = $innr_element;
            last OUTER_FOR;
        }
    }
}

for my $i ($first_remove_index+1..@arr2) {
    $outer_element = $arr2[$i];
    if($remove_element eq $outer_element) {
        push(@remove_indices, $i);
    }
}

if (@remove_indices > 0) {
    map {splice (@arr2, $_, 1)} reverse(@remove_indices);
}

print "@arr2";

But this looks like typical C/C++ style logic. I can't use a hash. Is there a more Perl-like way to do the same?

Upvotes: 2

Views: 153

Answers (4)

VolDeNuit

Reputation: 53

Virus, of course you can use a hash for this problem.

It'll actually give you better-bounded CPU usage if one extends the size of one or both of the arrays in your example.

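use strict;
use warnings;

my @arr1 = qw(jon won don pon);
my @arr2 = qw(son kon bon won kon don pon won pon don won);

# Map each word in @arr2 to the list of indices where it occurs.
my $i = 0;
my %h;
for (@arr2) { push @{ $h{$_} }, $i++ }

# Take the first word of @arr1 that occurs in @arr2 and blank out
# every position where it appears.
for my $word (@arr1) {
    if (exists $h{$word}) {
        for (@{ $h{$word} }) {
            $arr2[$_] = '';
        }
        last;
    }
}

# Drop the blanked entries (this assumes the data holds no empty strings).
@arr2 = grep { length } @arr2;
print "@arr2\n";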

I was curious as to the relative efficiency of the different proposed solutions and wrote a test program to try them out. You'll be pleased to know that your program is as good as any when using your test data. But not when the arrays start growing in size!

What follows is a bit mad, I know, but here goes anyway. If you're going to be running your application zillions of times a day, you'll probably benefit from doing some benchmarking on your own hardware. So bear with me :)

Here are relative CPU times for each of the 5 solutions in the order they were posted above (the most economical is shown as "1"). The first result column holds the CPU time using the data used in your example, and the three following columns increase one or both of the two arrays by prefixing them with 50 elements whose contents aren't found in the other.

First array                @arr1     @arr1   @arr1+50  @arr1+50
Second array               @arr2   @arr2+50    @arr2   @arr2+50

Your program                 1         7         3        45
Grep approach 1              1         6         3        43
Grep approach 2              3         9        33       160
Convert to string            2         4         3         6
Using hash                   2         6         2         7

The hash solution is twice as CPU-intensive as yours when run on your test data. But if you extend the second array by 50 elements, the hash is a bit better since your CPU time has now increased to 7 whereas the hash approach has gone from 2 to 6. But if both arrays are larger, your program needs 45 times more CPU-time to finish than it needed for the original data whereas the hash program only needs 3.5 times more (going from 2 to 7).

Obviously all require more CPU time as the arrays grow in size but not in the same proportions, and their slowdown isn't linear either. They can all probably also be tweaked a bit, and I imagine the results would change on different hardware platforms, but these ought to be reasonably indicative of their relative efficiency. Here are the times when the original arrays are grown by 100 elements instead of 50.

First array                @arr1     @arr1  @arr1+100  @arr1+100
Second array               @arr2  @arr2+100    @arr2   @arr2+100

Your program                 1        12         5       173
Grep approach 1              1        12         4       162
Grep approach 2              3        16        68       612
Convert to string            2         7         5        12
Using hash                   2        11         3        12
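
For anyone who wants to try this kind of comparison on their own hardware, here is a minimal sketch using the core Benchmark module. It is not the test program that produced the tables above: the sub names, the filler words in @pad1 and @pad2 and the one-second run time are all made up for illustration, and only two of the approaches are shown.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my @arr1 = qw(jon won don pon);
my @arr2 = qw(son kon bon won kon don pon won pon don won);

# Hypothetical filler: 50 words per list that do not occur in the other list.
my @pad1 = map { "x$_" } 1 .. 50;
my @pad2 = map { "y$_" } 1 .. 50;

# Linear scan: for each element of the list, grep through the keys.
sub remove_with_grep {
    my ($keys, $list) = @_;
    for my $elem (@$list) {
        if (grep { $_ eq $elem } @$keys) {
            return grep { $_ ne $elem } @$list;
        }
    }
    return @$list;
}

# Hash lookup: index the keys once, then test each element in constant time.
sub remove_with_hash {
    my ($keys, $list) = @_;
    my %is_key = map { $_ => 1 } @$keys;
    for my $elem (@$list) {
        return grep { $_ ne $elem } @$list if $is_key{$elem};
    }
    return @$list;
}

my @big1 = (@pad1, @arr1);
my @big2 = (@pad2, @arr2);

# A negative count asks Benchmark to run each sub for at least that many CPU seconds.
cmpthese(-1, {
    grep_scan   => sub { my @r = remove_with_grep(\@big1, \@big2) },
    hash_lookup => sub { my @r = remove_with_hash(\@big1, \@big2) },
});
```

cmpthese prints a small table of rates and relative percentages. The absolute numbers will differ from machine to machine, so treat them as a rough guide only.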

So it's a toss-up between the 4th approach (which converts the test array to a single string and then runs a regex over it, a dyed-in-the-wool Perl programmer's solution) and the 5th (which uses a hash). Marginally, the "convert to string" approach is better, something that would be counter-intuitive to many programmers. It's also short and easy to read.

Bottom line: if you'll be working with data sets like the ones in your example, your code is fine (although you should fix "(0..@arr2)" at OUTER_FOR: to read "(0..$#arr2)", and likewise the upper bound of the second loop).
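
To see why that range matters, here is a small sketch (the three-element array and the variable names are just for illustration). As the upper bound of a range, @arr2 yields the element count, which is one past the last valid index:

```perl
use strict;
use warnings;

my @arr2 = qw(son kon bon);

# As a range bound, @arr2 is evaluated in scalar context: the count (3).
my @with_count = (0 .. @arr2);     # one index too many
my @with_last  = (0 .. $#arr2);    # $#arr2 is the last valid index (2)

print "0..\@arr2  -> @with_count\n";
print "0..\$#arr2 -> @with_last\n";
```

With the first form, the final iteration reads an element that does not exist. That element is undef, so a string comparison against it triggers an "uninitialized value" warning when warnings are enabled.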

Otherwise, use the 4th or the 5th depending on your taste. I'd personally go for the one that isn't mine, the "convert to string" program, as long as I was 100% certain the joining character couldn't appear in the data.
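
One way to get that certainty is to check the data before joining. A rough sketch follows; safe_to_join is a hypothetical helper name, not something from any of the programs above:

```perl
use strict;
use warnings;

my @arr1 = qw(jon won don pon);
my @arr2 = qw(son kon bon won kon don pon won pon don won);

# Hypothetical helper: true if the separator occurs in none of the
# elements of the given array references.
sub safe_to_join {
    my ($sep, @lists) = @_;
    for my $list (@lists) {
        return 0 if grep { index($_, $sep) >= 0 } @$list;
    }
    return 1;
}

if (safe_to_join('|', \@arr1, \@arr2)) {
    print "'|' is safe to use as the joining character\n";
}
else {
    print "pick another joining character\n";
}
```

The extra pass over the data costs time of its own, so it would eat into the speed advantage if it had to run on every call.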

I think I'd better get back to doing something productive now :)

Upvotes: 1

Qtax

Reputation: 33928

Here's another way (assuming that you want to remove all occurrences of the first matched element):

use strict;
use warnings;

my @arr1 = qw(jon won don pon);
my @arr2 = qw(son kon bon won kon don pon won pon don won);

for my $elem (@arr2){
    if(grep { $_ eq $elem } @arr1){
        @arr2 = grep { $_ ne $elem } @arr2;
        last;
    }
}

print "@arr2";

Output:

son kon bon kon don pon pon don
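
If only the first occurrence should go instead, something along these lines might do. This is a sketch under that different assumption; it uses first from the core List::Util module, and the variable names are chosen for illustration:

```perl
use strict;
use warnings;
use List::Util qw(first);

my @arr1 = qw(jon won don pon);
my @arr2 = qw(son kon bon won kon don pon won pon don won);

# Index of the first element of @arr2 that also occurs in @arr1
# (undef if there is no such element).
my $idx = first {
    my $elem = $arr2[$_];
    scalar grep { $_ eq $elem } @arr1;
} 0 .. $#arr2;

# Remove just that one element, leaving later duplicates in place.
splice @arr2, $idx, 1 if defined $idx;

print "@arr2\n";
```

The defined check matters because a match at index 0 is a false value but still a valid result.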

Upvotes: 1

Red Cricket

Reputation: 10480

I am sure there are many ways to do this. Here's one ...

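use strict;
use warnings;
use Data::Dumper;

my @arr1 = qw( jon won don pon );
my @arr2 = qw( son kon bon won kon don pon won pon don won);

# Join into one delimited string; this assumes '|' never appears in the data.
my $s2 = join '|', @arr2;

foreach my $item (@arr1) {
    # Remove every whole-field occurrence of $item (not substrings of other
    # words) and stop at the first item that matched anything.
    last if $s2 =~ s/(?<![^|])\Q$item\E(?![^|])//g;
}
$s2 =~ s/\|{2,}/|/g;    # collapse runs of delimiters left behind
$s2 =~ s/^\||\|$//g;    # trim a leading or trailing delimiter
@arr2 = split /\|/, $s2;

print Dumper( \@arr2 );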

Upvotes: 0

TLP

Reputation: 67930

You can use a simple for loop, and exit the loop with last when you find a match. Use grep to remove the keywords.

use strict;
use warnings;

my @arr1 = qw( jon won don pon );
my @arr2 = qw( son kon bon won kon don pon won pon don won);
my @out = @arr2;    # fall back to the original list if nothing matches
for my $word (@arr1) {
    my @new = grep !/^\Q$word\E$/, @arr2;
    if (@new != @arr2) {
        print "'$word' found\n";
        @out = @new;
        last;
    }
}
print "@out";

Note that I use \Q ... \E to escape any regex metacharacters in the word. The != comparison compares the array sizes (both arrays are evaluated in scalar context), and when they differ we know we have found our match.
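
A small sketch of what the quoting buys you; the list and the word 'a.c' are invented for illustration:

```perl
use strict;
use warnings;

my @list = qw(abc a.c axc);
my $word = 'a.c';

# Unquoted: '.' matches any character, so 'abc' and 'axc' are dropped too.
my @unquoted = grep !/^$word$/, @list;

# Quoted: only the literal string 'a.c' is dropped.
my @quoted = grep !/^\Q$word\E$/, @list;

print "unquoted keeps: @unquoted\n";
print "quoted keeps:   @quoted\n";
```

For an exact match like this, grep { $_ ne $word } @list would sidestep the regex entirely.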

Upvotes: 0
