Extract the unique intervals from two arrays in Perl?

Question

I am trying to extract the non-overlapping intervals from two files of intervals, that is, the intervals that are unique to one file. Here is the case:

file1.txt

Start End
1 3
5 9
13 24
34 57

file2.txt

Start End
6 7
10 12
16 28
45 68

Expected result: an array containing the intervals whose elements are present in only one file:

1-3 , 10-12

That's all... thank you very much in advance!

choroba · Accepted Answer

Process the files line by line. If there's no overlap, report the interval that starts earlier and advance its file. In case of an overlap, advance both files.

#!/usr/bin/perl
use warnings;
use strict;

use Data::Dumper;

my @F;
open $F[0], '<', 'file1.txt' or die $!;
open $F[1], '<', 'file2.txt' or die $!;

# Skip headers.
readline $_ for @F;

my @boundaries;
my @results;

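# Report interval $x if it lies entirely before interval $y
# (or if file $y is exhausted), then read the next line of file $x.
sub earlier {
    my ($x, $y) = @_;
    # Nothing to report once file $x itself is exhausted.
    return 0 unless @{ $boundaries[$x] };
    if (! @{ $boundaries[$y] }
        or $boundaries[$x][1] < $boundaries[$y][0]
    ) {
        push @results, $boundaries[$x];
        # At EOF readline returns undef; "// ''" yields an empty interval.
        $boundaries[$x] = [ split ' ', readline($F[$x]) // '' ];
        return 1
    }
    return 0
}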

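# Intervals $x and $y overlap and $x ends first: skip every interval of
# file $x that starts no later than $y ends, then move past $y as well.
sub overlap {
    my ($x, $y) = @_;
    if ($boundaries[$x][1] < $boundaries[$y][1]) {
        # "// ''" avoids an "uninitialized value" warning at EOF.
        do { $boundaries[$x] = [ split ' ', readline($F[$x]) // '' ] }
          until ! @{ $boundaries[$x] }
          or $boundaries[$x][0] > $boundaries[$y][1];
        $boundaries[$y] = [ split ' ', readline($F[$y]) // '' ];
        return 1
    }
    return 0
}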

# Read the next interval from each file (an empty array ref at EOF).
sub advance_both {
    @boundaries = map [ split ' ', readline($_) // '' ], @F;
}

# init.
advance_both();
while (grep defined, @{ $boundaries[0] }, @{ $boundaries[1] }) {

    earlier(0, 1)
    or earlier(1, 0)
    or overlap(0, 1)
    or overlap(1, 0)
    or advance_both();
}

print join(' , ', map { join '-', @$_ } @results), "\n";
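
Since the question title mentions arrays, here is a minimal in-memory sketch for comparison. It is not the streaming method above but a simpler brute-force check: an interval is kept if it overlaps nothing in the other list. It assumes closed intervals (so 5-9 and 9-12 would count as overlapping), and the names overlaps and unique_intervals are made up for illustration.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Two closed intervals overlap when each starts no later than the other ends.
sub overlaps {
    my ($p, $q) = @_;
    return $p->[0] <= $q->[1] && $q->[0] <= $p->[1];
}

# Return the intervals from either list that overlap nothing in the
# other list, sorted by start. Brute force, O(n*m): fine for small inputs.
sub unique_intervals {
    my ($first, $second) = @_;
    my @unique;
    for my $pair ([$first, $second], [$second, $first]) {
        my ($own, $other) = @$pair;
        for my $interval (@$own) {
            push @unique, $interval
                unless grep { overlaps($interval, $_) } @$other;
        }
    }
    my @sorted = sort { $a->[0] <=> $b->[0] } @unique;
    return @sorted;
}

# The data from the question, with headers already stripped.
my @file1 = ([1, 3], [5, 9], [13, 24], [34, 57]);
my @file2 = ([6, 7], [10, 12], [16, 28], [45, 68]);

my @results = unique_intervals(\@file1, \@file2);
print join(' , ', map { join '-', @$_ } @results), "\n";   # 1-3 , 10-12
```

For large sorted inputs the line-by-line approach avoids comparing every pair, so this version is mainly useful for small data or for cross-checking results.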
