Amit

Reputation: 1

How to merge columns from two different files in Perl

I have written the following Perl code, which reads a text file (a1.txt) and averages the timestamps for each test case. I now want to read two files (a1.txt and a2.txt) and combine the columns from both.

The code below can only read one file at a time. Please help me modify it to produce output in the format shown below.

a1.txt:

PERFORMANCE TESTING


-------------------------------------------------------------------
PERF_SMK_OCUS_50    Version P-20-17
-------------------------------------------------------------------
300_wireframe_view_redraws_(GR) 00:01:56

80_wireframe_view_redraws_with_DATUMS_on_(GR) 00:00:51

3_hidden_view_redraws_(GR) 00:01:35

6_Fast_HLR_activations_(CP) 00:01:10

120_hidden_view_redraws_with_Fast_HLR_(GR) 00:00:42

2_shaded_mouse_spins_(GR) 00:00:21

270_shaded_view_redraws_(GR) 00:01:39
-------------------------------------------------------------------

****************************************************
****************************************************
-------------------------------------------------------------------
PERF_SMK_OCUS_50    Version P-20-17
-------------------------------------------------------------------
300_wireframe_view_redraws_(GR) 00:01:56

80_wireframe_view_redraws_with_DATUMS_on_(GR) 00:00:51

3_hidden_view_redraws_(GR) 00:01:35

6_Fast_HLR_activations_(CP) 00:01:09

120_hidden_view_redraws_with_Fast_HLR_(GR) 00:00:42

2_shaded_mouse_spins_(GR) 00:00:20

270_shaded_view_redraws_(GR) 00:01:39
-------------------------------------------------------------------

****************************************************
****************************************************
-------------------------------------------------------------------
PERF_SMK_OCUS_50    Version P-20-17
-------------------------------------------------------------------
300_wireframe_view_redraws_(GR) 00:01:55

80_wireframe_view_redraws_with_DATUMS_on_(GR) 00:00:50

3_hidden_view_redraws_(GR) 00:01:34

6_Fast_HLR_activations_(CP) 00:01:09

120_hidden_view_redraws_with_Fast_HLR_(GR) 00:00:40

2_shaded_mouse_spins_(GR) 00:00:21

270_shaded_view_redraws_(GR) 00:01:35
-------------------------------------------------------------------

****************************************************
****************************************************

a2.txt:

PERFORMANCE TESTING

-------------------------------------------------------------------
PERF_SMK_OCUS_50    Version P-20-17
-------------------------------------------------------------------
80_wireframe_view_redraws_with_DATUMS_on_(GR) 00:00:50

3_hidden_view_redraws_(GR) 00:01:37

6_Fast_HLR_activations_(CP) 00:01:12

120_hidden_view_redraws_with_Fast_HLR_(GR) 00:00:43

2_shaded_mouse_spins_(GR) 00:00:21

270_shaded_view_redraws_(GR) 00:01:35

240_realtime_rendered_redraws_(GR)_1 00:13:16
-------------------------------------------------------------------

****************************************************
****************************************************
-------------------------------------------------------------------
PERF_SMK_OCUS_50    Version P-20-17
-------------------------------------------------------------------
80_wireframe_view_redraws_with_DATUMS_on_(GR) 00:00:50

3_hidden_view_redraws_(GR) 00:01:37

6_Fast_HLR_activations_(CP) 00:01:12

120_hidden_view_redraws_with_Fast_HLR_(GR) 00:00:42

2_shaded_mouse_spins_(GR) 00:00:20

270_shaded_view_redraws_(GR) 00:01:40

240_realtime_rendered_redraws_(GR)_1 00:13:14
-------------------------------------------------------------------

****************************************************
****************************************************
-------------------------------------------------------------------
PERF_SMK_OCUS_50    Version P-20-17
-------------------------------------------------------------------
80_wireframe_view_redraws_with_DATUMS_on_(GR) 00:00:50

3_hidden_view_redraws_(GR) 00:01:37

6_Fast_HLR_activations_(CP) 00:01:12

120_hidden_view_redraws_with_Fast_HLR_(GR) 00:00:44

2_shaded_mouse_spins_(GR) 00:00:20

270_shaded_view_redraws_(GR) 00:01:40

240_realtime_rendered_redraws_(GR)_1 00:13:24
-------------------------------------------------------------------

****************************************************
****************************************************

Desired output:

> Test Cases                                     a1.txt timestamp (hh:mm:ss)   a2.txt (hh:mm:ss)   delta (a1 - a2) (hh:mm:ss)
> ---------------------------------------------------------------------------------------------------------------------------
> 240_realtime_rendered_redraws_(GR)_1           N/A                           00:13:18            N/A

> 3_hidden_view_redraws_(GR)                     00:01:34                      00:01:37           -00:00:03

> 270_shaded_view_redraws_(GR)                   00:01:37                      00:01:38           -00:00:01

> 120_hidden_view_redraws_with_Fast_HLR_(GR)     00:00:41                      00:00:43           -00:00:02

> 300_wireframe_view_redraws_(GR)                00:01:55                      N/A                 N/A

> 2_shaded_mouse_spins_(GR)                      00:00:20                      00:00:20            00:00:00

> 6_Fast_HLR_activations_(CP)                    00:01:09                      00:01:12           -00:00:03

> 80_wireframe_view_redraws_with_DATUMS_on_(GR)  00:00:50                      00:00:50            00:00:00

My code:

use strict;
use warnings;

my %retrieve;    # test name => total seconds over all runs
my $runs = 3;    # each test appears three times in the file

my $file1 = 'a1.txt';

open( my $in, '<', $file1 ) or die "Could not open $file1: $!";

while (<$in>) {

    # Keep only the result lines of interest
    next unless /Retrieve_generic_/
             || /Retrieve_assembly_[12]_/
             || /300_wireframe_view_/
             || /80_wireframe_view_/
             || /3_hidden_view_/
             || /Fast_HLR_/
             || /120_hidden_view_/
             || /shaded_view_/
             || /shaded_mouse_/
             || /realtime_rendered_/;

    my ( $retrieve, $time ) = split;
    my ( $h, $m, $s ) = split /:/, $time;
    $retrieve{$retrieve} += $h * 3600 + $m * 60 + $s;

}
close($in);

for my $retrieve ( keys %retrieve ) {

    my $hms = secondsToHMS( $retrieve{$retrieve} / $runs );
    print "$retrieve\t$hms\n" if defined $hms;
}

# For seconds < 86400, else undef returned

sub secondsToHMS {
    my $seconds = int $_[0];    # drop the fractional part of the average
    return if $seconds >= 86400;

    my $h = int( $seconds / 3600 );
    my $m = int( $seconds % 3600 / 60 );
    my $s = $seconds % 60;

    return sprintf( '%02d:%02d:%02d', $h, $m, $s );
}

Upvotes: 0

Views: 1067

Answers (1)

ddoxey

Reputation: 2063

Here's how I'd go about doing that.

#!/usr/bin/perl -T

use strict;
use warnings;
use English qw( -no_match_vars );

die "expecting two filenames as arguments\n"
    if @ARGV != 2;

my @ids;          # file IDs in argument order, e.g. ( 'a1', 'a2' )
my %total_for;    # $total_for{$subject}{$id} = summed seconds
my %count_for;    # $count_for{$subject}{$id} = number of samples

for my $filename (@ARGV) {

    # The ID is the basename without its extension: /some/path/a1.txt -> a1
    my ($id) = $filename =~ m{\A (?: .+ / )? ( [^/.]+ ) (?: [.] \w+ )? \z}xms
        or die "can't parse file ID from $filename\n";

    push @ids, $id;

    open my $fh, '<', $filename
        or die "open $filename: $OS_ERROR\n";

    while ( my $line = <$fh> ) {

        # Result lines look like "3_hidden_view_redraws_(GR) 00:01:35"
        next
            if $line !~ m{\A ( \w+ \( \w+ \) \w* ) \s+ ( \d+:\d+:\d+ ) }xms;

        my ( $subject, $hms ) = ( $1, $2 );

        # Keep a sum and a count so every sample gets equal weight
        $total_for{$subject}{$id} += hms_to_sec($hms);
        $count_for{$subject}{$id}++;
    }

    close $fh
        or die "close $filename: $OS_ERROR\n";
}

my $row_format = "> %-46s %-24s %-24s %s\n";

printf( $row_format,
    'Test Cases',
    "$ids[0] timestamp (hh:mm:ss)",
    "$ids[1] timestamp (hh:mm:ss)",
    "delta ($ids[0]-$ids[1]) (hh:mm:ss)" );

print '> ', '-' x 121, "\n";

for my $subject ( sort keys %total_for ) {

    # Mean per file, truncated to whole seconds; undef if the file has no samples
    my ( $a1, $a2 ) = map {
        $count_for{$subject}{$_}
            ? int( $total_for{$subject}{$_} / $count_for{$subject}{$_} )
            : undef
    } @ids;

    my $delta = defined $a1 && defined $a2 ? $a1 - $a2 : undef;

    printf( "$row_format\n",
        $subject,
        sec_to_hms($a1),
        sec_to_hms($a2),
        sec_to_hms($delta) );
}

sub hms_to_sec {
    my ( $h, $m, $s ) = split /:/, $_[0];
    return $h * 3_600 + $m * 60 + $s;
}

sub sec_to_hms {
    my ($s) = @_;

    return 'N/A'
        if !defined $s || abs($s) >= 86_400;

    my $sign = $s < 0 ? '-' : ' ';
    $s = abs($s);

    return sprintf '%s%02d:%02d:%02d',
        $sign, int( $s / 3_600 ), int( $s % 3_600 / 60 ), $s % 60;
}

The output comes out like this.

> Test Cases                                     a1 timestamp (hh:mm:ss)  a2 timestamp (hh:mm:ss)  delta (a1-a2) (hh:mm:ss)
> -------------------------------------------------------------------------------------------------------------------------
> 120_hidden_view_redraws_with_Fast_HLR_(GR)      00:00:41                 00:00:43                -00:00:02

> 240_realtime_rendered_redraws_(GR)_1           N/A                       00:13:18                N/A

> 270_shaded_view_redraws_(GR)                    00:01:37                 00:01:38                -00:00:01

> 2_shaded_mouse_spins_(GR)                       00:00:20                 00:00:20                 00:00:00

> 300_wireframe_view_redraws_(GR)                 00:01:55                N/A                      N/A

> 3_hidden_view_redraws_(GR)                      00:01:34                 00:01:37                -00:00:03

> 6_Fast_HLR_activations_(CP)                     00:01:09                 00:01:12                -00:00:03

> 80_wireframe_view_redraws_with_DATUMS_on_(GR)   00:00:50                 00:00:50                 00:00:00

The filenames are assumed to use / as the path separator. (A proper portable implementation might be a topic for another question.)
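For what it's worth, one possible way to avoid hard-coding the separator is the core File::Basename module, whose fileparse function splits a path according to the rules of the operating system the script runs on. The following is only a minimal sketch under that assumption; file_id is a hypothetical helper name, not part of the script above, and the suffix pattern simply strips the last extension.

```perl
use strict;
use warnings;
use File::Basename qw( fileparse );

# Hypothetical helper: derive the column ID from a path by dropping the
# directory part and the last extension, using the current OS's path rules.
sub file_id {
    my ($path) = @_;

    # fileparse returns ( basename, directory part, matched suffix )
    my ( $name, $dirs, $suffix ) = fileparse( $path, qr/[.][^.]*/ );

    return $name;
}

print file_id('/some/path/a1.txt'), "\n";    # prints "a1" on a Unix-like system
```

The regular expression that extracts the ID inside the loop could then be replaced by a call to this helper, at the cost of loading one more module.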

You can call this like:

./merge_columns.pl /some/path/a1.txt /another/path/a2.txt

I hope that's helpful.

Upvotes: 2
