IAmYourFaja

Reputation: 56914

Perl script to configure grep output

I have a pipe-delimited log file with the following column format:

<date>  <time> | <fruit> | <color> | <num_1> | <num_2> | <num_3>

So for example:

2013-03-27  23:01:52 | apple | green | 55 | 120 | 29
2013-03-27  23:01:56 | plumb | purple | 28 | 1 | 394
2013-03-27  23:01:59 | apple | red | 553 | 21 | 7822

I would like to write a Perl script (though Python or Bash is acceptable as well) that extracts the <date> and <time> fields (column 1) and one of <num_1>, <num_2> or <num_3>, depending on the argument you give the script. For example, running perl extract.pl 2 on the data above would give you <date>, <time> and <num_2>:

2013-03-27  23:01:52 | 120
2013-03-27  23:01:56 | 1
2013-03-27  23:01:59 | 21

I tried the following but it doesn't seem to work:

#!/usr/bin/perl

use warnings;
use strict;

my $col = $1;

print `grep "myapplog.txt" "m/_(\d{4})(\d\d)(\d\d)/ | $col"`

Here, I'm setting the col var to the script's first arg, and then trying to print the grep output matching the datetime of the first column and the desired <num_X> column. Any ideas? Thanks in advance.

Upvotes: 3

Views: 806

Answers (2)

Gilles Quénot

Reputation: 185073

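Try this, which takes the column from the first command-line argument as you wanted. In Perl, script arguments live in the @ARGV array; $1 is a regex capture variable, not a positional parameter as in shell: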

#!/usr/bin/perl

use warnings; use strict;
use autodie; # No need to check open() errors

$\ = "\n";   # output record separator: print appends "\n" for us

# file-handle
open my $fh, "<", "myapplog.txt";

my $col = $ARGV[0];   # first command-line argument (1, 2 or 3)

die("Not an integer !\n") unless defined $col && $col =~ /^\d+$/;

# read the file line by line with the <$fh> readline operator:
while (<$fh>) {
    chomp;
    my @F = split /\|/; # split the current line into the @F array
    print join("|", @F[0,$col+2]); # join an array slice: date/time and <num_$col>
}

close $fh;

Upvotes: 1

Fredrik Pihl

Reputation: 45662

Try using Perl in awk mode:

$ perl -F'\|' -lane 'print $F[0]," | ", $F[4]' input
2013-03-27  23:01:52  |  120 
2013-03-27  23:01:56  |  1 
2013-03-27  23:01:59  |  21 

Pure awk:

awk -F"|" '{print $1, "|", $5}' input

Pure bash:

#!/bin/bash

# set IFS only for read, so the rest of the script keeps the default IFS
while IFS="|" read -r -a ARRAY
do
    echo "${ARRAY[0]} | ${ARRAY[4]}"
done < input

update

To pass a parameter to the awk solution that determines which column to print, use e.g.:

$ awk -vcol="5" -F"|" '{print $1, "|", $col}' input

In bash, the first parameter to a function or script is in $1, so use that as the index into ARRAY.

Something more formal than a one-liner, using Python:

#!/usr/bin/env python3

import sys

col = input('which column to print? -> ')
try:
    col = int(col)
except ValueError:
    # stop here: a non-integer cannot be used as a list index below
    sys.exit("That was no integer")

with open("input") as fd:
    for line in fd:
        tmp = line.strip().split('|')
        print(tmp[0], "|", tmp[col])
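
The script above prompts for a raw list index, while the question asks for 1, 2 or 3 meaning <num_1>, <num_2> or <num_3>. Here is a minimal Python 3 sketch of that mapping, run against the three sample lines from the question. The function name extract and the SAMPLE constant are only illustrative, and I am assuming every valid line has exactly the six fields shown in the question:

```python
# Sample lines copied from the question.
SAMPLE = """\
2013-03-27  23:01:52 | apple | green | 55 | 120 | 29
2013-03-27  23:01:56 | plumb | purple | 28 | 1 | 394
2013-03-27  23:01:59 | apple | red | 553 | 21 | 7822
"""


def extract(lines, num):
    """Return '<date time> | <num_N>' strings for pipe-delimited lines."""
    if num not in (1, 2, 3):
        raise ValueError("num must be 1, 2 or 3")
    rows = []
    for line in lines:
        # strip the padding around each field so the output is tidy
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        # fields[0] is the date/time; num_1..num_3 are at indexes 3..5
        rows.append(fields[0] + " | " + fields[num + 2])
    return rows


for row in extract(SAMPLE.splitlines(), 2):
    print(row)
```

To drive it from the command line instead, pass int(sys.argv[1]) as num and an open file object as lines.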

Upvotes: 4
