Reputation: 56914
I have a pipe-delimited log file with the following column format:
<date> <time> | <fruit> | <color> | <num_1> | <num_2> | <num_3>
So for example:
2013-03-27 23:01:52 | apple | green | 55 | 120 | 29
2013-03-27 23:01:56 | plumb | purple | 28 | 1 | 394
2013-03-27 23:01:59 | apple | red | 553 | 21 | 7822
I would like to write a perl script (though python or bash is acceptable as well) that greps out the <date> and <time> fields (column 1) and either <num_1>, <num_2> or <num_3>, depending on the input you give the script. Hence running perl extract.pl 2 on the above information would give you <date>, <time> and <num_2>:
2013-03-27 23:01:52 | 120
2013-03-27 23:01:56 | 1
2013-03-27 23:01:59 | 21
I tried the following but it doesn't seem to work:
#!/usr/bin/perl
use warnings;
use strict;
my $col = $1;
print `grep "myapplog.txt" "m/_(\d{4})(\d\d)(\d\d)/ | $col"`
Here, I'm setting the col var to the script's first arg, and then trying to print the grep matching the datetime of the first column and the desired <num_X> column. Any ideas? Thanks in advance.
Upvotes: 3
Views: 806
Reputation: 185073
Try this, which takes the column from the first argument as you wanted (in Perl, use the @ARGV array, not $1):
#!/usr/bin/perl
use warnings; use strict;
use autodie; # open() dies on failure, so no explicit error check is needed
$\ = "\n"; # output record separator: print appends the newline for us
# the column number (1, 2 or 3) is the script's first argument
my $col = $ARGV[0] // '';
die("Not an integer !\n") unless $col =~ /^\d+$/;
# file-handle
open my $fh, "<", "myapplog.txt";
# read the file line by line with the <$fh> readline operator:
while (<$fh>) {
    chomp;
    my @F = split /\|/; # split the current line into the @F array
    print join("|", @F[0,$col+2]); # array slice: field 0 is the datetime, <num_N> is field N+2
}
close $fh;
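The index arithmetic in the slice above can be checked against the sample lines from the question. A minimal sketch in Python, assuming the column number is 1, 2 or 3; the names `SAMPLE` and `extract_column` are hypothetical and not part of the script above:

```python
SAMPLE = [
    "2013-03-27 23:01:52 | apple | green | 55 | 120 | 29",
    "2013-03-27 23:01:56 | plumb | purple | 28 | 1 | 394",
    "2013-03-27 23:01:59 | apple | red | 553 | 21 | 7822",
]


def extract_column(line, n):
    """Return '<date> <time> | <num_n>' for one log line (hypothetical helper)."""
    if n not in (1, 2, 3):
        raise ValueError("column must be 1, 2 or 3")
    # split on the pipe and drop the padding spaces around each field
    fields = [f.strip() for f in line.split("|")]
    # fields: 0 = datetime, 1 = fruit, 2 = color, 3..5 = num_1..num_3,
    # so <num_n> sits at zero-based index n + 2
    return fields[0] + " | " + fields[n + 2]


if __name__ == "__main__":
    for line in SAMPLE:
        print(extract_column(line, 2))
```

With column 2 this should reproduce the three lines the question asks for, since <num_2> is the fifth field (index 4).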
Upvotes: 1
Reputation: 45662
Try using perl in awk mode (-a autosplits each line into @F, -F sets the separator):
$ perl -F'\|' -lane 'print $F[0], "|", $F[4]' input
2013-03-27 23:01:52 | 120
2013-03-27 23:01:56 | 1
2013-03-27 23:01:59 | 21
Pure awk:
awk -F"|" '{print $1 "|" $5}' input
Pure bash:
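#!/bin/bash
# set IFS only for read, so the rest of the script keeps the default
while IFS="|" read -r -a ARRAY
do
    # the fields keep their surrounding spaces, so a bare "|" restores " | "
    echo "${ARRAY[0]}|${ARRAY[4]}"
done < input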
Update:
To pass a parameter to the awk solution that determines which column to print, use:
$ awk -v col="5" -F"|" '{print $1 "|" $col}' input
In bash, the first parameter to a function or script is in $1, so use that as an index into ARRAY.
Something more formal than a one-liner, using Python:
#!/usr/bin/env python3
import sys
col = input('which column to print? -> ')
try:
    col = int(col)
except ValueError:
    # stop here: a non-integer cannot be used as a list index below
    sys.exit("That was no integer")
with open("input") as fd:
    for line in fd:
        # split on the pipe and drop the padding spaces around each field
        tmp = [field.strip() for field in line.split('|')]
        print(tmp[0], "|", tmp[col])
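To match the extract.pl 2 interface from the question instead of prompting, the column could be taken from the command line. A rough sketch, assuming Python 3; the script name extract.py and the names `parse_column`, `extract` and `main` are illustrative assumptions, and myapplog.txt is the file name used in the question:

```python
import sys


def parse_column(argv):
    """Return N (1, 2 or 3) from the argument list, or None if missing or invalid."""
    if len(argv) != 1 or argv[0] not in ("1", "2", "3"):
        return None
    return int(argv[0])


def extract(lines, n):
    """Yield '<datetime> | <num_n>' for each well-formed pipe-delimited line."""
    for line in lines:
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        # <num_n> is at zero-based index n + 2 (after datetime, fruit, color)
        yield fields[0] + " | " + fields[n + 2]


def main(argv, path="myapplog.txt"):
    n = parse_column(argv)
    if n is None:
        print("usage: extract.py <1|2|3>", file=sys.stderr)
        return 2
    with open(path) as fd:
        for out in extract(fd, n):
            print(out)
    return 0


if __name__ == "__main__":
    # prints the usage line and does nothing else when no valid column is given
    main(sys.argv[1:])
```

Saved as extract.py next to the log file, python3 extract.py 2 should then print the datetime and <num_2> for every line.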
Upvotes: 4