bipsen

Reputation: 5

Replace first x occurrences on each line

Using Perl on Linux, I am trying to parse input from a FIFO, truncate part of each line, and replace some of the remaining characters. The goal is to format the lines for a command line utility that will be called once all the pending lines in the FIFO have been modified.

My input can look like this:

[1466621350] PROCESS_SERVICE_CHECK_RESULT;rs301;Disk IOPS;0;No disks exceeds defined IOPS thresholds | sda=1.40;100;200 sdb=0.00;200;400 sdc=0.00;100;200 sdd=0.00;800;900 sde=0.00;800;900 sdf=0.40;200;3003
[1466621350] PROCESS_SERVICE_CHECK_RESULT;rs301a;Connectivity - Admin sessions;0;Connection OK |
[1466621350] PROCESS_SERVICE_CHECK_RESULT;rs301a;Uptime;0;Uptime ok - 253 days 07:53:49 |
[1466621350] PROCESS_SERVICE_CHECK_RESULT;rs301a;Volumes in pool;0;Number of volumes: 500 is OK | numvols=500

The first part of each line, up to and including the first semicolon, should be deleted. This will give me:

rs301;Disk IOPS;0;No disks exceeds defined IOPS thresholds | sda=1.40;100;200 sdb=0.00;200;400 sdc=0.00;100;200 sdd=0.00;800;900 sde=0.00;800;900 sdf=0.40;200;3003
rs301a;Connectivity - Admin sessions;0;Connection OK |
rs301a;Uptime;0;Uptime ok - 253 days 07:53:49 |
rs301a;Volumes in pool;0;Number of volumes: 500 is OK | numvols=500

From this, I need to replace the first 3 semicolons on each line with TAB characters, leaving any later semicolons untouched.

I am no expert in Perl regular expressions, so I have no idea how to achieve the desired output.

Is anyone able to help me out? What should the replacement line for my script variable look like?

I know for sure that this doesn't work, because the /g modifier makes it replace all semicolons:

$nsca_mystr=~s/\;/\t/g;

Upvotes: 0

Views: 180

Answers (4)

PerlDuck

Reputation: 5720

perl -p -e 's/^.+?;(.+?);(.+?);(.+?);/$1\t$2\t$3\t/;' < infile > outfile

This "splits" each input line into four "fields" at its first four ; characters and replaces that whole stretch with the 2nd, 3rd, and 4th fields, each followed by \t. The 1st field is dropped and the remaining text (after the 4th ;) is left unchanged.

It could also be written as

perl -p -e 's/^(.+?);(.+?);(.+?);(.+?);/$2\t$3\t$4\t/;' < infile > outfile

to make the intention clearer, but this will be (in theory) slightly slower because it captures 4 groups instead of 3 and throws the 1st one away.
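
Since the question asks for a replacement line that works on a script variable, here is a minimal sketch of the same substitution wrapped in a sub. The name format_line is hypothetical (it does not come from the question), and the sample line is copied from the question's input.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# format_line is a hypothetical helper name used only for this sketch.
# It drops everything up to and including the first ';' and turns the
# next three ';' into tabs. Later semicolons are left alone.
sub format_line {
    my ($nsca_mystr) = @_;
    $nsca_mystr =~ s/^.+?;(.+?);(.+?);(.+?);/$1\t$2\t$3\t/;
    return $nsca_mystr;
}

my $sample = '[1466621350] PROCESS_SERVICE_CHECK_RESULT;rs301a;Volumes in pool;0;Number of volumes: 500 is OK | numvols=500';
print format_line($sample), "\n";
```

One caveat: because .+? needs at least one character and may match a semicolon, an empty field (two adjacent ; characters) would shift the field boundaries. If empty fields can occur, [^;]* in place of .+? is the stricter choice.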

Upvotes: 1

SomeDude

Reputation: 14228

This is what I can think of:

First remove the "[1466621350] PROCESS_SERVICE....." part.

Then replace the first 3 ";" with tabs.

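#!/usr/bin/perl
use strict;
use warnings;

my $line = "[1466621350] PROCESS_SERVICE_CHECK_RESULT;rs301;Disk IOPS;0;No disks exceeds defined IOPS thresholds | sda=1.40;100;200 sdb=0.00;200;400 sdc=0.00;100;200 sdd=0.00;800;900 sde=0.00;800;900 sdf=0.40;200;3003";

# Strip the leading "[timestamp] PROCESS_SERVICE_CHECK_RESULT;" prefix.
# The brackets are escaped so they match literally.
$line =~ s/^\[\d+\]\s+PROCESS_SERVICE_CHECK_RESULT;//;

# Turn the first three remaining semicolons into tabs. [^;]* accepts any
# field text, so names such as "Connectivity - Admin sessions" or "Uptime"
# match as well as two-word names like "Disk IOPS".
$line =~ s/^([^;]*);([^;]*);([^;]*);/$1\t$2\t$3\t/;

print "After regex : $line\n";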

Output :

After regex : rs301     Disk IOPS       0       No disks exceeds defined IOPS thresholds | sda=1.40;100;200 sdb=0.00;200;400 sdc=0.00;100;200 sdd=0.00;800;900 sde=0.00;800;900 sdf=0.40;200;3003

Upvotes: 0

fugu

Reputation: 6568

while (<DATA>) {
    # capture everything after the first `;` (skip lines that have none)
    my ($line) = /;(.*)/ or next;
    # each pass replaces the first remaining `;` with \t, so 3 passes replace 3
    $line =~ s/;/\t/ for 1 .. 3;
    print "$line\n";
}
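
A regex-free variant of the same idea is sketched below, using split with a field limit instead of repeated substitution. The sub name split_and_join is hypothetical, and returning short lines unchanged is an assumed policy, not something the question specifies.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# split_and_join is a hypothetical name used only for this sketch.
# A limit of 5 makes split stop after the fourth ';', so the fifth
# field keeps all of its own semicolons (the performance data).
sub split_and_join {
    my ($line) = @_;
    chomp $line;
    my @fields = split /;/, $line, 5;
    return $line if @fields < 5;   # assumption: leave short lines unchanged
    shift @fields;                 # drop the "[timestamp] PROCESS_..." prefix
    return join "\t", @fields;     # three tabs between the four kept fields
}

print split_and_join('[1466621350] PROCESS_SERVICE_CHECK_RESULT;rs301a;Uptime;0;Uptime ok - 253 days 07:53:49 |'), "\n";
```

With a positive limit, split also keeps empty fields, so a line whose host or service name is empty still ends up with its tabs in the expected positions.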

Upvotes: 1

Andrey

Reputation: 1818

Try this for each line, once the part up to the first semicolon has been removed:

$nsca_mystr =~ s/^(.+?);(.+?);(.+?);/$1\t$2\t$3\t/;
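
Because this substitution keeps $1, it is worth seeing what it does to a raw line compared with one whose prefix has already been stripped. The sketch below shows both. The sub names tab_first_three and strip_prefix are hypothetical, and the prefix-stripping regex is an assumption about how that earlier step might be written.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: the substitution from this answer, unchanged.
sub tab_first_three {
    my ($nsca_mystr) = @_;
    $nsca_mystr =~ s/^(.+?);(.+?);(.+?);/$1\t$2\t$3\t/;
    return $nsca_mystr;
}

# Hypothetical helper: delete everything up to and including the first ';'.
sub strip_prefix {
    my ($nsca_mystr) = @_;
    $nsca_mystr =~ s/^[^;]*;//;
    return $nsca_mystr;
}

my $raw = '[1466621350] PROCESS_SERVICE_CHECK_RESULT;rs301a;Uptime;0;Uptime ok - 253 days 07:53:49 |';

# On the raw line the prefix is kept and the status code stays joined
# to the message by ';'.
print tab_first_three($raw), "\n";

# With the prefix stripped first, host, service, and status code are
# each followed by a tab.
print tab_first_three(strip_prefix($raw)), "\n";
```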

Upvotes: 0
