Reputation: 11294
Does anyone know of any Unix commands or a Perl script that would insert a specific character (entered either as hex, e.g. 7C, or as the actual character, e.g. |) in place of every nth occurrence of another character?
For example, perl script.pl "," 3 "|" data.txt
would replace every 3rd, 6th, 9th, etc. comma with a pipe.
So if data.txt had the following before the script was run:
fd,3232,gfd67gf,
peas,989767,jkdfnfgjhf,
dhdhjsk,267,ujfdsy,fuyds,637296,ldosi,fduy,
873,fuisouyd,try
save,2837,ipoi
It should then have this after the script was run:
fd,3232,gfd67gf|
peas,989767,jkdfnfgjhf|
dhdhjsk,267,ujfdsy|fuyds,637296,ldosi|fduy,
873,fuisouyd|try
save,2837,ipoi
Upvotes: 3
Views: 2870
Reputation: 1359
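Here is a Perl one-liner you can run from the shell:
perl -pe 's/,/(++$n % 3 == 0) ? "|" : $&/ge' data.txt
That will do the trick. The counter $n is never reset, so the commas are counted across the whole file rather than per line.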
Upvotes: 2
Reputation: 6856
This processes the input file one line at a time (no slurping :)
For hex input, just pass '\x7C' or whatever, as $1
#!/bin/bash
b="${1:-,}" # the "before" field delimiter
n="${2:-3}" # the number of fields in a group
a="${3:-|}"; [[ $a == [\|] ]] && a='\|' # the "after" group delimiter
sed -nr "x;G; /(([^$b]+$b){$((n-1))}[^$b]+)$b/{s//\1$a/g}
s/.*\n//; h; /.*$a/{s///; x}; p" input_file
Here it is again, with some comments.
sed -nr "x;G # pat = hold + pat
/(([^$b]+$b){$((n-1))}[^$b]+)$b/{s//\1$a/g}
s/.*\n// # del fields from prev line
h # hold = mod*\n
/.*$a/{ s/// # pat = unmodified
x # hold = unmodified, pat = mod*\n
}
p # print line" input_file
Upvotes: 1
Reputation: 42411
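use strict;
use warnings;
# Get params and create part of the regex.
# quotemeta escapes regex metacharacters such as |, and unlike a bare
# backslash prefix it does not turn a delimiter like "n" into "\n".
my $delim = quotemeta shift;
my $n = shift;
my $repl = shift;
my $wild = '.*?';
my $pattern = ($wild . $delim) x ($n - 1);
# Slurp.
$/ = undef;
my $text = <>;
# Replace and print.
$text =~ s/($pattern$wild)$delim/$1$repl/sg;
print $text;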
Upvotes: 1
Reputation: 56059
How about a nice, simple awk one-liner?
awk -v RS=, '{ORS=(++i%3?",":"|");print}' file.csv
One minor bug just occurred to me: it will print a , or | as the very last character. To avoid this, we need to alter it slightly:
awk -v RS=, '{ORS=(++i%3?",":"|");print}END{print ""}' file.csv | sed '$d'
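If the extra sed pass feels fragile, a line-oriented variant avoids the trailing delimiter altogether. This is only a sketch of an alternative technique (splitting each line on commas and rebuilding it), not the one-liner above. The function name replace_every_third is made up for the example.

```shell
# Hypothetical helper: replaces every 3rd comma with a pipe,
# counting commas across all input lines.
replace_every_third() {
    awk -F, '{
        out = $1
        for (f = 2; f <= NF; f++) {
            # Each boundary between two fields is one comma in the input.
            sep = (++count % 3 == 0) ? "|" : ","
            out = out sep $f
        }
        print out
    }' "$@"
}
```

Because the newline stays the record separator here, nothing extra is printed after the last record, so no cleanup step is needed.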
Upvotes: 3
Reputation: 67900
A small Perl hack to solve the problem. It uses the index function to find the commas, modulus to pick the right one, and substr to perform the replacement.
use strict;
use warnings;
while (<>) {
    my $x = index($_, ",");
    my $i = 0;
    while ($x != -1) {
        $i++;
        unless ($i % 3) {
            $_ = substr($_, 0, $x) . "|" . substr($_, $x + 1);
        }
        $x = index($_, ",", $x + 1);
    }
    print;
}
Run with perl script.pl file.csv.
Note: You can place the declaration my $i before the while (<>) loop to do a global count instead of a separate count for each line. Your sample output suggests a global count, but I'm not quite sure I understood your question in that regard.
Upvotes: 5
Reputation: 39158
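use strict;
use warnings;
use File::Slurp qw(read_file);
# Argument order follows the question's example: from, every, to, file.
my ($from, $every, $to, $fname) = @ARGV;
my $counter = 0;
my $in = read_file $fname;
my $out = $in;
# copy is important because pos magic attached to $in resets with substr
while ($in =~ /\Q$from/gms) {
    $counter++;
    # pos($in) is the end of the match, so step back by the match length
    substr $out, pos($in) - length($from), length($from), $to unless $counter % $every;
}
print $out;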
If the $from and $to parameters have different lengths, you still need to adjust the second parameter of substr (the offset) to make it work correctly.
Upvotes: 3