Reputation: 287
I have the following:
XXUM_7_mauve_999119_ser_11.255255
UXUM_566_mauve_999119_ser_11.255255
IXUM_23_mauve_999119_ser_11.255255
My attempt at a Perl one-liner to extract the first number, which did not work, is as follows:
perl -pi -e "s/\S+_(\.+)_.+/Number$1/g" *.txt
I expected the following results:
Number 007
Number 566
Number 023
Please help.
Upvotes: 2
Views: 790
Reputation: 126722
The problem is that this regex pattern /\S+_(\.+)_.+/ looks for a sequence of one or more literal dots (.) surrounded by underscores, so something like _..._ would match, but such a sequence doesn't exist in your file. I think you didn't mean to escape the dot. But even then, because the \S+ is greedy, it would find and capture the last field delimited by underscores, and so would capture ser from all three lines. Perhaps you meant to write \d+ instead of \.+, which is pretty much what I have written below.
This will do as you ask. It looks for the first occurrence of an underscore that is followed by one or more decimal digits, and uses printf to format the number as three zero-padded digits.
You can add the -i option, but I suggest you test it as it is first, so that you don't overwrite your data with erroneous results. Of course, you could redirect the output to another file if you wished.
perl -ne '/_(\d+)/ and printf "Number %03d\n", $1' myfile
Output:
Number 007
Number 566
Number 023
Upvotes: 2
Reputation: 36
cat > /tmp/test
XXUM_7_mauve_999119_ser_11.255255
UXUM_566_mauve_999119_ser_11.255255
IXUM_23_mauve_999119_ser_11.255255
perl -i -ne 'if (/^\w+_(\d+)_mauve/) { printf "Number %03d\n", $1; }' /tmp/test
cat /tmp/test
Number 007
Number 566
Number 023
Upvotes: 1
Reputation: 241758
I'd use the -n option instead of the -p option and do the printing and formatting in the code:
perl -i~ -ne 'if (($num) = /[0-9]+/g) {
printf "Number %03d\n", $num;
} else {
print
}' *.txt
Upvotes: 1