Reputation: 33
I need to split a file into different ones.
Example (original file):
*****3123123*****RAW
text1
text2
*****2312354***RAW
text3
Desired output:
[File1.txt]
*****3123123*****RAW
text1
text2
[File2.txt]
*****2312354***RAW
text3
I tried to use split, but I always get some extra whitespace characters in the array:
open FILE, "<file";
@file= <FILE>;
close FILE;
@lines = split (/(RAW\n)/, "@file");
foreach $value (@lines) {
if ($value =~ /[a-z]|[A-Z]|[1-9]/) {
print ("$value\n");
}
}
Output:
*****3123123*****RAW
text1
text2
*****312312354***RAW
text3
Edit: if I use print ("$value") instead of print ("$value\n"), this is the output (notice the one extra space before each value):
*****3123123*****RAW
text1
text2
*****12354***RAW
text3
Upvotes: 3
Views: 13371
Reputation: 126722
This program pulls the decimal number from the RAW line and uses it to name the output files. It expects the input file name as a parameter on the command line.
use strict;
use warnings;
@ARGV or die "Input file required as command-line parameter\n";
my $out;
while (<>) {
if ( /(\d+)\*+RAW$/ ) {
open $out, '>', "$1.out" or die $!;
select $out;
}
print $_ if $out;
}
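To see which lines the pattern treats as headers, here is a minimal sketch that runs the same regex over a few lines taken from the question's sample. The in-memory list and the @names array are only for illustration and are not part of the program above.

```perl
use strict;
use warnings;

# Sample lines from the question (illustrative stand-in for the input file).
my @sample = ("*****3123123*****RAW\n", "text1\n", "*****2312354***RAW\n");

# Collect the output file name each header line would produce.
my @names;
for my $line (@sample) {
    # (\d+) captures the digits; \*+RAW$ requires the stars and RAW at line end.
    push @names, "$1.out" if $line =~ /(\d+)\*+RAW$/;
}

print "$_\n" for @names;
```

With this sample, only the two header lines match, so text1 does not start a new file even though it contains a digit.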
Upvotes: 3
Reputation: 43673
If you want to stay with the code you wrote, simply replace your line print ("$value\n"); with print ("$value"); and you've got it.
Or, before the print, remove the \n with chomp($value); and keep the output as print ("$value\n");
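As for the single extra space the question's edit mentions: it most likely comes from interpolating the array in "@file", because Perl joins interpolated array elements with $" (a space by default). A minimal sketch, assuming the array holds the sample lines from the question:

```perl
use strict;
use warnings;

# Stand-in for the lines read from the file (assumed sample data).
my @file = (
    "*****3123123*****RAW\n", "text1\n", "text2\n",
    "*****2312354***RAW\n", "text3\n",
);

# "@file" joins the elements with $" (a single space by default),
# so every line after the first gains a leading space.
my $interpolated = "@file";

# join('', ...) concatenates the lines with no separator at all.
my $joined = join('', @file);

print "interpolated: ", ($interpolated =~ /\n / ? "extra spaces" : "clean"), "\n";
print "joined: ",       ($joined       =~ /\n / ? "extra spaces" : "clean"), "\n";
```

Passing join('', @file) to split instead of "@file" should therefore avoid the stray spaces.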
Upvotes: 0
Reputation: 39158
use strictures;
use File::Slurp qw(read_file write_file);
my $raw = read_file('raw.txt', binmode => ':raw');
my $header = qr/^ (?= [*]+ [0-9]+ [*]+ RAW\n)/msx;
my @chunks = split $header, $raw;
# (
# "*****3123123*****RAW\ntext1\ntext2\n",
# "*****2312354***RAW\ntext3"
# )
for my $i (1..@chunks) {
write_file("File$i.txt", {binmode => ':raw'}, $chunks[$i-1]);
}
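The zero-width lookahead is what keeps each header attached to its own chunk. Here is a small sketch of the same split on an in-memory string, so it runs without File::Slurp or any input file. The sample string is assumed from the question.

```perl
use strict;
use warnings;

# Sample content from the question, held in memory instead of read from raw.txt.
my $raw = "*****3123123*****RAW\ntext1\ntext2\n*****2312354***RAW\ntext3\n";

# Split at the start of every header line. The lookahead consumes nothing,
# so the header text stays at the front of the chunk it introduces.
my @chunks = split /^ (?= [*]+ [0-9]+ [*]+ RAW\n )/mx, $raw;

printf "%d chunks\n", scalar @chunks;
print "--- chunk ---\n$_" for @chunks;
```

A zero-width match at the very start of the string does not produce an empty leading field, so no empty chunk needs to be filtered out.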
Upvotes: 2
Reputation: 6592
Here's what I came up with. I can't help but feel this is reinventing the wheel.
#!/usr/bin/perl
use strict;
use warnings;

# A counter replaces the original parsing of the number out of the file name.
my $count    = 1;
my $filename = "file_$count.txt";

open(my $fi, '<', 'original.txt') or die "Cannot open original.txt: $!";
my @lines = <$fi>;
close($fi);

open(my $fi2, '>', $filename) or die "Cannot open $filename: $!";
my $i = 0;
foreach (@lines)
{
    # Start a new output file at every RAW line except the very first line.
    if (($i > 0) and /RAW/)
    {
        $count++;
        $filename = "file_$count.txt";
        close($fi2);
        open($fi2, '>', $filename) or die "Cannot open $filename: $!";
    }
    print $fi2 $_;
    $i++;
}
close($fi2);
Upvotes: 0
Reputation: 5555
You might do better with line-wise IO:
my $id = 0;
my $FILE = undef;
while (<>) {
if (/RAW/) {
close $FILE if defined $FILE;
$id++;
my $path = "File$id.txt";
open $FILE, '>', $path or die "Could not open $path: $!";
}
print $FILE $_ if defined $FILE;
}
close $FILE if defined $FILE;
Copied and adapted from one of my scripts that splits a mailbox file into one file per mail. You will have to adapt the script if the first line does not match /RAW/, because any lines before the first match are not written anywhere.
Upvotes: 2