JamesT

Reputation: 9

Perl: Substitute text string with value from list (text file or scalar context)

I am a Perl novice. I have read "Learning Perl" by Schwartz, foy, and Phoenix, but I still have only a weak understanding of the language, and I am struggling even with the book and the web to help.

My goal is to be able to do the following:

  1. Search a specific folder (current folder) and grab filenames with full path. Save filenames with complete path and current foldername.

  2. Open a template file and insert the filenames with full path at a specific location (e.g. using substitution) as well as current foldername (in another location in the same text file, I have not gotten this far yet).

  3. Save the new modified file to a new file in a specific location (current folder).

I have many files/folders that I want to process and plan to copy the perl program to each of these folders so the perl program can make new .

I have gotten so far ...:

use strict;
use warnings;
use Cwd;
use File::Spec;
use File::Basename;
my $current_dir = getcwd;
open SECONTROL_TEMPLATE, '<secontrol_template.txt' or die "Can't open SECONTROL_TEMPLATE: $!\n";
my @secontrol_template = <SECONTROL_TEMPLATE>;
close SECONTROL_TEMPLATE;
opendir(DIR, $current_dir) or die $!;
my @seq_files = grep {
    /gz/
    } readdir (DIR);
open FASTQFILENAMES, '> fastqfilenames.txt' or die "Can't open fastqfilenames.txt: $!\n";
my @fastqfiles;
foreach (@seq_files) {
    $_ = File::Spec->catfile($current_dir, $_);
    push(@fastqfiles,$_);
}
print FASTQFILENAMES @fastqfiles;
open (my ($fastqfilenames),  "<", "fastqfilenames.txt") or die "Can't open fastqfilenames.txt: $!\n";
my @secontrol;
foreach (@secontrol_template) {
    $_ =~ s/@/$fastqfilenames/eg;
    push(@secontrol,$_);
}
open SECONTROL, '> secontrol.txt' or die "Can't open SECONTROL: $!\n";
print SECONTROL @secontrol;
close SECONTROL;
close FASTQFILENAMES;

My problem is that I cannot figure out how to use my list of files to replace the "@" in my template text file:

my @secontrol;
foreach (@secontrol_template) {
    $_ =~ s/@/$fastqfilenames/eg;
    push(@secontrol,$_);
}

The substitute function will not replace the "@" with the list of files listed in $fastqfilenames. I get the "@" replaced with GLOB(0x8ab1dc).

Am I doing this the wrong way? Should I not use substitute as this can not be done, and then rather insert the list of files ($fastqfilenames) in the template.txt file? Instead of the $fastqfilenames, can I substitute with content of file (e.g. s/A/{r file.txt ...). Any suggestions?

Cheers,

JamesT

EDIT:

This made it all better.

foreach (@secontrol_template) {
    s/@/$fastqfilenames/g;
    push @secontrol, $_;
}

As both answers pointed out, $fastqfilenames was a filehandle.

I replaced this: open (my ($fastqfilenames), "<", "fastqfilenames.txt") or die "Can't open fastqfilenames.txt: $!\n";

with this:

my $fastqfilenames = join "\n", @fastqfiles; 

That fixed everything. Thanks to both of you.

Upvotes: 0

Views: 1622

Answers (2)

user1919238

Reputation:

$fastqfilenames is a filehandle. You have to read the information out of the filehandle before you can use it.
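To illustrate the difference, here is a minimal sketch. It uses an in-memory file and made-up filenames purely as assumptions so that it runs without anything on disk; the point is only that interpolating the handle gives GLOB(0x...), while reading from it gives the lines.

```perl
use strict;
use warnings;

# In-memory "file" with hypothetical names, so nothing on disk is needed.
my $data = "a.fastq.gz\nb.fastq.gz\n";
open(my $fh, '<', \$data) or die "Can't open in-memory file: $!";

# Interpolating the handle itself yields its stringified reference.
my $as_string = "$fh";    # looks like GLOB(0x...)

# Reading from the handle yields the actual lines.
my @lines = <$fh>;
close $fh;
chomp @lines;

print "handle: $as_string\n";
print "lines:  @lines\n";
```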

However, you have other problems.

You are printing all of the filenames to a file, then reading them back out of the file. This is not only a questionable design (why read from the file again, since you already have what you need in an array?), it also won't even work:

Perl buffers file I/O for performance reasons. The lines you have written to the file may not actually be there yet, because Perl is waiting until it has a large chunk of data saved up, to write it all at once.

You can override this buffering behavior in a few different ways (closing the file handle being the simplest if you are done writing to it), but as I said, there is no reason to reopen the file again and read from it anyway.
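If you did want to read the file back, a sketch of the "close first" approach could look like this. The temporary file and the sample names are assumptions standing in for fastqfilenames.txt and your real list.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical scratch file standing in for fastqfilenames.txt.
my ($out, $path) = tempfile(UNLINK => 1);

print {$out} "sample_1.fastq.gz\n", "sample_2.fastq.gz\n";

# Closing the handle flushes the buffer, so the data is on disk before we read.
close $out or die "Can't close $path: $!";

open(my $in, '<', $path) or die "Can't open $path: $!";
my @names = <$in>;
close $in;
chomp @names;

print scalar(@names), " names read back\n";
```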

Also note, the /e option in a regex replacement evaluates the replacement as Perl code. This is not necessary in your case, so you should remove it.
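A small sketch of what /e changes, using a made-up string rather than your template:

```perl
use strict;
use warnings;

my $plain = 'width=4';
my $evald = 'width=4';

# Without /e the replacement is an interpolated string.
$plain =~ s/(\d+)/$1 * 2/;     # becomes "width=4 * 2"

# With /e the replacement is evaluated as a Perl expression.
$evald =~ s/(\d+)/$1 * 2/e;    # becomes "width=8"

print "$plain\n$evald\n";
```

In your substitution the replacement is just a variable, so /e adds nothing.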

Solution: Instead of reopening the file and reading it, just use the @fastqfiles variable you previously created when replacing in the template. It is not clear exactly what you mean by replacing @ with the filenames.

  • Do you want to replace each @ with a list of all filenames together? If so, you will need to join the filenames together in some way before doing the replacement.

  • Do you want to create a separate version of the template file for each filename? If so, you need an inner for loop that goes over each filename for each template line. You will also need something other than a plain in-place replacement, because the first pass through the loop would modify the original string and leave no @ for later passes. If you are on Perl 5.14 or later, you can use the /r option to replace non-destructively: push(@secontrol, s/@/$file_name/gr); Otherwise, copy the line to another variable before doing the replacement.
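Both interpretations can be sketched roughly as follows. The template lines, the paths, and the names @template, @joined and %per_file are assumptions for illustration only; the second part needs Perl 5.14 or later for /r.

```perl
use strict;
use warnings;
use v5.14;    # the /r modifier needs Perl 5.14 or later

# Hypothetical stand-ins for the template lines and the file list.
my @template   = ('input: @', 'done');
my @fastqfiles = ('/data/run1/a.fastq.gz', '/data/run1/b.fastq.gz');

# Interpretation 1: every @ becomes the whole list, joined with newlines.
my $all = join "\n", @fastqfiles;
my @joined;
for my $line (@template) {
    (my $copy = $line) =~ s/\@/$all/g;    # work on a copy, keep the template intact
    push @joined, $copy;
}

# Interpretation 2: one filled-in copy of the template per filename.
my %per_file;
for my $file (@fastqfiles) {
    # /r returns the changed string and leaves $_ (the template line) untouched.
    $per_file{$file} = [ map { s/\@/$file/gr } @template ];
}

print "$_\n" for @joined;
print "$_\n" for map { @{ $per_file{$_} } } @fastqfiles;
```

Either way the original @template stays unchanged, so it can be reused for the next file or folder.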

Upvotes: 0

Miguel Prz

Reputation: 13792

$_ =~ s/@/$fastqfilenames/eg;

$fastqfilenames is a file handle, not the file contents.

In any case, I recommend using the Text::Template module for this kind of work (text substitution in files).
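A rough sketch of how that might look, assuming Text::Template is installed from CPAN (it is not part of core Perl). The template text, the paths, and the names files and folder are made up for illustration. Note that Text::Template marks its placeholders with braces, so the template would use {...} instead of a bare @.

```perl
use strict;
use warnings;
use Text::Template;    # CPAN module, assumed to be installed

# Hypothetical data standing in for the real file list and folder.
my @fastqfiles = ('/data/run1/a.fastq.gz', '/data/run1/b.fastq.gz');

# Hypothetical template: code inside {...} is evaluated by Text::Template.
my $source = <<'END_TEMPLATE';
files:
{ join "\n", @files }
folder: {$folder}
END_TEMPLATE

my $template = Text::Template->new(TYPE => 'STRING', SOURCE => $source)
    or die "Can't build template: $Text::Template::ERROR";

# An array reference becomes @files in the template, a plain scalar becomes $folder.
my $result = $template->fill_in(
    HASH => { files => \@fastqfiles, folder => '/data/run1' },
);
defined $result or die "Can't fill in template: $Text::Template::ERROR";

print $result;
```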

Upvotes: 0
