Reputation: 21
I'm gradually working my way up the Perl learning curve (with thanks to contributors to this REALLY helpful site), but am struggling with how to approach this particular issue.
I'm building a Perl utility which uses three third-party (C++) programs. Normally these are run as: A $file_list | B -args | C $file_out
where process A reads multiple files, process B modifies each individual file and process C collects all input files in the pipe and produces a single output file, with a null input file signifying the end of the input stream.
The input files are large(ish), at around 100 MB each and around 10 in number. The processes are CPU intensive and the whole process needs to be applied to thousands of groups of files each day, so the simple solution of writing intermediate files to disk and reading them back is too inefficient. In addition, the process above is only part of a processing sequence, where the input files are already in memory and the output file also needs to be in memory for further processing.
There are a number of solutions to this already well documented and I have a prototype version utilising IPC::Open3(). So far, so good. :)
However - when piping each file from process A through process B, I need to modify the arguments to process B for each input file without interrupting the forward flow to process C. This is where I come unstuck and am looking for some suggestions.
As further background:
My apologies for the lack of "code to date", but I thought the question is more one of "How do I approach this?" rather than "How do I get my code to work?".
Any pointers or help would be very much appreciated.
Upvotes: 2
Views: 68
Reputation: 53498
If you're looking to feed output from different programs into the pipes, I'd suggest what you want to look at is ... well, pipe.
This lets you set up a pipe that works much like the ones you get from IPC::Open3, but gives you a bit more control over what you read from and write into it.
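As a rough illustration of the idea, here is a minimal sketch using Perl's built-in pipe and fork. The helper name send_through_pipe and the sample strings are made up for the example, and the forked child only stands in for a downstream program such as C; it counts the bytes it receives and reports the total back over a second pipe.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: send @chunks to a forked child over a pipe and
# return the number of bytes the child reports having read.
sub send_through_pipe {
    my @chunks = @_;

    pipe(my $data_r,  my $data_w)  or die "pipe failed: $!";   # parent -> child
    pipe(my $reply_r, my $reply_w) or die "pipe failed: $!";   # child -> parent

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: stands in for a downstream program reading its input.
        close $data_w;
        close $reply_r;
        binmode $data_r;
        my ($total, $buf) = (0, '');
        while (my $n = read($data_r, $buf, 65536)) { $total += $n }
        print {$reply_w} "$total\n";
        close $reply_w;            # flushes the reply before leaving
        POSIX::_exit(0);           # skip END blocks and inherited buffers
    }

    # Parent: decides exactly what goes into the pipe, and when.
    close $data_r;
    close $reply_w;
    binmode $data_w;
    print {$data_w} $_ for @chunks;
    close $data_w;                 # the child now sees end of input

    my $reply = <$reply_r>;
    close $reply_r;
    waitpid($pid, 0);
    chomp $reply;
    return $reply;
}

print send_through_pipe("first file\n", "second file\n"), "\n";
```

Because the parent holds the write end itself, it can send one file's data, do something else, and then carry on writing, which is the extra control over IPC::Open3 mentioned above. In a real pipeline the child would typically duplicate the read end onto STDIN and exec the external program rather than count bytes.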
Upvotes: 2
Reputation: 351
You need a fourth program (call it D) that determines what the arguments to B should be and executes B with those arguments and with D's stdin and stdout connected to B's stdin and stdout. You can then replace B with D in your pipeline.
What language you use for D is up to you.
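If D were written in Perl, a minimal sketch might look like the following. Everything specific here is an assumption for illustration: the helper build_b_command, the rule that picks arguments from the input name, and the --mode flag are all hypothetical, since B's real options are not given in the question. The point is only that exec replaces D with B, so B inherits D's stdin and stdout without any extra plumbing.

```perl
use strict;
use warnings;

# Hypothetical rule for choosing B's arguments. The "--mode" flag and the
# ".raw" test are placeholders, not real options of B.
sub build_b_command {
    my ($b_path, $input_name) = @_;
    my @args = $input_name =~ /\.raw$/
        ? ('--mode', 'raw')
        : ('--mode', 'default');
    return ($b_path, @args);
}

# Only act when called as: D.pl B_PATH INPUT_NAME
if (@ARGV >= 2) {
    my ($b_path, $input_name) = @ARGV;
    my @cmd = build_b_command($b_path, $input_name);

    # exec replaces this process with B. File descriptors 0 and 1 are
    # inherited, so B reads from D's stdin and writes to D's stdout.
    exec { $cmd[0] } @cmd or die "cannot exec $cmd[0]: $!";
}
```

Assuming the script is saved as D.pl, it would take B's place in the pipeline, for example: A $file_list | perl D.pl /path/to/B name | C $file_out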
Upvotes: 2