MikeT

Reputation: 21

Programmable arguments in Perl pipes

I'm gradually working my way up the Perl learning curve (with thanks to contributors to this REALLY helpful site), but am struggling with how to approach this particular issue.

I'm building a Perl utility which utilises three third-party (C++) programmes. Normally these are run as: A $file_list | B -args | C $file_out

where process A reads multiple files, process B modifies each individual file and process C collects all input files in the pipe and produces a single output file, with a null input file signifying the end of the input stream.

The input files are large(ish) at around 100 MB each, and there are around 10 of them. The processes are CPU intensive and the whole process needs to be applied to thousands of groups of files each day, so the simple solution of writing intermediate files to disk and reading them back is too inefficient. In addition, the process above is only part of a processing sequence, where the input files are already in memory and the output file also needs to be in memory for further processing.

There are a number of solutions to this already well documented and I have a prototype version utilising IPC::Open3(). So far, so good. :)

However, when piping each file from process A through process B, I need to modify the arguments to process B for each input file without interrupting the forward flow to process C. This is where I come unstuck and am looking for some suggestions.

As further background:

  1. Running on Ubuntu 16.04 LTS (currently within VirtualBox) with perl v5.22.1
  2. The programme will run on (and within) a single machine by one user (me!), i.e. no external network communication and no multi-user or public requirement, so simplicity of programming is preferred over strong security.
  3. Since the process must run repeatedly without interruption, robust/reliable I/O handling is required.
  4. I have access to the source code of each process, so that could be modified (although I'd prefer not to).

My apologies for the lack of "code to date", but I thought the question is more one of "How do I approach this?" rather than "How do I get my code to work?".

Any pointers or help would be very much appreciated.

Upvotes: 2

Views: 68

Answers (2)

Sobrique

Reputation: 53498

If you're looking to feed output from different programs into the pipes, I'd suggest what you want to look at is ... well, pipe (Perl's built-in pipe function).

This lets you set up a pipe that works much like the ones you get from IPC::Open3, but gives you a bit more control over what you read from and write to it.
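
As a rough sketch of that idea (not taken from the answer itself): pipe creates a connected pair of handles, and after a fork one process can run a series of commands, each with its own arguments, all writing into the same pipe, while the other end is read as one continuous stream. The sub name run_jobs_into_one_pipe and the printf commands standing in for B are made up for illustration, and the sketch assumes the downstream reader (C in the question) accepts the outputs back to back.

```perl
use strict;
use warnings;

# Run a list of commands one after another, each with its own arguments,
# and send all of their stdout into ONE shared pipe. Returns everything
# read from that pipe. Each job is an array ref: [ $program, @args ].
# (Hypothetical helper, for illustration only.)
sub run_jobs_into_one_pipe {
    my @jobs = @_;

    pipe(my $reader, my $writer) or die "pipe failed: $!";

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Producer child: point STDOUT at the pipe, then run each job.
        close $reader;
        open(STDOUT, '>&', $writer) or die "cannot dup pipe: $!";
        close $writer;
        for my $job (@jobs) {
            # List-form system avoids the shell; the job inherits STDOUT.
            system(@$job) == 0 or do { warn "job failed: @$job\n"; exit 1 };
        }
        exit 0;
    }

    # Parent: plays the part of the downstream consumer.
    close $writer;              # without this the read never sees EOF
    local $/;                   # slurp mode: read until EOF
    my $output = <$reader> // '';
    close $reader;
    waitpid($pid, 0);
    return $output;
}

# Stand-ins for "B with different arguments for each file".
my $out = run_jobs_into_one_pipe(
    [ 'printf', 'first:%s\n',  'a' ],
    [ 'printf', 'second:%s\n', 'b' ],
);
print $out;
```

Because the parent reads while the child is still writing, the pipe buffer does not fill up and stall the producer, which matters with inputs of the size described in the question.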

Upvotes: 2

JimNicholson

Reputation: 351

You need a fourth program (call it D) that determines what the arguments to B should be and executes B with those arguments and with D's stdin and stdout connected to B's stdin and stdout. You can then replace B with D in your pipeline.

What language you use for D is up to you.
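
A minimal sketch of such a D in Perl might look like the following. The lookup table, the --level option and the "small"/"large" keys are placeholders, since the question does not say how B's arguments are decided. The relevant point is that exec replaces D with B in the same process, so B inherits D's stdin and stdout and the rest of the pipeline is unaffected.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Placeholder rule for choosing B's arguments. The real rule is not given
# in the question, so this table and the --level option are assumptions.
my %ARGS_FOR = (
    small => [ '--level', '1' ],
    large => [ '--level', '9' ],
);

# Build the full command line for B from a key describing the input.
sub b_command {
    my ($kind) = @_;
    my $args = $ARGS_FOR{$kind}
        or die "no arguments known for '$kind'\n";
    return ('B', @$args);
}

# Usage inside the pipeline:  A $file_list | D large | C $file_out
# exec does not return on success: B takes over this process, together
# with the stdin and stdout that the pipeline gave to D.
if (@ARGV) {
    my @cmd = b_command($ARGV[0]);
    exec { $cmd[0] } @cmd or die "cannot exec $cmd[0]: $!\n";
}
```

This form chooses the arguments once per pipeline run. If the arguments have to change for each file within a single stream, D would instead need to start B once per file, which depends on how the files are delimited in the stream, something the question does not describe.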

Upvotes: 2
