Reputation: 23161
I'd like to use grep --count --fixed-strings --file=needles.txt < haystack.txt from a Node.js environment.
Instead of having a file for needles.txt, I have an array of strings to search for, and instead of haystack.txt I have a large string/buffer of text.
What's the best combination of child_process methods to use?
Something like:
import {spawn} from "child_process";
import {Readable} from "stream";
// haystack to search within
const haystack = "I am \n such a big string, do you\n see me?";
const readable = new Readable();
readable.push(haystack);
readable.push(null);
// the list of needles that would normally go in `--file=needles.txt`
const needles = ["find", "me", "or", "me"];
// spawn `fgrep`
// Q: How do I pass in `needles` as a string?
const fgrep = spawn(`fgrep`, [needles])
// pipe my haystack to fgrep
readable.pipe(fgrep.stdin);
Upvotes: 0
Views: 306
Reputation: 74670
For the grep args, -e lets you specify multiple patterns:
grep -e 1 -e 2
The JS for generating the args will be something like:
const needles = ["find", "me", "or", "me"];
// build a flat list of arguments: -e find -e me -e or -e me
const grep_pattern_args = needles.reduce((res, pattern) => {
  res.push('-e', pattern)
  return res
}, [])
const grep_args = [ '--count', '--fixed-strings', ...grep_pattern_args ]
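3000 needles is heading into the territory of hitting execve's argument size limits on Linux: each argument string is capped at MAX_ARG_STRLEN (128 KiB), and the combined size of all arguments and environment variables is capped as well (typically about 2 MiB with the default stack limit). If you have lengthy needles you may need to write them to a file in any case to be safe.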
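If the needles do outgrow the command line, one option is to write them to a temporary file and pass it to grep with --file. A minimal sketch, assuming the needles contain no newline characters (grep treats each line of the file as one pattern). The helper name needleArgs and the 64 KiB threshold are hypothetical choices for illustration, not anything defined by grep or Node.js:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical threshold, chosen to stay well below the 128 KiB per-argument limit.
const MAX_INLINE_BYTES = 64 * 1024;

// Returns grep pattern arguments: inline `-e` pairs for small lists,
// otherwise a single `--file=` argument pointing at a temp file
// that holds one needle per line.
function needleArgs(needles, maxInlineBytes = MAX_INLINE_BYTES) {
  const totalBytes = needles.reduce((sum, n) => sum + Buffer.byteLength(n) + 1, 0);
  if (totalBytes <= maxInlineBytes) {
    return needles.flatMap(n => ['-e', n]);
  }
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'needles-'));
  const file = path.join(dir, 'needles.txt');
  fs.writeFileSync(file, needles.join('\n') + '\n');
  return [`--file=${file}`];
}

const grep_args = ['--count', '--fixed-strings', ...needleArgs(['find', 'me', 'or'])];
console.log(grep_args);
```

The caller is responsible for deleting the temporary file once grep has exited.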
spawn is good as you get the writable stream back for stdin, which you can write to as the haystack is read/generated (assuming your Readable stream example setup is contrived):
const stdout = []
const stderr = []
const fgrep = spawn('/usr/bin/fgrep', grep_args, { stdio: ['pipe', 'pipe', 'pipe'] })
fgrep.on('error', console.error)
// For larger output you can process more on the stream.
fgrep.stdout.on('data', chunk => stdout.push(chunk))
fgrep.stderr.on('data', chunk => {
  process.stderr.write(chunk)
  stderr.push(chunk)
})
fgrep.on('close', (code) => {
  // grep exits with 0 when a line matched, 1 when nothing matched, and 2 on error
  if (code > 1) console.error(`grep process exited with code ${code}`)
  stdout.forEach(chunk => process.stdout.write(chunk))
})
// haystream is the Readable that supplies the haystack; pipe it into grep's stdin
haystream.pipe(fgrep.stdin)
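Putting the pieces together, the whole search can be wrapped in a function that resolves with the count. This is a sketch rather than a definitive implementation: countMatchingLines is a hypothetical name, grep is assumed to be on the PATH, and the block uses require so that it runs as a plain script. Note that --count reports the number of matching lines, not the number of needle occurrences.

```javascript
const { spawn } = require('child_process');

// Hypothetical helper: resolves with the number of haystack lines that
// contain at least one of the needles.
function countMatchingLines(haystack, needles) {
  return new Promise((resolve, reject) => {
    const args = ['--count', '--fixed-strings', ...needles.flatMap(n => ['-e', n])];
    const grep = spawn('grep', args);
    const stdout = [];
    const stderr = [];

    grep.on('error', reject);          // e.g. grep is not installed
    grep.stdin.on('error', () => {});  // ignore EPIPE if grep exits early
    grep.stdout.on('data', chunk => stdout.push(chunk));
    grep.stderr.on('data', chunk => stderr.push(chunk));
    grep.on('close', code => {
      // 0: at least one line matched, 1: no line matched, anything else: failure
      if (code === 0 || code === 1) {
        resolve(Number(Buffer.concat(stdout).toString().trim()));
      } else {
        reject(new Error(`grep exited with code ${code}: ${Buffer.concat(stderr)}`));
      }
    });

    // end() writes the haystack and then closes stdin so grep sees EOF
    grep.stdin.end(haystack);
  });
}

countMatchingLines('I am \n such a big string, do you\n see me?', ['find', 'me', 'or'])
  .then(count => console.log(count))
  .catch(console.error);
```

For a haystack that arrives in pieces, the single end(haystack) call could be replaced by piping a Readable into grep.stdin, as in the snippet above.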
Upvotes: 1