Reputation: 20448
When I run:
unzip -p /tmp/document.docx word/document.xml | sed -e 's/<\/w:p>/\\n/g; s/<[^>]\{1,\}>//g; s/[^[:print:]\n]\{1,\}//g'
It correctly extracts the text from my .docx file.
But when I try to wrap this in a Node.js program as follows:
const spawn = require("child_process").spawn;
const command = "unzip";
const child = spawn("sh", ["-c", "unzip -p /tmp/document.docx word/document.xml | sed -e 's/<\/w:p>/\\n/g; s/<[^>]\{1,\}>//g; s/[^[:print:]\n]\{1,\}//g'"]);
const stdout = child.stdout;
const stderr = child.stderr;
let output = "";
stderr.on("data", function(data) {
console.error("error on stderr", data.toString());
});
stdout.on("data", function(data) {
output += data;
});
child.on("close", function(code) {
console.log(output);
});
I get the following error message:
error on stderr sed: -e expression #1, char 10: unknown option to `s'
How do I fix this error?
Upvotes: 1
Views: 372
Reputation: 23565
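When you embed a command line in a JavaScript string like this, keep in mind that Node.js interprets the backslash before sed ever sees it, so every backslash has to be escaped with a second one: one for Node.js, one for the sed command.
In your string, \/ collapses to a plain /, so sed receives s/</w:p>/... and reads whatever follows the third slash as flags, hence the "unknown option to `s'" error.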
spawn("sh", ["-c", "unzip -p /tmp/document.docx word/document.xml | sed -e 's/<\\/w:p>/\\\\n/g; s/<[^>]\\{1,\\}>//g; s/[^[:print:]\\n]\\{1,\\}//g'"])
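If doubling every backslash by hand feels error-prone, one possible alternative is a String.raw template literal, which leaves backslashes exactly as typed. The following is only a sketch under that assumption: the helper name extractDocxText is hypothetical, and the approach works here only because the sed script contains no backticks and no ${ sequences.

```javascript
const { spawn } = require("child_process");

// String.raw leaves every backslash exactly as typed, so this is the same
// pipeline that works in the terminal, pasted unchanged.
const pipeline = String.raw`unzip -p /tmp/document.docx word/document.xml | sed -e 's/<\/w:p>/\\n/g; s/<[^>]\{1,\}>//g; s/[^[:print:]\n]\{1,\}//g'`;

// Hypothetical helper: runs the pipeline through sh and hands the collected
// stdout to the callback once the child process has closed.
function extractDocxText(callback) {
  const child = spawn("sh", ["-c", pipeline]);
  let output = "";
  child.stdout.on("data", (data) => {
    output += data;
  });
  child.stderr.on("data", (data) => {
    console.error("error on stderr", data.toString());
  });
  child.on("close", (code) => callback(code, output));
}
```

It could then be called as extractDocxText((code, text) => console.log(text)); the string handed to sh is character for character the same as the doubled-backslash version above.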
As T.J. Crowder puts it:
In JavaScript, the backslash has special meaning both in string literals and in regular expressions. If you want an actual backslash in the string or regex, you have to write two: \\.
Upvotes: 1