David Parks

Reputation: 32081

Piping unfiltered text to an awk system command's stdin

I have a gawk script that has accumulated a bunch of HTML in a variable, and now needs to pipe it to lynx via a system command.

(Feel free to tell me AWK is a bad solution... `while read LINE;` was wildly slow, so this is take 2.)

I tried this in awk:

    cmd = sprintf( "bash -c \'lynx -dump -force_html -stdin <<< \"%s\"\'", html )
    system ( cmd )

Bad idea. Simple test cases work, but with raw HTML, special-character and string-termination issues abound, and the escapes-within-escapes-within-escapes are getting mind-bogglingly complex.

lynx handles whatever I throw at it on stdin just fine; I just can't get data onto its stdin from awk without routing it through the command line, which seems like an unwieldy solution.


Edit (adding detail about my end goal, in case awk isn't a good approach):

What I want is to parse HTML out of a large text file that has delimiters between blocks of HTML. I need to pass each block to lynx to be formatted, and dump the result into a new, big text file.


Example input (a dump from another system):

**********URL: http://some/url
<html>
<head><title>Any 'ol HTML document</title></head>
<body>
<p>With pretty much any character you can imagine at some point</p>
<p>I'm using lynx to strip off the HTML and give me a nice format</p>
</body>
</html>
**********URL: http://another/url
<html><head><title>My input file provides a few 100,000 such html documents</title></head>
<body/></html>

Each HTML document should be fed through lynx -dump. Lynx can read the HTML from a file (a named pipe works too) or from stdin (with the -stdin option).
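For example, this is the kind of invocation I mean (guarded so the snippet still runs if lynx isn't installed):

```shell
# Quick sanity check of lynx -dump -force_html -stdin
# (falls back to a message when lynx is not installed).
html='<html><body><p>Hello from stdin</p></body></html>'
if command -v lynx >/dev/null 2>&1; then
  printf '%s\n' "$html" | lynx -dump -force_html -stdin
else
  echo 'lynx not installed'
fi
```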

My output is then:

**********URL: http://some/url
  Any 'ol HTML document

  With pretty much any character you can imagine at some point
  I'm using lynx to strip off the HTML and give me a nice format
**********URL: http://another/url
  My input file provides a few 100,000 such html documents

Upvotes: 0

Views: 1306

Answers (2)

David Parks

Reputation: 32081

To add to n0741337's answer, here's an example using gawk coprocesses that I put together after reading his answer. It takes "aline" from stdin, pipes it to a cat coprocess, captures the coprocess's output, and prints it:

printf "aline" | awk '
  BEGIN{cmd="cat"} 
  {
    print $0 |& cmd; 
    close(cmd, "to"); 
    while ((cmd |& getline line) > 0) { 
      print "got", line 
    }; 
    close (cmd);
  }'

result: got aline

The gawk manual has a more extensive discussion of this feature: http://www.gnu.org/software/gawk/manual/html_node/Two_002dway-I_002fO.html#Two_002dway-I_002fO

Upvotes: 0

n0741337

Reputation: 2514

Try |& in gawk, which I found out about here. It lets you send output from gawk to the stdin of another command running as a coprocess.

Upvotes: 1
