Village

Reputation: 24363

How to make any sed, awk, and grep command ignore lines beginning with #?

I have a script containing a wide variety of sed, awk, and grep commands. The only similarity is that each one reads an input file, edits it, and writes the result to an output file. Here are some basic examples:

sed "/^word$/d" input.txt > output.txt
grep "word" input.txt > output.txt
awk 'sub(/.*{/,"")' RS='}' input.txt > output.txt

I need to edit all of these commands so that they always ignore lines that are commented out with a # at the start of the line. Is there a single solution, e.g. using a pipe, that makes any sed, awk, or grep command ignore such lines, or do I need to use built-in features of each tool?

Upvotes: 2

Views: 4392

Answers (3)

Ed Morton

Reputation: 203403

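Put grep -v '^#' file | in front of each command, and drop the file argument from the command itself so that it reads the filtered lines from standard input.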
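
As a minimal sketch of what that looks like for the sed and grep examples from the question, assuming a small sample file created here purely for illustration (the workdir variable and the output file names are hypothetical):

```shell
#!/bin/sh
# Hypothetical sample input, created only for this illustration.
workdir=$(mktemp -d)
printf '%s\n' '# word' 'word' 'a word here' '#word' 'other' > "$workdir/input.txt"

# The file name moves to grep -v; sed and grep then read the
# comment-free lines from standard input instead of opening the file.
grep -v '^#' "$workdir/input.txt" | sed '/^word$/d' > "$workdir/sed_output.txt"
grep -v '^#' "$workdir/input.txt" | grep 'word' > "$workdir/grep_output.txt"

cat "$workdir/sed_output.txt" "$workdir/grep_output.txt"
```

Note that the commented lines are dropped from the output entirely; they are not passed through unchanged.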

Upvotes: 2

Kaz

Reputation: 58578

You can come close with Bash process substitution, which can use functions in your scripts, not only external commands:

#!/bin/bash

strip_comments()
{
   sed -e '/^#/d' "$1"
}

# like grep "$1" "$2", but on comment-stripped "$2"
grep "$1" <(strip_comments "$2")

# ... other commands that use <(strip_comments ...)

Of course, grep can no longer report the original file name or line numbers, because it only sees the filtered stream. Ideally we would want some kind of macro language to hide this stuff.
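
If the original line numbers matter, one possible workaround is to filter and match in a single awk pass over the real file, so that FNR still counts the original lines. This is a sketch of a different technique, not part of the process-substitution approach, and the sample file and the numbered variable are hypothetical:

```shell
#!/bin/sh
# Hypothetical sample file, created only for this illustration.
sample=$(mktemp)
printf '%s\n' '# a word in a comment' 'first word' 'nothing' '#word' 'last word' > "$sample"

# Skip comment lines and print each match prefixed with its line number
# in the unfiltered file, similar in spirit to "grep -n word".
numbered=$(awk '!/^#/ && /word/ { print FNR ":" $0 }' "$sample")
printf '%s\n' "$numbered"
rm -f "$sample"
```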

To make the syntax look like

grep pattern input

we would need macro-preprocessing: something which recognizes the symbol input, and macro-substitutes in the <(whatever ...) syntax.

If you just put that into a variable and then use $variable, that won't work, because process substitution is not performed on the result of a variable expansion; however, it will work if eval is used. With eval, you then have to escape things against the double evaluation; it's not pretty.

input='<(strip_comments input.txt)'  # quoted: this is like a symbol macro

# ...

eval grep '"$pattern"' "$input"  # "$pattern" stays quoted when eval re-parses the line

(Given that the original Bourne shell was developed by a man who began the program with #include <algol.h>, a header full of macros for making C look like Algol, it's ironic that the macro capability isn't there.)
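
If the goal is only to avoid repeating the filter, a plain shell function is one alternative to eval. This is a sketch of a different technique, not the macro approach described above; the name without_comments and the sample file are hypothetical:

```shell
#!/bin/sh
# usage: without_comments FILE COMMAND [ARGS...]
# Runs COMMAND with the comment-free contents of FILE on its standard input.
without_comments() {
    file=$1
    shift
    grep -v '^#' "$file" | "$@"
}

# Hypothetical sample file, created only for this illustration.
sample=$(mktemp)
printf '%s\n' '# word' 'word' 'another word' '#word' > "$sample"

matches=$(without_comments "$sample" grep 'word')
printf '%s\n' "$matches"
rm -f "$sample"
```

Because the command and its arguments are passed as "$@", no second round of evaluation takes place and no extra escaping is needed.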

Upvotes: 1

jaypal singh

Reputation: 77095

You can make the following changes to your commands to ignore lines beginning with #:

For sed (the original command already leaves comment lines untouched, because the anchored pattern ^word$ cannot match a line that starts with #; the explicit form is):

sed '/^#/!{/^word$/d;}' input.txt > output.txt

For grep:

grep -v '^#' input.txt | grep "word" > output.txt

Or, as suggested by @devnull in the comments, if your grep supports the -P option (Perl-compatible regular expressions), you can do:

grep -P '^(?!#).*word' input.txt > output.txt

Here (?!#) is a negative lookahead that tells grep to match only lines that contain word and do not have # at the beginning of the line.

For awk:

awk '!/^#/ && sub(/.*{/,"")' RS='}' input.txt > output.txt
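
Two details of the awk case are easy to miss. A pattern-only program prints every record for which the expression is true, whereas an action block containing only sub() prints nothing. Also, with RS='}' the input is split at closing braces, so /^#/ is tested against the start of each record rather than each line. A minimal sketch, assuming a hypothetical sample file and writing the brace as [{] purely for portability across awk implementations:

```shell
#!/bin/sh
# Hypothetical sample file, created only for this illustration.
sample=$(mktemp)
printf '%s\n' 'a{one}' '# b{two}' 'c{three}' > "$sample"

# Record-level test: the comment line is not at the start of a record
# (its record begins with a newline), so /^#/ does not filter it out.
in_awk=$(awk -v RS='}' '!/^#/ && sub(/.*[{]/,"")' "$sample")

# Line-level filter first, then the original awk program unchanged.
prefiltered=$(grep -v '^#' "$sample" | awk -v RS='}' 'sub(/.*[{]/,"")')

printf 'in awk:\n%s\nprefiltered:\n%s\n' "$in_awk" "$prefiltered"
rm -f "$sample"
```

When the record separator is not a newline, filtering the lines with grep -v before awk sees them is the more predictable choice.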

Upvotes: 3
