duxsco

Reputation: 361

grep match with following lines that start with whitespace

The Problem

Let's say I have a file with the following content:

lmtp_bind_address (default: empty)
       The LMTP-specific version of the smtp_bind_address configuration parameter.  See there for details.

       This feature is available in Postfix 2.3 and later.

lmtp_bind_address6 (default: empty)
       The LMTP-specific version of the smtp_bind_address6 configuration parameter.  See there for details.

       This feature is available in Postfix 2.3 and later.

lmtp_body_checks (default: empty)
       The LMTP-specific version of the smtp_body_checks configuration parameter. See there for details.

       This feature is available in Postfix 2.5 and later.

I want to get the line starting with "lmtp_bind_address6" and all following lines that start with a whitespace:

lmtp_bind_address6 (default: empty)
       The LMTP-specific version of the smtp_bind_address6 configuration parameter.  See there for details.

       This feature is available in Postfix 2.3 and later.

How can I do this in bash?

Background Info

I am moving away from CentOS 7 and am setting up Postfix on Debian 8. Therefore, I intend to read the postconf(5) manpage ("man 5 postconf"), which is 7255 lines long as of Postfix version 2.11.3. In order to speed things up, I want to group the Postfix options in the manpage that cover the same thing for different protocols, such as:

smtpd_tls_mandatory_ciphers
lmtp_tls_mandatory_ciphers
smtp_tls_mandatory_ciphers
tlsproxy_tls_mandatory_ciphers

As you can see above, each of these four options selects the TLS ciphers, but for a different protocol (SMTPD, LMTP etc.).

First, I grouped the Postfix options by sorting their names from the end, so that options with a common suffix end up next to each other:

postconf | awk -F"=" '{print $1}' | sed 's/ //g' | rev | sort | \
rev > postconf_options.txt
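
To illustrate why the rev | sort | rev pipeline groups related options, here is a minimal sketch that feeds it a few option names by hand instead of live postconf output. The function name group_by_suffix and the printf input are made up for illustration, and LC_ALL=C is an added assumption that only makes the sort order reproducible. Reversing each name makes sort compare the names from their last character, so names that share a suffix become neighbours.

```shell
# Hypothetical helper: sort lines by their reversed spelling.
# LC_ALL=C is an assumption to get a locale-independent order.
group_by_suffix() {
  rev | LC_ALL=C sort | rev
}

# Stand-in input: the four option names from above plus one unrelated name.
printf '%s\n' \
  smtpd_tls_mandatory_ciphers \
  lmtp_bind_address \
  lmtp_tls_mandatory_ciphers \
  smtp_tls_mandatory_ciphers \
  tlsproxy_tls_mandatory_ciphers | group_by_suffix
```

The four *_tls_mandatory_ciphers names come out as one contiguous group, and the unrelated name is sorted apart from them.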

Then, I dumped the manpage to a file:

man 5 postconf > postconf_manpage.txt

Now I want to go through the grouped list and, for every option, write the matching line plus the following lines that start with whitespace to a third file:

while IFS= read -r I; do
  grep -w "^$I" postconf_manpage.txt
done < postconf_options.txt > postconf_manpage_grouped.txt

The grep command above prints only the matching line, not the following lines that start with whitespace.

Solution

I used double quotes with sed in order to be able to use variables. Here is the complete procedure with the solution from hek2mgl:

postconf | awk -F"=" '{print $1}' | sed 's/ //g' | rev | sort | rev > postconf_options.txt
man 5 postconf > postconf_manpage.txt
while IFS= read -r I; do
  sed -n "/^$I/{p;:a;n;/^[[:space:]]\|^$/{p;ba}}" postconf_manpage.txt
done < postconf_options.txt > postconf_manpage_grouped.txt
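
One caveat with the loop: /^$I/ is a prefix match, so the name lmtp_bind_address also matches the header of lmtp_bind_address6. A possible tightening, offered as a sketch rather than as part of the original solution, is to require the " (" that follows the name in the header. This assumes every header has the form "name (default: ...)" as in the excerpt above, and GNU sed for \|. The helper name extract_entry and the sample file are made up for illustration; the entries are reordered and trimmed so the two similar names are not adjacent.

```shell
# Hypothetical helper: print the entry whose header line starts with "<name> (".
# Assumes GNU sed (\| alternation) and headers like "name (default: ...)".
extract_entry() {
  sed -n "/^$1 (/{p;:a;n;/^[[:space:]]\|^$/{p;ba;};}" "$2"
}

# Made-up sample: entries from the question, trimmed and reordered.
manpage=$(mktemp)
cat > "$manpage" <<'EOF'
lmtp_bind_address (default: empty)
       The LMTP-specific version of the smtp_bind_address configuration parameter.

lmtp_body_checks (default: empty)
       The LMTP-specific version of the smtp_body_checks configuration parameter.

lmtp_bind_address6 (default: empty)
       The LMTP-specific version of the smtp_bind_address6 configuration parameter.
EOF

# read -r keeps backslashes literal; IFS= preserves surrounding whitespace.
printf '%s\n' lmtp_bind_address6 lmtp_bind_address |
while IFS= read -r name; do
  extract_entry "$name" "$manpage"
done
```

With the stricter pattern, asking for lmtp_bind_address prints only that entry, not the lmtp_bind_address6 entry further down.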

Upvotes: 4

Views: 1434

Answers (2)

hek2mgl

Reputation: 157947

I would use sed:

sed -n '/^lmtp_bind_address6/{p;:a;n;/^[[:space:]]\|^$/{p;ba}}' file
  • /^lmtp_bind_address6/ selects the line that starts with the search pattern.
  • {p;:a;n;/^[[:space:]]\|^$/{p;ba}} runs on that line. p prints it, and :a defines a label that serves as a jump target for looping over the following lines. n reads the next line into the pattern space. /^[[:space:]]\|^$/ matches a line that starts with whitespace or is empty (\| is a GNU sed extension). If it matches, p prints the line and ba jumps back to a; otherwise the loop ends.
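
The same script can be spelled out one command per line, which may make the control flow easier to follow. This is a sketch, not a replacement for the one-liner: the wrapper name print_entry is made up, the sample file is the excerpt from the question, and the alternation is split into two separate tests so that \| is not needed.

```shell
# Sample input: the excerpt from the question.
file=$(mktemp)
cat > "$file" <<'EOF'
lmtp_bind_address (default: empty)
       The LMTP-specific version of the smtp_bind_address configuration parameter.  See there for details.

       This feature is available in Postfix 2.3 and later.

lmtp_bind_address6 (default: empty)
       The LMTP-specific version of the smtp_bind_address6 configuration parameter.  See there for details.

       This feature is available in Postfix 2.3 and later.

lmtp_body_checks (default: empty)
       The LMTP-specific version of the smtp_body_checks configuration parameter. See there for details.

       This feature is available in Postfix 2.5 and later.
EOF

# Hypothetical wrapper around the expanded script.
print_entry() {
  sed -n '
# Header line found: print it, then enter the loop.
/^lmtp_bind_address6/{
p
:a
# Replace the pattern space with the next input line.
n
# Line starts with whitespace: print it and loop again.
/^[[:space:]]/{
p
ba
}
# Empty line: print it and loop again.
/^$/{
p
ba
}
# Anything else is the next header, so the loop ends here.
}
' "$1"
}

print_entry "$file"
```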

Upvotes: 3

anubhava

Reputation: 784968

I want to get the line starting with "lmtp_bind_address6" and all following lines that start with a whitespace

You can use this awk:

awk 'p && /^[^[:blank:]]/{p=0} /^lmtp_bind_address6/{p=1} p' file

lmtp_bind_address6 (default: empty)
       The LMTP-specific version of the smtp_bind_address6 configuration parameter.  See there for details.

       This feature is available in Postfix 2.3 and later.
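  • /^lmtp_bind_address6/{p=1} sets p=1 when the desired pattern is found at the start of a line.
  • p && /^[^[:blank:]]/{p=0} resets p to 0 when p is set and the line starts with a non-blank character, i.e. at the next header. Empty lines do not match, so they stay inside the block.
  • The trailing p prints the current line whenever p is non-zero.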
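
For use in a loop over many option names, the same flag idea can be wrapped so that the name is passed in with -v instead of being hardcoded. This is only a sketch under assumptions: the function name print_option and the trimmed sample file are made up, and the two rules are folded into one that compares the first word of each header exactly, which differs slightly from the prefix match in the one-liner.

```shell
# Hypothetical wrapper: print the entry for the option named in $1 from file $2.
print_option() {
  awk -v name="$1" '
    # A non-indented, non-empty line is a header: switch printing on only
    # if its first word is exactly the requested name, otherwise off.
    /^[^[:blank:]]/ { p = ($1 == name) }
    # Print every line while inside the requested entry.
    p
  ' "$2"
}

# Made-up sample: two entries from the question, trimmed.
sample=$(mktemp)
cat > "$sample" <<'EOF'
lmtp_bind_address (default: empty)
       The LMTP-specific version of the smtp_bind_address configuration parameter.

lmtp_bind_address6 (default: empty)
       The LMTP-specific version of the smtp_bind_address6 configuration parameter.

       This feature is available in Postfix 2.3 and later.
EOF

print_option lmtp_bind_address6 "$sample"
```

Because the comparison is on the whole first word, print_option lmtp_bind_address does not also pick up the lmtp_bind_address6 entry.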

Upvotes: 1
