sherpaurgen

Reputation: 3274

Extract the lines from a file with bash or Python

Here is the content of my file, which is the output of pflogsumm:

Host/Domain Summary: Messages Received 
---------------------------------------
 msg cnt   bytes   host/domain
 -------- -------  -----------
    415     5416k  abc.com
     13    19072   xyz.localdomain

Senders by message count
------------------------
    415   [email protected]
     13   [email protected]

Recipients by message count
---------------------------
    506   [email protected]            <= Extract from here to ...
     70   [email protected]
     ..
     ...
     19   [email protected]
     17   [email protected]
     13   [email protected]           <= Extract ends here

Senders by message size
-----------------------
   5416k  [email protected]
...
 ...

The output seems to have the information fields separated by a "title" and a "new line". For example: Recipients by message count ... <contents of interest> ... NewLine. I tried the sed expression below, but it returns all lines after the line matching "Recipients by message count":

sed -nr '/.*Recipients by message count/,/\n/ p'

Desired output: All emails under "Recipients by message count"

Upvotes: 1

Views: 127

Answers (6)

SLePort

Reputation: 15461

Another sed one-liner:

 sed '/Recipients by message count/,/^$/!d;//{N;d};' file

Upvotes: 0

Andreas Louv

Reputation: 47117

Using awk:

awk '/Recipients by message count/{p=1}!$0{p=0}p' input_file

This prints the "Recipients by message count" block, from the title line up to (but not including) the next empty line.

Breakdown:

/Recipients by message count/ {p=1} # When /pattern/ is matched set p = 1
!$0 {p=0}                           # When input line is empty set p = 0
p                                   # Print line if p is true, short for:
                                    # p { print $0 }
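
For comparison, here is a minimal Python sketch of the same flag idea. The function name recipients_block and the SAMPLE text are made up for illustration (the addresses are placeholders, not real pflogsumm output). One small difference: awk's !$0 is also true for a line containing just 0, while this sketch only treats a truly empty line as the end of the block.

```python
# Placeholder input that mimics the layout of the pflogsumm report.
SAMPLE = (
    "Senders by message count\n"
    "------------------------\n"
    "    415   sender-a\n"
    "\n"
    "Recipients by message count\n"
    "---------------------------\n"
    "    506   recipient-a\n"
    "     70   recipient-b\n"
    "\n"
    "Senders by message size\n"
    "-----------------------\n"
    "   5416k  sender-a\n"
)


def recipients_block(lines, title="Recipients by message count"):
    """Yield the title line and everything after it up to the next empty line."""
    printing = False
    for line in lines:
        line = line.rstrip("\n")   # works for file objects and plain lists alike
        if title in line:
            printing = True        # same role as /pattern/ {p=1}
        if not line:
            printing = False       # same role as !$0 {p=0}
        if printing:
            yield line             # same role as the bare p


for line in recipients_block(SAMPLE.splitlines()):
    print(line)
```

As with the awk version, the title and the dashed underline are part of the result; drop the first two yielded lines if only the addresses are wanted.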

Upvotes: 4

sjsam

Reputation: 21965

The script below:

sed -n '/Recipients/{n;n;:loop;/^$/!{p;n;b loop};q}' filename

will do the job for you.

Note: if the block of interest is at the very end of the file, it must be followed by a trailing blank line.

Upvotes: 1

TessellatingHeckler

Reputation: 29033

An awk command: for the lines between "Recipients" and "Senders", print each line that starts with a space.

[name@server ~]$ awk '/^Recipients/,/^Senders/ { if ($0~/^ /) print }' input.txt
    506   [email protected]            <= Extract from here to ...
     70   [email protected]
     ..
     ...
     19   [email protected]
     17   [email protected]
     13   [email protected]           <= Extract ends here
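
If Python is preferred, the same range idea could be sketched roughly as follows. The name indented_between and the SAMPLE text are hypothetical, and the addresses are placeholders. Unlike an awk range pattern, which can start again if "Recipients" shows up later in the file, this sketch stops at the first "Senders" line after the start.

```python
# Placeholder input that mimics the layout of the pflogsumm report.
SAMPLE = (
    "Senders by message count\n"
    "------------------------\n"
    "    415   sender-a\n"
    "\n"
    "Recipients by message count\n"
    "---------------------------\n"
    "    506   recipient-a\n"
    "     70   recipient-b\n"
    "\n"
    "Senders by message size\n"
    "-----------------------\n"
    "   5416k  sender-a\n"
)


def indented_between(lines, start="Recipients", stop="Senders"):
    """Return the lines that begin with a space between the start and stop titles."""
    inside = False
    result = []
    for line in lines:
        line = line.rstrip("\n")
        if not inside:
            # The start title itself never begins with a space, so skip it.
            inside = line.startswith(start)
            continue
        if line.startswith(stop):
            break                      # first stop title ends the range
        if line.startswith(" "):
            result.append(line)        # keeps data rows, drops underline and blanks
    return result


for line in indented_between(SAMPLE.splitlines()):
    print(line)
```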

Upvotes: 0

Jorgen

Reputation: 195

Something like this:

    findthis = "Recipients by message count"

    with open("tst.dat") as f:
        for line in f:
            if findthis not in line:
                continue              # keep scanning until the title is found
            next(f, None)             # skip the dashed underline below the title

            for line in f:
                line = line.rstrip()  # get rid of trailing whitespace
                if line == "":        # an empty line ends the block
                    break
                print(line)
            break                     # block printed, no need to read further

If the file is big or you need wildcard matching, use the regular expression library (re).
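
A rough sketch of what the regular expression route might look like, assuming the whole report fits in memory as one string. The names BLOCK and recipients, and the SAMPLE text with its placeholder addresses, are made up for illustration.

```python
import re

# Placeholder input that mimics the layout of the pflogsumm report.
SAMPLE = (
    "Senders by message count\n"
    "------------------------\n"
    "    415   sender-a\n"
    "\n"
    "Recipients by message count\n"
    "---------------------------\n"
    "    506   recipient-a\n"
    "     70   recipient-b\n"
    "\n"
    "Senders by message size\n"
)

BLOCK = re.compile(
    r"^Recipients by message count[ \t]*\n"  # title line, trailing blanks allowed
    r"-+[ \t]*\n"                            # dashed underline
    r"((?:.+\n?)+)",                         # one or more non-empty lines
    re.MULTILINE,
)


def recipients(text):
    """Return the lines under the title, or an empty list if it is missing."""
    match = BLOCK.search(text)
    if match is None:
        return []
    return [line.rstrip() for line in match.group(1).splitlines()]


for line in recipients(SAMPLE):
    print(line)
```

For a real file, something like recipients(open("tst.dat").read()) would feed it the same input as the loop above.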

Upvotes: 1

riteshtch

Reputation: 8769

$ sed -n '/Recipients by message count/,/^\s*$/ p' data | sed -n '1!{2!{$!p}}'
    506   [email protected]            <= Extract from here to ...
     70   [email protected]
     ..
     ...
     19   [email protected]
     17   [email protected]
     13   [email protected]           <= Extract ends here

Upvotes: 2
