Maxim Veksler

Reputation: 30232

Selecting a section of a file using sed based on a predefined header

This expression sed -n '/statistics:/,/^ [^ ]/ p' selects the following section

  Channel statistics:
    Red:
      min: 0 (0)
      max: 255 (1)
      mean: 114.237 (0.447987)
      standard deviation: 115.1 (0.451372)
      kurtosis: -1.92845
      skewness: 0.0962143
    Green:
      min: 0 (0)
      max: 255 (1)
      mean: 113.318 (0.444384)
      standard deviation: 113.041 (0.443298)
      kurtosis: -1.94057
      skewness: 0.0648024
    Blue:
      min: 0 (0)
      max: 255 (1)
      mean: 111.01 (0.435332)
      standard deviation: 110.498 (0.433324)
      kurtosis: -1.92769
      skewness: 0.0747213
  Image statistics:

From the following file:

Image: /tmp/magick-XXpWFUXl
  Base filename: -
  Format: MIFF (Magick Image File Format)
  Class: DirectClass
  Geometry: 480x360+0+0
  Resolution: 72x72
  Print size: 6.66667x5
  Units: Undefined
  Type: TrueColor
  Base type: TrueColor
  Endianess: Undefined
  Colorspace: RGB
  Depth: 8-bit
  Channel depth:
    red: 8-bit
    green: 8-bit
    blue: 8-bit
  Channel statistics:
    Red:
      min: 0 (0)
      max: 255 (1)
      mean: 114.237 (0.447987)
      standard deviation: 115.1 (0.451372)
      kurtosis: -1.92845
      skewness: 0.0962143
    Green:
      min: 0 (0)
      max: 255 (1)
      mean: 113.318 (0.444384)
      standard deviation: 113.041 (0.443298)
      kurtosis: -1.94057
      skewness: 0.0648024
    Blue:
      min: 0 (0)
      max: 255 (1)
      mean: 111.01 (0.435332)
      standard deviation: 110.498 (0.433324)
      kurtosis: -1.92769
      skewness: 0.0747213
  Image statistics:
    Overall:
      min: 0 (0)
      max: 255 (1)
      mean: 84.6411 (0.331926)
      standard deviation: 109.309 (0.428662)
      kurtosis: -1.6052
      skewness: 0.582669
  Rendering intent: Undefined
  Interlace: None
  Background color: white
  Border color: rgb(223,223,223)
  Matte color: grey74
  Transparent color: black
  Compose: Over
  Page geometry: 480x360+0+0
  Dispose: Undefined
  Iterations: 0
  Compression: Zip
  Orientation: Undefined
  Properties:
    date:create: 2011-12-07T12:33:31+02:00
    date:modify: 2011-12-07T12:33:31+02:00
    signature: f2adc51db916151ddcc5b206a8921eec0234efa1eeb7484c0046506b749bc392
  Artifacts:
    verbose: true
  Tainted: False
  Filesize: 179KB
  Number pixels: 173KB
  Pixels per second: 0b
  User time: 0.000u
  Elapsed time: 0:01.000
  Version: ImageMagick 6.6.0-4 2011-06-15 Q16 http://www.imagemagick.org

The expression is taken from the following page: http://www.imagemagick.org/Usage/compare/

Upvotes: 1

Views: 110

Answers (2)

jaypal singh

Reputation: 77155

Your expression: sed -n '/statistics:/,/^ [^ ]/ p'

How and why does it work?

In its most common form, sed follows the syntax sed 's/pattern/replacement/[g]', where s stands for substitution and the optional g at the end makes the replacement global (that is, applied to every match in a line rather than only the first).

But sed can do much more. It has the ability to restrict the operation to certain lines. You can do that by -

 1. Specifying a line by its number. 
 2. Specifying a range of lines by number.
 3. All lines containing a pattern.
 4. All lines from the beginning of a file to a regular expression
 5. All lines from a regular expression to the end of the file.
 6. All lines between two regular expressions.

What is the sed format ?

Your sed command uses the last form. It starts at the line that contains statistics: and continues through the first following line that begins with exactly two spaces followed by a non-space character, i.e. __[^_] where _ stands for a space:

sed -n '/statistics:/,/^ [^ ]/ p'
   |   ||           | |      | |
    ---  -----------   ------  V
     |        |           |    Since we suppressed
 Suppress This is     This is  the output, we need
  output   your         your   to invoke print
           start        end
           range       range

Why it selects the Channel statistics: section but not the Image statistics:?

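The range ends at the first line after the start that matches the end address. The lines under Channel statistics: are indented by more than two spaces, so they do not match. Image statistics: is indented by exactly two spaces, so it matches, is printed, and closes the range. The lines after it fall outside the range and are not displayed. If you want to include the Image statistics: section, you can change the end address of the range like this: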

sed -n '/statistics:/,/^  Ren.*/p'
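To see the effect of moving the end address, here is a small sketch on a cut-down, hypothetical version of the identify output (the shortened text and the variable names are assumptions). It also shows one possible way to drop the closing Rendering intent line from the result, which the original command does not do.

```shell
# Hypothetical, shortened stand-in for the real identify output.
info=$(cat <<'EOF'
  Depth: 8-bit
  Channel statistics:
    Red:
      min: 0 (0)
  Image statistics:
    Overall:
      min: 0 (0)
  Rendering intent: Undefined
  Interlace: None
EOF
)

# The range now runs from the first "statistics:" line through the
# "  Rendering intent" line, so both statistics sections are printed.
with_end=$(printf '%s\n' "$info" | sed -n '/statistics:/,/^  Ren/p')

# Same range, but lines matching the end address are not printed.
without_end=$(printf '%s\n' "$info" | sed -n '/statistics:/,/^  Ren/{/^  Ren/!p;}')

printf '%s\n' "$without_end"
```

The trailing .* in the original end address is harmless but not needed, since a regex address only has to match somewhere in the line.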

Why -n and p? In its default mode, sed prints every line. Each line is read into the pattern space, the commands are applied to it, and then the line is printed followed by a newline. The command here is p (print), so without -n the entire text would be printed and the lines that fall inside the range would be printed twice. To prevent this we pass -n, which suppresses the automatic printing so that nothing is printed unless a command explicitly requests it.
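The difference is easy to check on a tiny input. This is only a sketch with a made-up three-line sample; the variable names are assumptions.

```shell
# Hypothetical three-line input.
input=$(printf 'a\nb\nc\n')

# With -n, only the line selected by the address is printed.
with_n=$(printf '%s\n' "$input" | sed -n '/b/p')

# Without -n, every line is auto-printed and the matching line
# is printed a second time by the explicit p command.
without_n=$(printf '%s\n' "$input" | sed '/b/p')

printf 'with -n:\n%s\n\nwithout -n:\n%s\n' "$with_n" "$without_n"
```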

Upvotes: 1

Kent

Reputation: 195209

How and why does it work?

see below

What is the sed format ?

The format is address1,address2 followed by a command: sed -n 'address1,address2 p'

Why it selects the Channel statistics: section but not the Image statistics: ?

First, I would say that the sed line in your question is not exactly the same as the line on the page you linked. It should be sed -n '/statistics:/,/^(two spaces)[^ ]/ p'

see this example:

kent$  cat file1
x_1
 1
 2
 3
 o
x_2
 4
 5
 6

kent$  sed -n '/x/,/^[^ ]/p' file1
x_1
 1
 2
 3
 o
x_2

I think this is quite similar to your file.

What does sed do?

1. It finds the first match of address1, which is /x/, so x_1 is found and printed.

2. It then tests each following line against address2, /^[^ ]/, which matches a line that does not start with a space. As long as a line does not match, it is printed and the range stays open.

3. x_2 starts with x, not a space, so it matches address2. It is printed and the range ends.

After x_2 there are no more lines matching address1 /x/, so x_2 is the last line printed.

Your image file example works the same way. The only difference is that address2 in your case matches a line starting with exactly two spaces followed by a non-space character.
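The same walk-through can be tried with the two-space end address on a shortened, hypothetical stand-in for the identify output (the sample text and variable names below are assumptions, kept small so the result is easy to check by eye).

```shell
# Hypothetical, shortened stand-in for the real identify output.
info=$(cat <<'EOF'
Image: sample
  Depth: 8-bit
  Channel statistics:
    Red:
      min: 0 (0)
  Image statistics:
    Overall:
      min: 0 (0)
  Rendering intent: Undefined
EOF
)

# address1 opens the range at "  Channel statistics:".
# address2 (two spaces, then a non-space) is tested from the next line on;
# "    Red:" and "      min: ..." have a space in third position, so they
# do not match. "  Image statistics:" matches, is printed, and closes the range.
section=$(printf '%s\n' "$info" | sed -n '/statistics:/,/^  [^ ]/p')

printf '%s\n' "$section"
```

Note that Image statistics: does not open a second range, because the line that closes a range is not re-tested against address1.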

Just my 2 cents. Hope it helps.

Upvotes: 0
