Find and modify a pattern within the start and end patterns and update file

Question

I want to find and modify a pattern between a start pattern and an end pattern, and update multiple files. I have broken the task into steps, in case this can be achieved with awk or sed:

1. Find an occurrence of a string between 'startpat' and 'endpat' (capture the lines between start and end).
2. Modify the strings in that block, for instance: update 'sss: ccc' to 'sss: ddd' and 'brr: mmm' to 'brr: rel/ccc'.
3. Create a new set of lines from 'startpat' to 'endpat' with the updated strings from step 2.
4. Insert the new lines at the start of the file, after '---'.
5. Delete the last set of lines between 'startpat' and 'endpat' if it contains the strings 'sss: aaa' and 'brr: rel/aaa'.

Note: most importantly, I would like to preserve the indentation, since I am working with JSON/YAML files.

Input file format (ignore the // comments when parsing the file):

---
 - startpat:        // Startpat - make this line inclusive
    ...
    sss: ccc        // pattern to be modified
    ppp: 'vvv'
    pname: 'vvv'
    brr: 'mmm'      // pattern to be modified
    jdk: jdk8
    jdks:
      - jdk8
      - jdk7
    file:
      - test:
          exec: 'input'
    ...

 - startpat:        // Endpat - make this line exclusive

Expected output after processing:

---
 - startpat:
    sss: ddd
    ppp: 'vvv'
    pname: 'vvv'
    brr: 'mmm'
    jdk: jdk8
    jdks:
      - jdk8
      - jdk7
    file:
      - test:
          exec: 'input'

 - startpat:        // Startpat
    ....
    sss: ccc
    ppp: 'vvv'
    pname: 'vvv'
    brr: 'mmm'
    jdk: jdk8
    jdks:
      - jdk8
      - jdk7
    file:
      - test:
          exec: 'input'
    ...

 - startpat:        // Endpat

slitvinov · Accepted Answer

I think the simplest way is to save every line in an array. To get you started:

$ cat f.awk
BEGIN {
    # build regular expressions to match "start pattern" and
    # "end pattern" (in the question they are the same)

    ws = "[\t ]*"            # optional whitespace
    sp = "^" ws "- startpat:" # [s]tart [p]attern
    ep = sp                   # [e]nd   [p]attern

    # a regular expression to match "---"
    # possibly surrounded by whitespace
    op = "^" ws "---" ws "$" # where to start appending
}

{ f[NR] = $0 } # save every line to an array

END {
    n = NR # number of lines in the file

    find_blocks() # set `nb` (number of blocks), `ss` `ee`

    for (ib = 1; ib <= nb; ib++)
        process_block(ss[ib], ee[ib]) # pass start and end of each block
                                      # set `nex` (number of extra lines) and `eex`
    write()
}

function find_blocks(   i, l, is, ie) {
    for (i = 1; i <= n; i++) {
        l = f[i]
        if (is > ie && l ~ ep) ee[++ie] = i # end
        if (           l ~ sp) ss[++is] = i # start
    }
    nb = ie
}

function process_block(is, ie,   i, l) {
    for (i = is + 1; i <= ie - 1; i++) {
        l = f[i]
        # modify a line (an example)
        if (l ~ /brr:/) sub(/'mmm'/, "'rel/cc'", l)

        eex[++nex] = l # push the line to another array
    }
}

function write(   i, j, l) {
    i = 1
    while (i <= n) { # print everything before "---"
        print l = f[i++]
        if (l ~ op) break
    }

    for (j = 1; j <= nex; j++) # add an extra part
        print eex[j]

    while (i <= n)            # print the part after "---"
        print f[i++]
}
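
Note that process_block above skips the startpat line itself and only edits brr. A possible variant that is closer to the question is sketched below: it keeps the start line in each copied block and applies both substitutions from step 2. This is only a sketch under a few assumptions: the file names f2.awk and sample.yml are made up, the brr value is assumed to be quoted as in the sample input, and every complete block is copied (as in f.awk), not just one.

```shell
# Hypothetical file names; adjust to taste.
cat > f2.awk <<'EOF'
BEGIN {
    ws = "[\t ]*"
    sp = "^" ws "- startpat:"   # start pattern; the next match ends the block
    op = "^" ws "---" ws "$"    # marker after which the copies are inserted
}

{ f[NR] = $0 }                  # keep the whole file in memory

END {
    # collect edited copies of every complete block, start line included
    for (i = 1; i <= NR; i++) {
        if (f[i] ~ sp) {
            # a new start line closes the pending block and opens the next
            for (j = 1; j <= np; j++) ex[++nex] = pend[j]
            np = 0
            inblk = 1
        }
        if (!inblk) continue
        l = f[i]
        # sub() replaces only the matched text, so indentation is untouched
        sub(/sss: ccc/, "sss: ddd", l)
        sub(/brr: 'mmm'/, "brr: 'rel/ccc'", l)   # assumes a quoted value
        pend[++np] = l
    }
    # lines still pending belong to an unterminated block and are dropped

    for (i = 1; i <= NR; i++) {
        print f[i]
        if (!done && f[i] ~ op) {   # insert the copies right after "---"
            for (j = 1; j <= nex; j++) print ex[j]
            done = 1
        }
    }
}
EOF

cat > sample.yml <<'EOF'
---
 - startpat:
    sss: ccc
    brr: 'mmm'
 - startpat:
EOF

awk -f f2.awk sample.yml
```

With this sample, the edited copy of the first block (including its " - startpat:" line) is printed right after "---", followed by the original lines unchanged.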

Input file

$ cat input
---
 - startpat:
    XXXXX
    brr: 'mmm'
 - startpat:
    YYYYY        
    brr: 'mmm'
 - startpat:

Usage:

awk -f f.awk input

Output:

---
    XXXXX
    brr: 'rel/cc'
    YYYYY        
    brr: 'rel/cc'
 - startpat:
    XXXXX
    brr: 'mmm'
 - startpat:
    YYYYY        
    brr: 'mmm'
 - startpat:
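
The question also asks about updating multiple files. Since f.awk buffers its whole input and prints everything in END, it should be run once per file. A minimal sketch of a wrapper, assuming a POSIX shell with mktemp available (the function name update_in_place is made up for illustration):

```shell
# update_in_place SCRIPT FILE...   (hypothetical helper, not part of f.awk)
# Runs the awk script once per file and overwrites the file with the result.
update_in_place() {
    script=$1
    shift
    for file in "$@"; do
        tmp=$(mktemp) || return 1
        if awk -f "$script" "$file" > "$tmp"; then
            cat "$tmp" > "$file"    # rewrite contents, keep the file's permissions
        else
            echo "skipped $file: awk failed" >&2
        fi
        rm -f "$tmp"
    done
}
```

It could then be called as `update_in_place f.awk *.yml`. Writing to a temporary file first means a file is only overwritten when awk exits successfully.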
