VNA

Reputation: 625

awk to generate consecutive sequence:

I would like to read the first field and generate a sequence based on the "&-" and "&&-" delimiters.

Ex: If the Digits field is 210&-3, I need to output 210 and 213 only.
    If the Digits field is 210&&-3, I need to output 210, 211, 212 and 213.

Input.txt

DIGITS                    

  20
  210&-2     
  2130&&-3&-6&&-8

Desired Output:

DIGITS
  20
  210
  212
  2130
  2131
  2132
  2133
  2136
  2137
  2138

I have tried some commands, but none of them produced the desired output. Any suggestions?

Upvotes: 1

Views: 351

Answers (3)

Ed Morton

Reputation: 203684

$ cat tst.awk
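BEGIN{ FS="&" }    # "&&-N" splits into an empty field followed by "-N"
{
    for (i=1;i<=NF;i++) {
        if ($i == "") {
            # empty field: the next field is the end of a range ("&&-N")
            i++
            $i = $1 - $i    # "-N" becomes base + N
            for (j=(prev+1);j<$i;j++) {
                print j     # values strictly between prev and the range end
            }
        }
        else if ($i < 0) {
            # single offset ("&-N"): base + N
            $i = $1 - $i
        }

        print $i
        prev = $i
    }
}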
$
$ awk -f tst.awk file
20
210
212
2130
2131
2132
2133
2136
2137
2138

Upvotes: 0

n0741337

Reputation: 2514

Here's an awk executable script version:

#!/usr/bin/awk -f

BEGIN {FS="[&]"}

{
    flen = length($1)
    ldigit = substr($1, flen)+0
    prefix = substr($1, 1, flen-1)+0

    if( ldigit !~ /[[:space:]]/ )
         print prefix ldigit

    doRange=0
    for(i=2;i<=NF;i++) {
        if( $i == "" ) { doRange=1; continue }
        if( !doRange ) { ldigit=-$i; print prefix ldigit }
        else {
            while( ldigit < -$i ) {
                ldigit++
                print prefix ldigit
            }
            doRange=0
        }
    }
}

Here's the breakdown:

  • Set the field separator to &
  • When there are commands to parse, find the prefix and the ldigit values
  • Print out the first value using print prefix ldigit. This will print the header too. The if( ldigit !~ /[[:space:]]/ ) discards the blank lines
  • When there's no range, set ldigit and then print prefix ldigit
  • When there is a range, increment ldigit and print prefix ldigit for as long as required.
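
To make the first and last steps easier to follow, here is a minimal sketch of how a record splits once the field separator is &. The helper name show_fields is hypothetical and only used for illustration; it is not part of the script above. The point it shows is that "&&-" leaves an empty field behind, which is what the $i == "" test uses to detect a range.

```shell
# show_fields is a hypothetical helper for illustration only.
# It prints every field awk sees when the separator is a single "&".
show_fields() {
    printf '%s\n' "$1" |
    awk 'BEGIN { FS = "&" }
    {
        for (i = 1; i <= NF; i++)
            printf "field %d = [%s]\n", i, $i
    }'
}

show_fields '2130&&-3&-6&&-8'
# field 1 = [2130]
# field 2 = []      <- empty field: the next value ends a range
# field 3 = [-3]
# field 4 = [-6]    <- no empty field before it: a single offset
# field 5 = []
# field 6 = [-8]
```

An empty field followed by -N therefore means "count up to base+N", while a bare -N means "emit base+N only".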

Using an older gawk version I get output like:

DIGITS
  20
  210
  212
  2130
  2131
  2132
  2133
  2136
  2137
  2138

Upvotes: 2

jaypal singh

Reputation: 77105

Using GNU awk for patsplit:

gawk '{
    n = patsplit($0,patt,/[&][&]-|[&]-/,number); 
    lastnum = number[0]
    print lastnum
    if(n > 0) {
        for (i=1; i<=n; i++) {
            if (patt[i] ~ /^[&]-$/) {
                print number[0] + number[i]
                lastnum = number[0] + number[i]
            }
            if (patt[i] ~ /^[&][&]-$/) {
                for (num = lastnum + 1; num <= number[0] + number[i]; num++) {
                    print num
                }
                lastnum = number[0] + number[i]
            }
        }
    }
}' file

Output

20
210
212
2130
2131
2132
2133
2136
2137
2138

Upvotes: 1
