Reputation: 8856

Ignoring escaped delimiters (commas) with awk?

If I had a string with escaped commas like so:

a,b,{c\,d\,e},f,g

How might I use awk to parse that into the following items?

a
b
{c\,d\,e}
f
g

Upvotes: 0

Answers (3)

system PAUSE

Reputation: 38550

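{
   n = split($0, a, /,/)        # n = number of comma-separated pieces
   j = 1                        # b[1] is never filled; real items start at b[2]
   for (i = 1; i <= n; ++i) {
      if (match(b[j], /\\$/)) { # previous item ends in '\': the comma was escaped
         b[j] = b[j] "," a[i]
      } else {
         b[++j] = a[i]
      }
   }
   # Loop up to j rather than length(b): length() on an array is not
   # available in every awk, and b may hold items from earlier lines.
   for (k = 2; k <= j; ++k) {
      print b[k]
   }
}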

Split into array a, using ',' as delimiter
Build array b from a, merging each item that ends in '\' with the item after it
Print array b (Note: Starts at 2 since first item is blank)

This solution presumes (for now) that ',' is the only character that is ever escaped with '\'--that is, there is no need to handle any \\ in the input, nor weird combinations such as \\\,\\,\\\\,,\,.
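
If that presumption does not hold, one option is to stop splitting on ',' and walk the line one character at a time instead. The following is only a rough sketch, not part of the answer above: the shell wrapper and the name split_escaped are made up for illustration, and it assumes a backslash always escapes exactly the next character, whatever that character is.

```sh
# split_escaped: hypothetical helper; prints one field per line.
split_escaped() {
  printf '%s\n' "$1" | awk '
  {
    field = ""
    n = length($0)
    for (i = 1; i <= n; i++) {
      c = substr($0, i, 1)
      if (c == "\\" && i < n) {
        # A backslash escapes the next character: copy both unchanged.
        field = field c substr($0, i + 1, 1)
        i++
      } else if (c == ",") {
        # Unescaped comma: the current field is complete.
        print field
        field = ""
      } else {
        field = field c
      }
    }
    print field   # last field on the line (may be empty)
  }'
}

split_escaped 'a,b,{c\,d\,e},f,g'
```

With this approach an input such as a\\,b is treated as two fields, since the comma follows an escaped backslash rather than being escaped itself.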

Upvotes: 2

Cascabel

Reputation: 497302

I don't think awk has any built-in support for something like this. Here's a solution that's not nearly as short as DigitalRoss's, but it has no danger of ever accidentally hitting a made-up placeholder string (!Q!). Since it tests with an if, you could also extend it to check whether a field actually ends in \\, which should be an escaped backslash followed by a real comma, not an escaped comma. Note that, as written, it drops the backslash from each escaped comma in the output.

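BEGIN {
    FS = ","
}

{
    split("", fields)    # clear anything left over from the previous line
    curfield = 1
    for (i = 1; i <= NF; i++) {
        if (substr($i, length($i)) == "\\") {
            # Piece ends in a backslash: drop the backslash, keep the
            # comma, and keep building the same field.
            fields[curfield] = fields[curfield] substr($i, 1, length($i)-1) FS
        } else {
            fields[curfield] = fields[curfield] $i
            curfield++
        }
    }
    nf = curfield - 1
    for (i = 1; i <= nf; i++) {
        printf("%d: %s   ", i, fields[i])
    }
    printf("\n")
}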
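
The extension mentioned above (telling \\, apart from \,) could look roughly like the sketch below. It counts the backslashes at the end of each piece: an odd count means the following comma was escaped, an even count means it was a real delimiter. This is an assumption about how the escaping should work, not something the question specifies. The shell wrapper and the name merge_fields are made up for illustration, and unlike the code above this version leaves the backslashes in the output.

```sh
# merge_fields: hypothetical helper; prints one field per line.
merge_fields() {
  printf '%s\n' "$1" | awk -F, '
  {
    cur = ""
    for (i = 1; i <= NF; i++) {
      cur = cur $i
      # Count the backslashes at the end of this piece.
      t = $i
      bs = 0
      while (substr(t, length(t)) == "\\") {
        bs++
        t = substr(t, 1, length(t) - 1)
      }
      if (bs % 2 == 1 && i < NF) {
        cur = cur FS    # odd count: escaped comma, keep building the field
      } else {
        print cur       # even count (or last piece): the field is complete
        cur = ""
      }
    }
  }'
}

merge_fields 'a,b,{c\,d\,e},f,g'
```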

Upvotes: 1

DigitalRoss

Reputation: 146141

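{
  gsub("\\\\,", "!Q!")          # hide each escaped comma behind a placeholder
  n = split($0, a, ",")         # split on the remaining (real) commas
  for (i = 1; i <= n; ++i) {
    gsub("!Q!", "\\,", a[i])    # put the escaped commas back
    print a[i]
  }
}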
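
If the chance of !Q! showing up in real data is a worry, a possible variation is to use a control character as the placeholder. The sketch below uses awk's built-in SUBSEP variable (by default the single character "\034") and assumes that character never appears in the input. The shell wrapper and the name split_placeholder are made up for illustration. Like the code above, it does not handle an escaped backslash in front of a comma.

```sh
# split_placeholder: hypothetical helper; prints one field per line.
split_placeholder() {
  printf '%s\n' "$1" | awk '
  {
    line = $0
    gsub(/\\,/, SUBSEP, line)       # hide each escaped comma
    n = split(line, a, ",")         # split on the remaining (real) commas
    for (i = 1; i <= n; i++) {
      gsub(SUBSEP, "\\,", a[i])     # put the escaped commas back
      print a[i]
    }
  }'
}

split_placeholder 'a,b,{c\,d\,e},f,g'
```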

Upvotes: 2
