Joël Dinel

Reputation: 1

jq filter to ignore values in select statement based on array values

Given the following JSON input :

{
  "hostname": "server1.domain.name\nserver2.domain.name\n*.gtld.net",
  "protocol": "TCP",
  "port": "8080\n8443\n9500-9510",
  "component": "Component1",
  "hostingLocation": "DC1"
}

I would like to obtain the following JSON output :

{
  "hostname": [
    "server1.domain.name",
    "server2.domain.name",
    "*.gtld.net"
  ],
  "protocol": "TCP",
  "port": [
    "8080-8080",
    "8443-8443",
    "9500-9510"
  ],
  "component": "Component1",
  "hostingLocation": "DC1"
}

Considering :

  1. That the individual values in the port array may, or may not, be separated by a - character (I have no control over this).
  2. That if an individual value in the port array does not contain the - separator, I then need to add it and then repeat the array value after the - separator. For example, 8080 becomes 8080-8080, 8443 becomes 8443-8443 and so forth.
  3. And finally, that if a value in the port array is already of the format value-value, I should simply leave it unmodified.

I've been banging my head against this filter all afternoon, after reading many examples both here and in the official jq online documentation. I simply can't figure out how to accommodate consideration #3 above.

The filter I have now :

{hostname: .hostname | split("\n"), protocol: .protocol, port: .port | split("\n") | map(select(. | contains("-") | not)+"-"+.), component: .component, hostingLocation: .hostingLocation}

Yields the following output JSON :

{
  "hostname": [
    "server1.domain.name",
    "server2.domain.name",
    "*.gtld.net"
  ],
  "protocol": "TCP",
  "port": [
    "8080-8080",
    "8443-8443"
  ],
  "component": "Component1",
  "hostingLocation": "DC1"
}

As you can see above, I subsequently lose the 9500-9510 value as it already contains the - string which my filter weeds out.

If my logic does not fail me, I would need an if statement within my select statement, so that only array values that do not contain the - string are modified while values that already contain the separator are left untouched. However, I cannot seem to figure this last piece out.

I will happily accept any alternative filter that yields the desired output; however, I am also really keen on understanding where my logic fails in the above filter.

Thanks in advance to anyone spending their valuable time helping me out!

/Joel

Upvotes: 0

Views: 655

Answers (1)

pmf

Reputation: 36048

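First, we split the hostname string at each newline character (.hostname /= "\n") and do the same with the port string (.port /= "\n"). This works because dividing one string by another in jq splits the first at every occurrence of the second, and /= is the update form of that division. As the two operations are identical, we can combine them into one: (.hostname, .port) /= "\n"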

Next, for every element of the port array (.port[]) we split at any non-digit character (split("[^\\d]";"g")), resulting in an array of digit-only strings. From that array we take the first element (.[0]), then a dash sign, and finally the second element if present, otherwise the first one again (.[1]//.[0]).

With your input in a file called input.json, the following should convert it into the desired format:

jq '

  (.hostname, .port) /= "\n" |
  .port[] |= (split("[^\\d]";"g") | "\(.[0])-\(.[1]//.[0])")

' input.json

Regarding your considerations:

  1. As we split at any non-digit character, it makes no difference what other character separates the values of a port range. If more than one character could separate them (e.g. an arrow -> or with spaces before and after the dash sign -), simply replace the regex [^\\d] with [^\\d]+ for capturing more than one non-digit character.
  2. and 3. We always produce a range by including a dash sign and a second value, which depending on the presence of a second item may be either that or the first one again.
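As a rough illustration of the first point, the sketch below runs the [^\\d]+ variant against a few separators. The sample values (a spaced dash and an arrow) are made up for this example and do not come from the question's data.

```shell
# Hypothetical port values using different separators (not from the question).
ports='["8080","9500 - 9510","7000->7010"]'

# Split each value at one or more non-digit characters, then rebuild it as
# "first-second", falling back to the first value when there is no second.
result=$(printf '%s' "$ports" | jq -c '
  map(split("[^\\d]+";"g") | "\(.[0])-\(.[1]//.[0])")
')
printf '%s\n' "$result"
```

Each value should come out as a plain first-second range regardless of which separator it arrived with.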

Regarding your approach:

Inside map you used select, which evaluates to empty if the condition (contains("-") | not) is not met. As "9500-9510" does indeed contain a dash sign, it didn't survive. An if statement inside the select statement wouldn't help: even when select doesn't evaluate to empty, it doesn't modify anything and just reproduces its input unchanged. Therefore, if select lets both cases through (containing and not containing dash signs), it becomes useless. You could, however, work with an if statement outside of the select statement, or in place of it, so that each element is either extended or passed through unchanged. I considered the above solution the simpler approach.
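For comparison, here is a minimal sketch of what the if-based variant might look like, staying close to your original idea. It assumes the separator is always a plain dash, and it inlines the question's input as a shell variable (the name input is just for this example) so that the snippet is self-contained.

```shell
# Sample input copied from the question; inside single quotes the \n stays a
# JSON escape sequence, which jq decodes to a newline.
input='{"hostname":"server1.domain.name\nserver2.domain.name\n*.gtld.net","protocol":"TCP","port":"8080\n8443\n9500-9510","component":"Component1","hostingLocation":"DC1"}'

# Unlike select, if/then/else always produces a value: elements that already
# contain a dash pass through unchanged, all others are doubled into a range.
result=$(printf '%s' "$input" | jq -c '
  (.hostname, .port) /= "\n"
  | .port[] |= (if contains("-") then . else "\(.)-\(.)" end)
')
printf '%s\n' "$result"
```

With the input from the question, the port array becomes ["8080-8080","8443-8443","9500-9510"], so the 9500-9510 entry is no longer dropped.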

Upvotes: 0
