Philipp Kolb

Reputation: 54

How to split a string value in JSON and convert it to nested objects using jq?

I am trying to use jq to convert something like this:

[
  {
    "type": "Feature",
    "properties": {
      "osm_id": "172544",
      "highway": "crossing",
      "other_tags": "\"crossing\"=>\"uncontrolled\",\"tactile_paving\"=>\"yes\""
    },
    "geometry": {
      "type": "Point",
      "coordinates": [
        13.3432342,
        52.5666157
      ]
    }
  }
]

into this:

[
  {
    "type": "Feature",
    "properties": {
      "osm_id": "172544",
      "highway": "crossing",
      "other_tags": {
        "crossing": "uncontrolled",
        "tactile_paving": "yes"
      }
    },
    "geometry": {
      "type": "Point",
      "coordinates": [
        13.3432342,
        52.5666157
      ]
    }
  }
]

This is my progress so far:

jq 'map(try(.properties.other_tags |= split(",") // .)) | map(try(.properties.other_tags[] |= split("=>") // .)) | map(try(.properties.other_tags[] |= { (.[0]) : .[1] } // .))' example.json

But the "other_tags" output looks like this, with an array of single-key objects and the inner quotes still in place:

  "other_tags": [
    {
      "\"crossing\"": "\"uncontrolled\""
    },
    {
      "\"tactile_paving\"": "\"yes\""
    }
  ]

I am pretty sure this is not as performant as it could be.

It's used to transform OSM exports, which are fairly big.

Is there a more elegant or shorter jq filter I can use that also gives me the desired output shown above?

Upvotes: 2

Views: 1200

Answers (3)

Philipp Kolb

Reputation: 54

I found a satisfactory solution while experimenting on jqplay:

jq '.features
  | map(try(.properties.other_tags |=
            (split("\",\"")
             | join("\"##strsplit##\"")
             | split("##strsplit##")
             | .[] |= split("=>") 
             | .[] |= {(.[0][1:-1]): (.[1][1:-1])}
             | add)) // .)'

Edit: changed the array index, thanks to peak for the comment.

Edit 2: the filter now tolerates commas inside values and keeps nodes without 'other_tags'.
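To check both points from the second edit, here is a small sketch that runs the filter above on a made-up input. The `features` wrapper, the `osm_id` values and the `note` tag are assumptions for illustration only, not taken from a real export.

```shell
# Hypothetical input: one feature whose tag value contains a comma,
# and one feature without "other_tags" at all.
input='{"features":[{"properties":{"osm_id":"1","other_tags":"\"crossing\"=>\"uncontrolled\",\"note\"=>\"a,b\""}},{"properties":{"osm_id":"2"}}]}'

# Same filter as in the answer; -c prints compact output.
result=$(printf '%s\n' "$input" | jq -c '.features
  | map(try(.properties.other_tags |=
            (split("\",\"")
             | join("\"##strsplit##\"")
             | split("##strsplit##")
             | .[] |= split("=>")
             | .[] |= {(.[0][1:-1]): (.[1][1:-1])}
             | add)) // .)')
echo "$result"
```

The first feature should come back with `other_tags` as an object whose `note` value still contains the comma, and the second feature should pass through unchanged because the `try ... // .` falls back to the original element.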

Upvotes: 0

peak

Reputation: 116750

Here's a solution that assumes the input can be parsed into comma-separated segments matching the following regex (expressed as JSON):

"\"(?<key>[^\"]+)\"=>\"(?<value>[^\"]+)\""

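# Emit a stream of {"key": ..., "value": ...} objects, one per segment.
def unwrap:
  . as $s
  | if length == 0 then empty
    else match("\"(?<key>[^\"]+)\"=>\"(?<value>[^\"]+)\",?")
    # Build one object from the named captures, then recurse on the
    # remainder of the string (assumes the match starts at offset 0).
    | (.captures | map({(.name): .string}) | add),
      ($s[.length:] | unwrap)
    end
;

map(.properties.other_tags |= ([unwrap] | from_entries))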

This approach has the (potential) advantage of allowing commas and occurrences of "=>" within the keys and values. Of course the implementation can be made more robust (e.g. using try as you have done), but I've kept it simple so you can easily modify it to meet your more detailed requirements.
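As a quick check of the comma claim, the following sketch runs the definition on a made-up element. The `note` tag and its value `a,b` are assumptions chosen only to put a comma inside a value.

```shell
# Hypothetical input: the second value contains a comma.
input='[{"properties":{"other_tags":"\"crossing\"=>\"uncontrolled\",\"note\"=>\"a,b\""}}]'

result=$(printf '%s\n' "$input" | jq -c '
  def unwrap:
    . as $s
    | if length == 0 then empty
      else match("\"(?<key>[^\"]+)\"=>\"(?<value>[^\"]+)\",?")
      | (.captures | map({(.name): .string}) | add),
        ($s[.length:] | unwrap)
      end;
  map(.properties.other_tags |= ([unwrap] | from_entries))
')
echo "$result"
```

Because the regex consumes each quoted value as a whole, the comma inside `"a,b"` stays part of the value rather than starting a new segment.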

Upvotes: 0

oliv

Reputation: 13249

You could also use this:

<file jq '[.[] | try(.properties.other_tags |= ("{" + gsub("=>"; ":") + "}" | fromjson))//.]'

This wraps the string in curly braces { and } and replaces each => with :. The resulting string is then parsed into a JSON object with fromjson.

The command leaves an element unchanged if .properties.other_tags is missing.
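To make the string rewrite easier to follow, here is a minimal sketch that applies only the inner expression to the tag string from the question, read as raw text. The variable names are placeholders.

```shell
# The "other_tags" value from the question, as a plain shell string.
tags='"crossing"=>"uncontrolled","tactile_paving"=>"yes"'

# -R reads the line as a JSON string; -c prints compact output.
result=$(printf '%s\n' "$tags" | jq -Rc '"{" + gsub("=>"; ":") + "}" | fromjson')
echo "$result"
```

Note that gsub rewrites every occurrence of =>, so a => inside a key or value would also be turned into a colon.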

Upvotes: 2
