Jeff

Reputation: 1889

Bash: How to add an array value to JSON?

I am putting together a bash script that parses a URL into its components. I am stuck trying to figure out how to add an array value to a key within a JSON body.

Attempted Approach:

I have parsed the following URL: https://bar.foo.com/v2020/folders/8d55e749-bbd7-e811-9c19-3ca82a1e3f41/folders

This URL's path is:

URL_PATH: v2020/folders/8d55e749-bbd7-e811-9c19-3ca82a1e3f41/folders

The path is split into a parts array using:

IFS='/' read -ra URL_PATH_PARTS <<< "$URL_PATH"

URL_PATH_PARTS [4]: v2020 folders 8d55e749-bbd7-e811-9c19-3ca82a1e3f41 folders

I want to add an array value to JSON that is formatted as follows:

{
  ...
  "parts": ["v2020", "folders", "8d55e749-bbd7-e811-9c19-3ca82a1e3f41", "folders"]
}

However, it currently looks like this, and I am not sure how best to take the next step:

{
  ...
  "parts": "[v2020 folders 8d55e749-bbd7-e811-9c19-3ca82a1e3f41 folders]"
}

Bash code parsing URL into its components:

#!/usr/bin/env bash

HREF='https://bar.foo.com/v2020/folders/8d55e749-bbd7-e811-9c19-3ca82a1e3f41/folders'
# remove quotes
HREF=$(echo $HREF | tr -d '"')
echo "  HREF: $HREF"

# extract the PROTOCOL
URL_PROTOCOL=$(echo $HREF | grep :// | sed -e's,^\(.*://\).*,\1,g')
echo "  URL_PROTOCOL: $URL_PROTOCOL"

# extract the PROTOCOL SCHEME
URL_SCHEME=`echo ${URL_PROTOCOL::-3}`
echo "  URL_SCHEME: $URL_SCHEME"

# remove the PROTOCOL -- updated
URL=$(echo $HREF | sed -e s,$URL_PROTOCOL,,g)
echo "  URL: $URL"

# extract the host and port -- updated
URL_HOSTPORT=$(echo $URL | sed -e s,$user@,,g | cut -d/ -f1)
echo "  URL_HOSTPORT: $URL_HOSTPORT"

# by request host without port
URL_HOST="$(echo $URL_HOSTPORT | sed -e 's,:.*,,g')"
echo "  URL_HOST: $URL_HOST"

# by request - try to extract the port
URL_PORT="$(echo $URL_HOSTPORT | sed -e 's,^.*:,:,g' -e 's,.*:\([0-9]*\).*,\1,g' -e 's,[^0-9],,g')"
echo "  URL_PORT: $URL_PORT"

# Extract the path
URL_PATH="$(echo $URL | grep / | cut -d/ -f2-)"
echo "  URL_PATH: $URL_PATH"

IFS='/' read -ra URL_PATH_PARTS <<< "$URL_PATH"
echo "  URL_PATH_PARTS [${#URL_PATH_PARTS[@]}]: ${URL_PATH_PARTS[@]}"

URL_COMPONENTS="{ \
    \"protocol\": \"$URL_PROTOCOL\", \
    \"scheme\": \"$URL_SCHEME\", \
    \"url\": \"$URL\", \
    \"host\": \"$URL_HOST\", \
    \"path\": \"$URL_PATH\", \
    \"parts\": \"[${URL_PATH_PARTS[@]}]\" \
}"

echo -e "\n  URL_COMPONENTS:"
echo $URL_COMPONENTS |
    jq '.'

Console Response

  HREF: https://bar.foo.com/v2020/folders/8d55e749-bbd7-e811-9c19-3ca82a1e3f41/folders
  URL_PROTOCOL: https://
  URL_SCHEME: https
  URL: bar.foo.com/v2020/folders/8d55e749-bbd7-e811-9c19-3ca82a1e3f41/folders
  URL_HOST: bar.foo.com
  URL_PATH: v2020/folders/8d55e749-bbd7-e811-9c19-3ca82a1e3f41/folders
  URL_PATH_PARTS [4]: v2020 folders 8d55e749-bbd7-e811-9c19-3ca82a1e3f41 folders

  URL_COMPONENTS:
{
  "protocol": "https://",
  "scheme": "https",
  "url": "bar.foo.com/v2020/folders/8d55e749-bbd7-e811-9c19-3ca82a1e3f41/folders",
  "host": "bar.foo.com",
  "path": "v2020/folders/8d55e749-bbd7-e811-9c19-3ca82a1e3f41/folders",
  "parts": "[v2020 folders 8d55e749-bbd7-e811-9c19-3ca82a1e3f41 folders]"
}

Thank you

Appreciative of all feedback and suggestions!

Upvotes: 1

Views: 102

Answers (3)

Andrew Vickers

Reputation: 2664

Don't bother with the array. Use variable substitution:

URL_PATH_PARTS=${URL_PATH//\// }        # Replace slashes with spaces
SPACES="${URL_PATH_PARTS//[^ ]} "       # Append space to avoid fence-post error.
echo "  URL_PATH_PARTS [${#SPACES}]: ${URL_PATH_PARTS}"

...

 \"parts\": [ \"${URL_PATH_PARTS// /\", \"}\" ] \  # Replace spaces with '", "'

You could also do away with the intermediate 'URL_PATH_PARTS' variable (and lose some readability):

SLASHES="${URL_PATH//[^\/]}/"       # Append slash to avoid fence-post error.
echo "  URL_PATH_PARTS [${#SLASHES}]: ${URL_PATH//\// }"

...

 \"parts\": [ \"${URL_PATH//\//\", \"}\" ] \  # Replace slashes with '", "'
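
A minimal, self-contained sketch of the substitution approach above, assuming the path segments contain nothing that needs JSON escaping (no double quotes, backslashes, or control characters). The names SEP and PARTS_JSON are illustrative and not part of the original script. Holding the separator in a variable keeps the escaped quotes out of the replacement string, which makes the expansion easier to read.

```shell
#!/usr/bin/env bash

URL_PATH='v2020/folders/8d55e749-bbd7-e811-9c19-3ca82a1e3f41/folders'

# Text placed between elements: closing quote, comma, space, opening quote.
# SEP is a hypothetical helper variable for this sketch.
SEP='", "'

# Replace every slash with the separator, then add the outer brackets and quotes.
PARTS_JSON="[\"${URL_PATH//\//$SEP}\"]"

echo "$PARTS_JSON"
# prints: ["v2020", "folders", "8d55e749-bbd7-e811-9c19-3ca82a1e3f41", "folders"]
```

Note that an empty URL_PATH would produce [""] rather than [], so a real script may want to special-case that.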

Upvotes: 2

Jeff

Reputation: 1889

Thanks @CharlesDuffy, @dash-o, @AndrewVickers

I tried out all your suggestions.

The approach I ended up using was joelpurra/jq-hopkok.

Bash Code

#!/usr/bin/env bash

URL='"https://apiuatna11.springcm.com/v201411/folders/8d55e749-bbd7-e811-9c19-3ca82a1e3f41/folders"'

# URL to components
echo $URL | ./jq-hopkok/src/url/to-components.sh

JSON response

{
  "value": "https://apiuatna11.springcm.com/v201411/folders/8d55e749-bbd7-e811-9c19-3ca82a1e3f41/folders",
  "valid": true,
  "scheme": {
    "value": "https",
    "valid": true
  },
  "domain": {
    "value": "apiuatna11.springcm.com",
    "components": [
      "apiuatna11.springcm.com",
      "springcm.com",
      "com"
    ],
    "tld": "com",
    "valid": true
  },
  "port": {
    "value": null,
    "separator": false,
    "valid": true
  },
  "path": {
    "value": "/v201411/folders/8d55e749-bbd7-e811-9c19-3ca82a1e3f41/folders",
    "components": [
      "v201411",
      "folders",
      "8d55e749-bbd7-e811-9c19-3ca82a1e3f41",
      "folders"
    ],
    "valid": true
  },
  "query": {
    "value": null,
    "separator": false,
    "components": [],
    "valid": true
  },
  "fragment": {
    "value": null,
    "separator": false,
    "valid": true
  }
}

Upvotes: 1

dash-o

Reputation: 14491

The current code uses \"parts\": \"[${URL_PATH_PARTS[@]}]\" for the path parts, which expands the array into a single space-separated string. One possible solution is to iterate over the elements, building a combined string with each element quoted and ', ' as the separator:

PP=
for P1 in "${URL_PATH_PARTS[@]}" ; do
  # Add ',' unless this is first item
  [ "$PP" ] && PP="$PP, "
  PP=$PP\"$P1\"
done

Then, in URL_COMPONENTS, replace

\"parts\": \"[${URL_PATH_PARTS[@]}]\"

with

\"parts\": [ $PP ]
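
The loop above can also be wrapped in a small function, sketched here under a few assumptions. The json_array helper and the PARTS_JSON variable are hypothetical names, not part of the original script, and the escaping covers only backslashes and double quotes (not control characters), which should be enough for typical URL path segments.

```shell
#!/usr/bin/env bash

# json_array: print its arguments as a JSON array of strings.
# Hypothetical helper; escapes only backslashes and double quotes.
json_array() {
  local out='' item
  for item in "$@"; do
    item=${item//\\/\\\\}        # backslash    -> two backslashes
    item=${item//\"/\\\"}        # double quote -> backslash + double quote
    out+="${out:+, }\"$item\""   # prepend ', ' unless this is the first item
  done
  printf '[%s]\n' "$out"
}

URL_PATH='v2020/folders/8d55e749-bbd7-e811-9c19-3ca82a1e3f41/folders'
IFS='/' read -ra URL_PATH_PARTS <<< "$URL_PATH"

PARTS_JSON=$(json_array "${URL_PATH_PARTS[@]}")

# No quotes around $PARTS_JSON here: it is already a JSON array, not a string.
URL_COMPONENTS="{ \"path\": \"$URL_PATH\", \"parts\": $PARTS_JSON }"
echo "$URL_COMPONENTS"
```

The key difference from the original script is that the array value is interpolated without surrounding \" characters, so jq sees a JSON array instead of a string that merely looks like one.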

Upvotes: 1
