Sammy
Sammy

Reputation: 21

Extract matching strings ids in JSON dataset from input file using Unix

I have JSON data from which I extract ids for the various customers using .jq. With the help of UNIX commands, I have fetched this JSON data through API calls. Now, I am trying to get all the IDs for that customer.

With the help of the below script if the customer has more than one id my data looks something like this

Script

appKey=12345678999999999988999
tokenInXml=$(curl -X POST \
      "APIURL/authentication" \
      -H 'Accept: application/xml' \
      -H 'Content-Type: application/xml' \
      -H "Finicity-App-Key: $appKey" \
      -H 'cache-control: no-cache' \
      -d "<credentials>
            <partnerId>11111111111111111</partnerId>
            <partnerSecret>asdfdsfggfgfgfdss</partnerSecret>
        </credentials>")

SUBSTRING=$(echo $tokenInXml| cut -d'>' -f 4)
token=$(echo $SUBSTRING| cut -d'<' -f 1)

COUNTER=0

while IFS= read -r line
do
    customerId=$(echo $line | awk '{print $1}')
        
    (( COUNTER++ ))

    response=$(curl --fail --silent \
     -H "App-Key:$appKey" -H "App-Token:$token" \
     -H "Content-Length:0" -H "Content-Type: application/xml" \
     -H "Accept:application/json" -H "Exclude-Metrics:true" \
     -X GET "APIURL/${customerId}/reports" -d '' )
    
    rep=$(echo "${response}" |
       jq -r '.reports[] | select(.id | contains("-")|not)| .id')
        
done < "inputfile.txt"

Example JSON file for customer id 123456788 will look like this

{ "data": [
  { "id": "dkfjgh48sdfg", "abc": "sdfdsf", "pqr": "sdkfj" },
  { "id": "sdfg742sdfg2", "abc": "aksjdfj", "pqr": "alkejf" },
  { "id": "sdfjk7df3dfg", "abc": "smdfb", "pqr": "sdjkfhakj" },
  { "id": "pltrsg2rt4tl", "abc": "artgrre", "pqr": "sfeewrwg" } ] }

Output:

123456789 hksdf21sdfi
aflire65sfdh
sdfghklj256v
sdkfjh2dr7gd

123456788 dkfjgh48sdfg
sdfg742sdfg2
sdfjk7df3dfg
pltrsg2rt4tl

Here the 123456789 123456788 are the customer ids and the alphanumeric string is id. I want to filter only those id's whose last four-digit matches with the input list which looks like this

Input List

123456789 256v
123456788 dfg2
123456788 t4tl

Expected output:

123456789 sdfghklj256v
123456788 sdfg742sdfg2
123456788 pltrsg2rt4tl

Upvotes: 1

Views: 227

Answers (1)

This is a two step solution.

  1. transform the input into an easy-to-use "dictionary"
  2. look up user ids in the new object using the last 4 chars of .id in .data

Transform the user list input.

$ jq -Rs -f users.jq < users.txt | tee  users.json 
{
  "eeee": "5",
  "sdfg": "44",
  "dfg2": "66",
  "3dfg": "12",
  "t4tl": "69",
  "aaaa": "1",
  "bbbb": "2",
  "cccc": "3",
  "dddd": "4"
}

database.json (input)

{ "data": [
        { "id": "dkfjgh48sdfg", "abc": "sdfdsf", "pqr": "sdkfj" },
        { "id": "sdfg742sdfg2", "abc": "aksjdfj", "pqr": "alkejf" },
        { "id": "sdfjk7df3dfg", "abc": "smdfb", "pqr": "sdjkfhakj" },
        { "id": "xxxxxxxxxxxx", "abc": "smdfb", "pqr": "sdjkfhakj" },
        { "id": "nope........", "abc": "smdfb", "pqr": "sdjkfhakj" },
        { "id": "____________", "abc": "smdfb", "pqr": "sdjkfhakj" },
        { "id": "pltrsg2rt4tl", "abc": "artgrre", "pqr": "sfeewrwg" } ] }

Select objects from data that match the input

$ jq -r --argjson users  "$(< users.json )" -f match.jq < database.json 
44 dkfjgh48sdfg
66 sdfg742sdfg2
12 sdfjk7df3dfg
69 pltrsg2rt4tl

Scripts

users.jq

  [
    split("\n")[]|
    split(" ")|
    select((.[]|length > 0))
  ]
  | map({  (.[1]|tostring)  :  (.[0]|tostring)  })
  | add

match.jq


.data[]
| (.id|split("")|reverse|.[0:4]|reverse|join("")) as $l4
| { orig: ., uid: $users[$l4] }
| select(.uid != null)
| "\( .uid ) \( .orig.id )"


Input

users.txt

  5 eeee
  44 sdfg
  66 dfg2
  12 3dfg
  69 t4tl
  1 aaaa
  2 bbbb
  3 cccc
  4 dddd

Upvotes: 0

Related Questions