user10086251
user10086251

Reputation: 41

simple jolt transform to insert values object

I am trying to understand simple jolt transformations , I have a simple 1 level json , I would like to change the sequence of columns and convert into Rows array with Values object repeating with comma separated values, adding current date at the end of Values object. And also add default tags

Sample JSON

{
  "name": "john",
  "Address2": "123 Street 1",
  "Address1": "Apt 101",
  "custId": "ZY-1234",
  "zip": "67834"
}

Expected JSON

{
  "Rows": [{
        "values": [
            "ZY-1234",
            "john",
            "123 Street 1",
            "Apt 101",
            "67834",
            "2022-09-01 10:10:09"
        ]
    }],
  "operationType": "Insert",
  "Source": "Dev"
}

Jolt Spec I have

[
  {
    "operation": "default",
    "spec": {
      "ChangedBy": "NIFI",
      "ChangedOn": "${now()}"
    }
  },
  {
    "operation": "shift",
    "spec": {
      "*": {
        "@": "Rows[#1].Values"
      }
    }
  },
  {
    "operation": "default",
    "spec": {
      "*": {
        "*": "&"
      },
      "operationType": "Insert",
      "Source": "Dev"
    }
  }
]

The Result JSON

{
  "Rows" : [ {
    "Values" : [ 
        "john", 
        "123 Street 1", 
        "Apt 101", 
        "ZY-1234", 
        "67834", 
        "NIFI", 
        "${now()}" ]
  } ],
  "operationType" : "Insert",
  "Source" : "Dev"
}

Issues :

  1. How to change the column sequence in Values array
  2. How to get current datetime

Upvotes: 1

Views: 527

Answers (1)

mattyb
mattyb

Reputation: 12083

If you want to guarantee the order of columns, you'll need to specify where in the target array (i.e. indices) to put them. If your input is a single JSON object as you have above, you can use 0 for the Rows index and pick each of the fields individually to put them in their place in the values array. Try the following:

[
  {
    "operation": "shift",
    "spec": {
      "custId": "Rows[0].Values[0]",
      "name": "Rows[0].Values[1]",
      "Address2": "Rows[0].Values[2]",
      "Address1": "Rows[0].Values[3]",
      "zip": "Rows[0].Values[4]",
      "#\\${now()}": "Rows[0].Values[5]"
    }
  },
  {
    "operation": "default",
    "spec": {
      "operationType": "Insert",
      "Source": "Dev",
      "ChangedBy": "NIFI",
      "ChangedOn": "${now()}"
    }
  }
]

My output from JoltTransformJSON is as follows:

{
  "Rows": [
    {
      "Values": [
        "ZY-1234",
        "john",
        "123 Street 1",
        "Apt 101",
        "67834",
        "Tue Sep 06 18:51:35 EDT 2022"
      ]
    }
  ],
  "Source": "Dev",
  "ChangedOn": "Tue Sep 06 18:51:35 EDT 2022",
  "ChangedBy": "NIFI",
  "operationType": "Insert"
}

You can use additional NiFi Expression Language functions such as toDate and format or toInstant and formatInstant to get the ${now()} expression to give you the timestamp string you want.

Upvotes: 1

Related Questions