Bruno Brs

Reputation: 693

How to transform XML in Apache NiFi

In NiFi, I have an ExecuteSQL processor that returns the following:

person_id | name | address
01        | John | Street 01
01        | John | Street 02
02        | Deby | Street 01

Note that Address is a separate table from Person, so the LEFT JOIN repeats the person columns once per address.

In NiFi, I converted the Avro to JSON and then to XML, and this is the result:

<person>
  <person_id>01</person_id>
  <name>John</name>
  <address>Street 01</address>
</person>
<person>
  <person_id>01</person_id>
  <name>John</name>
  <address>Street 02</address>
</person>
<person>
  <person_id>02</person_id>
  <name>Deby</name>
  <address>Street 01</address>
</person>

However, my desired result would be:

<person>
  <person_id>01</person_id>
  <name>John</name>
  <addresses>
    <address>Street 01</address>
    <address>Street 02</address>
  </addresses>
</person>
<person>
  <person_id>02</person_id>
  <name>Deby</name>
  <addresses>
    <address>Street 01</address>
  </addresses>
</person>

Is it possible to do this in NiFi? I can't seem to find a suitable processor. Should I use XSLT (with the TransformXML processor), or write my own processor? How would I go about it?

I'm new to NiFi and any help would be appreciated.

Upvotes: 1

Views: 4535

Answers (1)

Andy

Reputation: 14194

There are a few approaches you could take:

  1. Do the transformation in JSON before converting to XML -- the JoltTransformJSON processor handles complex transformations, and Jolt has more documentation around that process (as well as online sandboxes for testing).
  2. Do the transformation in XSLT -- if you're more comfortable with XSLT, you can do this with TransformXML. There are many Stack Overflow answers that will help you craft the XSLT.
  3. Write a Groovy script -- if the transformation logic is difficult to express in Jolt or XSLT, a Groovy script in ExecuteScript will probably be the simplest solution. Groovy's XML handling is very terse and allows for powerful manipulation with map/object duck-typing. This would be my recommendation if the Jolt or XSLT specs are non-trivial.
  4. Write a custom processor -- if you build a script that works well, you can migrate that code to a custom processor for long-term benefits (performance improvements, deployability, version control, configurability, etc.). I have recent slides on the process around developing a custom processor.
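
Whichever approach you pick, the core of the transformation is the same: group the rows by person_id and collect each group's addresses. The following is only a language-neutral sketch of that grouping logic in plain Python, not a NiFi API and not a Jolt or XSLT spec. It assumes the repeated <person> elements are wrapped in a single root element (called <persons> here, a hypothetical name, since the XML in the question has no root), and the function name group_addresses is likewise made up for illustration.

```python
import xml.etree.ElementTree as ET

# Sample input mirroring the question's data; the <persons> wrapper is an assumption.
SAMPLE = """
<persons>
  <person><person_id>01</person_id><name>John</name><address>Street 01</address></person>
  <person><person_id>01</person_id><name>John</name><address>Street 02</address></person>
  <person><person_id>02</person_id><name>Deby</name><address>Street 01</address></person>
</persons>
"""


def group_addresses(xml_text):
    """Merge <person> rows sharing a person_id into one <person> with an <addresses> list."""
    root = ET.fromstring(xml_text)

    # person_id -> {"name": ..., "addresses": [...]}, kept in first-seen order
    grouped = {}
    for person in root.findall("person"):
        pid = person.findtext("person_id")
        entry = grouped.setdefault(
            pid, {"name": person.findtext("name"), "addresses": []}
        )
        address = person.findtext("address")
        if address:  # a LEFT JOIN can produce a person row with no address
            entry["addresses"].append(address)

    # Rebuild the document with one <person> per distinct person_id
    out = ET.Element(root.tag)
    for pid, entry in grouped.items():
        p = ET.SubElement(out, "person")
        ET.SubElement(p, "person_id").text = pid
        ET.SubElement(p, "name").text = entry["name"]
        addresses = ET.SubElement(p, "addresses")
        for address in entry["addresses"]:
            ET.SubElement(addresses, "address").text = address
    return ET.tostring(out, encoding="unicode")


if __name__ == "__main__":
    print(group_addresses(SAMPLE))
```

The same shape carries over to the options above: in XSLT it is a grouping on person_id, in Groovy a groupBy over the parsed nodes, and in Jolt a shift that keys records by person_id before the JSON is converted to XML.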

Upvotes: 3
