Roman

Reputation: 161

NiFi: change text in FlowFile (Python or ...)

I'm very new to NiFi. I get data (a FlowFile?) from my "ConsumeKafka" processor, and it looks like this:

data sample

So, I have to delete any text before '!'. I know a little Python, so with "ExecuteScript" I want to do something like this:

my_string=session.get()
my_string.split('!')[1]
# it returns "ZPLR_CHDN_UPN_ECN....."

But how do I do it right? P.S. Or maybe I could use "substringAfterLast", but how? Thanks.

Update:

I have to remove the text between '"Tagname":' and '!'. How can I do it without regex?

Upvotes: 0

Views: 640

Answers (2)

Sdairs

Reputation: 2032

If you simply want to split on a bang (!) and only keep the text after it, then you could achieve this with a SplitContent configured as:

Byte Sequence Format: Text

Byte Sequence: !

Keep Byte Sequence: false

Follow this with a RouteOnAttribute configured as:

Routing Strategy: Route to Property name

Add a new dynamic property called "substring_after" with a value: ${fragment.index:equals(2)}

For your input, this will produce 2 FlowFiles - one with the substring before ! and one with the substring after !. The first FlowFile (substring before) will route out of the RouteOnAttribute to the unmatched relationship, while the second FlowFile (substring after) will route to a substring_after relationship. You can auto-terminate the unmatched relationship to drop the text you don't want.
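To make the routing concrete, here is a minimal Python sketch that models what this pair of processors does. It is not NiFi code: the sample payload and the function names are made up for illustration, and it assumes fragment.index is numbered from 1, which is what the equals(2) check above relies on.

```python
def split_content(text, byte_sequence="!"):
    """Model SplitContent with Keep Byte Sequence = false.

    Returns (fragment_index, content) pairs, numbering fragments from 1
    (an assumption that matches the equals(2) check).
    """
    parts = text.split(byte_sequence)
    return [(index, part) for index, part in enumerate(parts, start=1)]


def route_on_attribute(fragments):
    """Model RouteOnAttribute with substring_after = ${fragment.index:equals(2)}."""
    routed = {"substring_after": [], "unmatched": []}
    for index, content in fragments:
        key = "substring_after" if index == 2 else "unmatched"
        routed[key].append(content)
    return routed


# Hypothetical payload; the real sample data is not reproduced here.
sample = '{"Tagname":"SOME_PREFIX!SOME_TAG"}'
routed = route_on_attribute(split_content(sample))
print(routed["substring_after"])  # the fragment you keep
print(routed["unmatched"])        # the fragment you auto-terminate
```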

There are downsides to this approach though.

  1. Are you guaranteed that there is only ever a single ! in the content? How would you handle multiple?
  2. You are doing a substring on some JSON as raw text. Splitting on ! will result in a "} left at the end of the string.
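Both downsides are easy to see in a few lines of plain Python. This is only an illustration on hypothetical payloads, not something that runs inside NiFi.

```python
# Hypothetical payloads for illustration only.
single = '{"Tagname":"SOME_PREFIX!SOME_TAG"}'
multiple = '{"Tagname":"A!B!C"}'

# Downside 1: with more than one '!', the second fragment is no longer
# "everything after the bang".
fragments = multiple.split("!")
second_fragment = fragments[1]
after_first = multiple.partition("!")[2]
after_last = multiple.rpartition("!")[2]

# Downside 2: a raw-text split keeps the closing quote and brace of the JSON.
raw_tail = single.partition("!")[2]

print(second_fragment, after_first, after_last, raw_tail)
```

Whether you want the text after the first or the last bang is a decision the split-based flow cannot make for you, which is one more reason to treat the data as records.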

These look like log entries, so you may want to consider looking into ConsumeKafkaRecord and utilising NiFi's Record capabilities to interpret and manipulate the data more intelligently.

On scripting, there are some great cookbooks for learning to script in NiFi. Start here: https://community.cloudera.com/t5/Community-Articles/ExecuteScript-Cookbook-part-1/ta-p/248922
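If you do want to go the ExecuteScript route, note that session.get() returns a FlowFile object, not its text, so the content has to be read and rewritten through a stream callback. The following is a rough sketch in the style of those cookbooks, assuming the Jython ("python") script engine is available in your NiFi version. The names keep_after and KeepAfterBang are made up for illustration, and the try/except only exists so the helper can also be tried outside NiFi.

```python
def keep_after(text, separator=u"!"):
    """Return the text after the first separator, or the text unchanged."""
    before, found, after = text.partition(separator)
    return after if found else text


try:
    # These imports resolve only inside ExecuteScript's Jython engine.
    from org.apache.commons.io import IOUtils
    from java.nio.charset import StandardCharsets
    from org.apache.nifi.processor.io import StreamCallback
    IN_NIFI = True
except ImportError:
    IN_NIFI = False

if IN_NIFI:
    class KeepAfterBang(StreamCallback):
        """Reads the FlowFile content, trims it, and writes it back."""

        def process(self, inputStream, outputStream):
            text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
            outputStream.write(bytearray(keep_after(text).encode("utf-8")))

    # session and REL_SUCCESS are variables ExecuteScript binds for the script.
    flowFile = session.get()
    if flowFile is not None:
        flowFile = session.write(flowFile, KeepAfterBang())
        session.transfer(flowFile, REL_SUCCESS)
```

Like the SplitContent approach, this works on raw text, so the trailing "} stays in the output.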

Edit:

Given your update, I would use UpdateRecord with a JSON Reader and Writer, and Replacement Value Strategy set to Record Path Value.

This uses the RecordPath syntax to perform transformations on data within Records. Your JSON Object is a Record. This would allow you to have multiple Records within the same FlowFile (rather than 1 line per FlowFile).

Then, add a dynamic property to the UpdateRecord with:

Name: /Tagname

Value: substringAfter(/Tagname, '!' )

What is this doing?

The Name of the property (/Tagname) is a RecordPath to the Tagname key in your JSON. This tells UpdateRecord where to put the result. In your case, we're replacing the value of an existing key (but it could also be a new key if you wanted to add one).

The Value of the property is the expression to evaluate to build the value you want to insert. We are using the substringAfter function, which takes 2 parameters. The first parameter is the RecordPath to the Key in the Record that contains the input String, which is also /Tagname (we're replacing the value of Tagname, with a substring of the original Tagname value). The second parameter is the String to split on, which is !.
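Since you know some Python, here is a small sketch of what that property does to each record. It only models the behaviour outside NiFi: the payload and function names are hypothetical, and it assumes substringAfter leaves a value unchanged when the search string is not found.

```python
import json


def substring_after(value, search):
    """Model RecordPath's substringAfter: text after the first match.

    Assumption: the value is returned unchanged when search is not found.
    """
    before, found, after = value.partition(search)
    return after if found else value


def update_records(records, field="Tagname", search="!"):
    """Model UpdateRecord with Name=/Tagname, Value=substringAfter(/Tagname, '!')."""
    return [dict(record, **{field: substring_after(record[field], search)})
            for record in records]


# Hypothetical FlowFile content holding two records.
flowfile_content = '[{"Tagname":"SOME_PREFIX!SOME_TAG"},{"Tagname":"NO_BANG"}]'
updated = update_records(json.loads(flowfile_content))
print(json.dumps(updated))
```

Because the value is changed inside the parsed record, the surrounding JSON stays valid and no stray "} is left behind.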

Upvotes: 1

yaprak

Reputation: 547

If your goal is to get the string between ! and "}, use ReplaceText with the regular expression (.*)!(.*)"}, capture the second group, and replace the entire content with it (for example, a Replacement Value of $2).

Please note that this regular expression may not be the best one for your case, but I believe you can solve your problem with a regular expression.
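You can try the pattern outside NiFi first. Here is a quick check with Python's re module on a hypothetical payload; the closing brace is escaped for clarity, and \2 plays the role of the second capture group.

```python
import re

# The pattern from this answer, with the closing brace escaped for clarity.
PATTERN = re.compile(r'(.*)!(.*)"\}')

# Hypothetical payloads for illustration only.
sample = '{"Tagname":"SOME_PREFIX!SOME_TAG"}'
several_bangs = '{"Tagname":"A!B!C"}'

# Replace the whole match with the second capture group.
result = PATTERN.sub(r"\2", sample)
result_several = PATTERN.sub(r"\2", several_bangs)
print(result, result_several)
```

Because the first group is greedy, everything up to the last ! is dropped, and the output is the bare value rather than JSON.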


Upvotes: 0
