Shrads

Reputation: 883

Remove blank rows from a CSV file using NiFi

I have a simple use case: remove any blank row found in a CSV file. How can I achieve this using NiFi?

My CSV file looks like the one in the attached screenshot, which shows the row that needs to be removed.

I want to remove the first blank row in the CSV, just above the headers, using NiFi. Any suggestions are much appreciated. Thank you!

Upvotes: 0

Views: 2719

Answers (1)

Andy

Reputation: 14194

You can use a ReplaceText processor that replaces the regex \A\n|\n*\s*(?=\n) with an empty string (an empty replacement value). The search regex matches either of the following:

  • \A\n - the beginning of the content immediately followed by a newline, OR
  • \n*\s*(?=\n) - zero or more newlines, followed by zero or more whitespace characters, followed by a newline that is only checked with a lookahead, so it is not consumed and stays in the output
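
If you want to check the pattern outside NiFi first, here is a minimal sketch in plain Java, since NiFi evaluates the search value with Java's regex engine. It assumes the whole file content is handed to the regex as one string with \n line endings. The class and method names (BlankRowRemover, removeBlankRows) are hypothetical helpers for illustration and are not part of NiFi.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not part of NiFi) for trying the pattern outside a flow.
final class BlankRowRemover {

    // The search regex from the answer, escaped as a Java string literal.
    private static final Pattern BLANK_ROWS =
            Pattern.compile("\\A\\n|\\n*\\s*(?=\\n)");

    // Replaces every match with the empty string, mirroring an empty replacement value.
    static String removeBlankRows(String content) {
        return BLANK_ROWS.matcher(content).replaceAll("");
    }

    public static void main(String[] args) {
        // Assumed sample: a blank first row, an empty row, and a whitespace-only row.
        String csv = "\nheader1,header2,header3\nA1,A2,A3\n\nB1,B2,B3\n   \nC1,C2,C3\n";
        System.out.print(removeBlankRows(csv));
    }
}
```

Running main on this sample prints the header row followed by the three data rows, with no blank rows in between. The first alternative is what removes the leading blank row: at the very start of the content there is no earlier newline for the second alternative to consume.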

Update

I'm not sure why this was downvoted or why it did not work for some users; I just created a template and it worked exactly as described.

[Screenshot: overview of the NiFi flow]

[Screenshot: configuration of the GenerateFlowFile processor]

[Screenshot: configuration of the ReplaceText processor]

2019-01-08 12:25:27,642 INFO [Timer-Driven Process Thread-2] o.a.n.processors.standard.LogAttribute LogAttribute[id=2f22d047-0168-1000-47b0-9ec963e65367] logging for flow file StandardFlowFileRecord[uuid=6c9cc388-19c8-4b98-9970-6a6e3979e4ee,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1546979126561-1, container=default, section=1], offset=152, length=50],offset=0,name=6c9cc388-19c8-4b98-9970-6a6e3979e4ee,size=50]
--------------------------------------------------
Standard FlowFile Attributes
Key: 'entryDate'
    Value: 'Tue Jan 08 12:25:27 PST 2019'
Key: 'lineageStartDate'
    Value: 'Tue Jan 08 12:25:27 PST 2019'
Key: 'fileSize'
    Value: '50'
FlowFile Attribute Map Content
Key: 'filename'
    Value: '6c9cc388-19c8-4b98-9970-6a6e3979e4ee'
Key: 'path'
    Value: './'
Key: 'uuid'
    Value: '6c9cc388-19c8-4b98-9970-6a6e3979e4ee'
--------------------------------------------------
header1,header2,header3
A1,A2,A3
B1,B2,B3
C1,C2,C3

Upvotes: 4
