Reputation: 299
I am reading from a database and making a csv out of it using QueryDatabaseRecord and ConvertRecord processors.
I want one of the columns in the csv to be extracted and used to name my CSV file that will be stored locally on my system via PutFile or in S3.
The CSV looks like this:
cola,colb,colc,date
A1,123,vin9,2020-02-04
A2,456,vin9,2020-02-04
A3,789,vin9,2020-02-04
I want to extract just the colc and date fields from the first row to produce a filename such as vin9-2020-02-04.csv for my output dump.
Which processor can help me achieve this? Thanks!
Upvotes: 1
Views: 1598
Reputation: 12083
Are you guaranteed from the query that the date column has the same value in every row? In either case, you can use PartitionRecord and partition on the date column; you will get flow file(s) out that each contain all the records for a unique date. If you add a user-defined property named basename with the value /date, each flow file will have the basename attribute set to the value of the date column. Then you can use UpdateAttribute to set filename equal to ${basename}.csv.
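To make the grouping concrete, here is a minimal Python sketch of what partitioning does conceptually. It is not NiFi code, and the helper name partition_filenames is hypothetical. It groups the sample rows by the chosen columns and derives one filename per group. Grouping on both colc and date is an assumption that goes slightly beyond the answer above (which partitions on date only), but it produces the vin9-2020-02-04.csv name asked for in the question.

```python
import csv
import io
from collections import defaultdict

# Sample data copied from the question.
SAMPLE_CSV = """cola,colb,colc,date
A1,123,vin9,2020-02-04
A2,456,vin9,2020-02-04
A3,789,vin9,2020-02-04
"""

def partition_filenames(csv_text, key_columns):
    """Group rows by the values of key_columns, roughly the way
    PartitionRecord groups records, and return {filename: [rows]}.
    Hypothetical helper for illustration only."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = "-".join(row[col] for col in key_columns)
        groups[key + ".csv"].append(row)
    return dict(groups)

by_date = partition_filenames(SAMPLE_CSV, ["date"])
by_colc_and_date = partition_filenames(SAMPLE_CSV, ["colc", "date"])
print(list(by_date))
print(list(by_colc_and_date))
```

In NiFi terms, the second call would presumably correspond to two user-defined properties on PartitionRecord (for example colc = /colc and date = /date) with filename set to ${colc}-${date}.csv in UpdateAttribute; treat that mapping as an assumption to verify against your flow.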
Upvotes: 1
Reputation: 31460
We can do that using the ExecuteStreamCommand and UpdateAttribute processors.
Flow:
Option 1:
1. QueryDatabaseRecord
2. ConvertRecord
3. ExecuteStreamCommand
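Sample shell script:
This script reads the flow file content and outputs only its second line (the first data row); set the processor's Output Destination Attribute property to attr so that the output is stored in the attr attribute of the flow file.
$ cat second_line.sh
#!/bin/sh
# Print only the second line of the input (a file argument, or stdin if none is given).
cat "$@" | head -n 2 | tail -n 1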
4. UpdateAttribute
Add a new property named filename with the value:
${attr:replaceAll('.+?,.+?,(.*)','$1'):replace(',','-')}.csv
The output flow file from the UpdateAttribute processor will have the desired filename.
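To see what the filename expression does to the extracted row, here is a small Python sketch that mimics the same two steps outside NiFi. The function name filename_from_row is hypothetical, and the strip() call is an added assumption to cope with a trailing newline in the script output.

```python
import re

def filename_from_row(row):
    """Mimic the expression above: drop the first two comma-separated
    fields, then join what is left with hyphens and append .csv."""
    # Assumption: remove any trailing newline carried over from the script output.
    row = row.strip()
    # Same pattern as the replaceAll call; the capture group keeps "colc,date".
    tail = re.sub(r".+?,.+?,(.*)", r"\1", row)
    return tail.replace(",", "-") + ".csv"

print(filename_from_row("A1,123,vin9,2020-02-04\n"))
```

If the attr attribute does end up carrying a trailing newline in NiFi, adding a trim() call to the expression may be needed; that is worth checking in your own flow rather than taking as given.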
Option 2:
Another way is to use the ExtractText processor. Add a new property named attr with the value:
^.*\r?\n(.*)
Once attr holds the second line of the content, use an UpdateAttribute processor (configured as in Option 1) to change the filename.
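The regular expression can be tried outside NiFi with a short Python sketch like the one below. The helper name second_line is hypothetical, and Python's regex engine differs slightly from Java's (in Python, "." also matches a carriage return), so the sketch strips a trailing "\r" as a precaution.

```python
import re

# Same pattern as the ExtractText property: skip the first line,
# then capture everything on the second line.
SECOND_LINE = re.compile(r"^.*\r?\n(.*)")

def second_line(content):
    """Return the second line of content, or None if there is no line break."""
    match = SECOND_LINE.search(content)
    if match is None:
        return None
    # Python's "." matches "\r", so drop one left behind by Windows line endings.
    return match.group(1).rstrip("\r")

content = "cola,colb,colc,date\nA1,123,vin9,2020-02-04\nA2,456,vin9,2020-02-04\n"
print(second_line(content))
```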
Upvotes: 1