Reputation: 309
I'm trying to read a log file like this one:
199.72.81.55 - - [01/Jul/1995:00:00:01 -0400] "GET /history/apollo/ HTTP/1.0" 200 6245
unicomp6.unicomp.net - - [01/Jul/1995:00:00:06 -0400] "GET /shuttle/countdown/ HTTP/1.0" 200 3985
199.120.110.21 - - [01/Jul/1995:00:00:09 -0400] "GET /shuttle/missions/sts-73/mission-sts-73.html HTTP/1.0" 200 4085
burger.letters.com - - [01/Jul/1995:00:00:11 -0400] "GET /shuttle/countdown/liftoff.html HTTP/1.0" 304 0
199.120.110.21 - - [01/Jul/1995:00:00:11 -0400] "GET /shuttle/missions/sts-73/sts-73-patch-small.gif HTTP/1.0" 200 4179
I send 1000 lines each time I run this exercise. I'm using a SplitText processor, and in the ExtractText processor I use these regexes:
successCode -> ^[0-9A-Z\-a-z\.]* - - \[[0-9A-Za-z\/\:]* -[0-9]*\] \"[A-Z]* [0-9A-Za-z\/\.\- ]*\" ([0-9]*) [0-9]*
tiemStamp -> ^[0-9A-Z\-a-z\.]* - - \[([0-9A-Za-z\/\:]*) -[0-9]*\] \"[A-Z]* [0-9A-Za-z\/\.\- ]*\" [0-9]* [0-9]*
important -> ^([0-9A-Z\-a-z\.]*) - - \[[0-9A-Za-z\/\:]* -[0-9]*\] \"[A-Z]* [0-9A-Za-z\/\.\- ]*\" [0-9]* [0-9]*
There may be a mistake in them; this is probably where my problem is.
Then I tried to send different logs to different routes. If successCode == 200, I try to put the line on the route /user//success/%{tiemStamp}/, but all my lines go to the third route: "unmatched".
In the RouteOnContent processor I've tried:
successCode -> ${successCode:equals("200")}
successCode -> ${successCode:contains(2)}
successCode -> ${successCode:contains("2")}
Has anyone worked with the "RouteOnContent" processor?
Upvotes: 2
Views: 10931
Reputation: 706
Basically, you can use either RouteOnAttribute or RouteOnContent, but each uses different parameters.
If you choose to use ExtractText, the properties you defined are populated for each row (after the original file has been split by the SplitText processor).
Now you have two options: route on the attributes populated by ExtractText, or route on the content directly. Each processor routes the FlowFile differently:
RouteOnAttribute queries the attributes of the FlowFile with a NiFi Expression Language query. For example, if I define a property 'name', the routing rule is an expression on its value, in the same form as the ${successCode:equals("200")} expression from the question.
On the other hand, RouteOnContent queries the content of the FlowFile with a regular expression, so its properties hold regexes rather than Expression Language conditions.
After defining these properties, each one becomes a dynamic relationship that you can use to route to the next processor.
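The difference between the two processors can be illustrated outside NiFi with a small Python sketch. This is only a rough model, not NiFi code: the `flowfile` dictionary, the two function names, and the `STATUS_200` pattern are assumptions made up for the illustration.

```python
import re

# Hypothetical stand-in for a FlowFile: attributes plus content.
flowfile = {
    "attributes": {"successCode": "200"},
    "content": '199.72.81.55 - - [01/Jul/1995:00:00:01 -0400] '
               '"GET /history/apollo/ HTTP/1.0" 200 6245',
}

def route_on_attribute(ff):
    # Mirrors the idea of ${successCode:equals("200")}:
    # only the attributes are inspected, never the content.
    if ff["attributes"].get("successCode") == "200":
        return "success"
    return "unmatched"

# Illustrative pattern (an assumption, not from the original posts):
# a closing quote, the status 200, then the byte count at end of line.
STATUS_200 = re.compile(r'" 200 [0-9]*$')

def route_on_content(ff):
    # Only the content is inspected; attributes are ignored.
    if STATUS_200.search(ff["content"]):
        return "success"
    return "unmatched"

print(route_on_attribute(flowfile))
print(route_on_content(flowfile))
```

The point of the sketch is that an attribute-based condition needs the attribute to exist (which is what ExtractText provides), while a content-based rule has to be a pattern that matches the log line itself.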
Upvotes: 2
Reputation: 1199
According to the documentation, the ExtractText processor "Evaluates one or more Regular Expressions against the content of a FlowFile. The results of those Regular Expressions are assigned to FlowFile Attributes [...]"
So in the next step you should use a RouteOnAttribute processor, not a RouteOnContent processor.
(If you stop your RouteOnXXX processor in order to keep the messages in the queue, you can inspect the flowfiles. On the "Attributes" tab of a flowfile, you can see the values of the different attributes. I can confirm that with your regexp I get successCode=200.)
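The regexp can also be checked quickly outside NiFi. The following is a minimal Python sketch, assuming Python's `re` module treats this particular pattern the same way as the Java regex engine NiFi uses. The helper name `extract_success_code` is hypothetical; the pattern and the sample lines are copied from the question.

```python
import re

# Two sample lines copied from the question.
LOG_LINES = [
    '199.72.81.55 - - [01/Jul/1995:00:00:01 -0400] '
    '"GET /history/apollo/ HTTP/1.0" 200 6245',
    'burger.letters.com - - [01/Jul/1995:00:00:11 -0400] '
    '"GET /shuttle/countdown/liftoff.html HTTP/1.0" 304 0',
]

# The successCode pattern from the question, unchanged
# (split over two raw strings only for readability).
SUCCESS_CODE = re.compile(
    r'^[0-9A-Z\-a-z\.]* - - \[[0-9A-Za-z\/\:]* -[0-9]*\] '
    r'\"[A-Z]* [0-9A-Za-z\/\.\- ]*\" ([0-9]*) [0-9]*'
)

def extract_success_code(line):
    """Return capture group 1, roughly what ends up in the successCode attribute."""
    match = SUCCESS_CODE.search(line)
    return match.group(1) if match else None

for line in LOG_LINES:
    code = extract_success_code(line)
    # Attribute-style routing decision on the extracted value.
    route = "success" if code == "200" else "unmatched"
    print(code, route)
```

With these two lines, the first is routed to "success" and the second to "unmatched", which is consistent with the regexp itself being fine and the routing step being the part that needs to change.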
Upvotes: 3