Peter

Reputation: 41

Parse received data as JSON with Fluentd

I am trying to receive data with Fluentd from an external system. The data looks like this: data={"version":"0.0","secret":null}

The response is: 400 Bad Request 'json' or 'msgpack' parameter is required

I cannot change the real source, but if I send the same string with "json" instead of "data" (like json={"version":"0.0","secret":null}), everything is OK. How can I configure Fluentd to accept the "data" form the same way? Thanks.

Example fluent.conf:

<source>                                                  
  @type http                                              
  port 24224                                              
  bind 0.0.0.0          

  # accept '{"key":"value"}' input
  format json

  # accept 'json={"key":"value"}' input
  #format default
</source>                                               
<match **>                                              
  @type file                                            
  @id   output1                                         
  path         /fluentd/log/data.*.log                  
  symlink_path /fluentd/log/data.log                   
  format json                                           
  append       true                                     
  time_slice_format %Y%m%d                              
  time_slice_wait   10m                                 
  time_format       %Y%m%dT%H%M%S%z                     
</match>

I have tried using a regex and modifying the data with nginx. A regex is not feasible because the data is encoded and complex, and I did not find a way to modify POST data with nginx (which would be a poor approach anyway).

Upvotes: 1

Views: 6308

Answers (2)

Peter

Reputation: 41

I'll answer my own question. After trying many configurations (and hours of reading the official Fluentd and nginx documentation and blogs), I decided to write a parser plugin (http://docs.fluentd.org/articles/plugin-development#parser-plugins). I ended up with this solution:

  1. Parser plugin

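    require 'json'
    require 'webrick/httputils'

    module Fluent
      class TextParser
        class CMXParser < Parser
          # Register this parser under the name used by "format" in the config
          Plugin.register_parser("parser_CMX", self)

          # Name of the form field that holds the JSON payload ("data" by default)
          config_param :format_hash, :string, :default => "data"

          def configure(conf)
            super
          end

          # This is the main method. The input "text" is the unit of data to be parsed.
          def parse(text)
            # Split the form-encoded body into a hash of field name => value
            params = WEBrick::HTTPUtils.parse_query(text)
            # Parse the JSON stored in the configured field
            record = JSON.parse(params[@format_hash])
            # No timestamp is extracted, so nil is yielded as the time
            yield nil, record
          end
        end
      end
    end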
    
  2. Config for Fluentd

    <source>                                
      @type http                            
      port 24224                            
      bind 0.0.0.0                          
      body_size_limit 32m                   
      keepalive_timeout 5s                  
      format parser_CMX
    </source>                     
    
    <match **>                            
      @type file                          
      @id   output1                       
      path         /fluentd/log/data.*.log
      symlink_path /fluentd/log/data.log  
      format json                         
      append       true                   
      time_slice_format %Y%m%d            
      time_slice_wait   10m               
      time_format       %Y%m%dT%H%M%S%z   
    </match>                    
    

I think there is room to implement this in the core code, because the base in_http script does the same thing, except that it uses only the hardcoded string "params['json']". It could use a new option such as "format_hash"/"format_map" that holds a map for this purpose.
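To make the parsing step easier to follow outside of Fluentd, here is a minimal standalone sketch of the same idea: decode the form-encoded body, pick one field, and parse its value as JSON. The helper name extract_json_field is hypothetical and not part of Fluentd, and the sketch uses URI.decode_www_form from the Ruby standard library instead of WEBrick, so it is an illustration under those assumptions rather than the plugin itself.

```ruby
require 'json'
require 'uri'

# Hypothetical helper (not a Fluentd API): pull the JSON document out of
# one named field of a form-encoded request body.
def extract_json_field(body, field = "data")
  # Decode "key=value&key2=value2" into a hash of field name => value
  params = URI.decode_www_form(body).to_h
  raw = params[field]
  raise ArgumentError, "missing '#{field}' parameter" if raw.nil?
  # Parse the field value as JSON and return the resulting record
  JSON.parse(raw)
end

record = extract_json_field('data={"version":"0.0","secret":null}')
puts record.inspect
```

Passing "json" as the second argument would read the field that in_http expects by default, which is the kind of configurable field name suggested above.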

Upvotes: 2

repeatedly

Reputation: 718

http://docs.fluentd.org/articles/in_http

This article lists the accepted formats.

How can I configure Fluentd to accept it the same way?

Do you mean that you want to parse data={"k":"v"} with format json? If so, that is not possible.
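As a rough illustration of why, assuming the JSON parser is handed the body as-is: the string data={"k":"v"} is not a JSON document, so a plain JSON parse fails until the data= prefix is removed. This sketch uses only Ruby's json library and does not reproduce Fluentd's internal code path.

```ruby
require 'json'

body = 'data={"k":"v"}'

# Parsing the whole body as JSON fails because of the "data=" prefix
begin
  JSON.parse(body)
  parsed = true
rescue JSON::ParserError
  parsed = false
end

# Only the part after "data=" is valid JSON
json_part = body.sub(/\Adata=/, '')
record = JSON.parse(json_part)

puts "whole body parsed: #{parsed}"
puts "record: #{record.inspect}"
```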

Upvotes: 0
