ata

Reputation: 9011

logstash with elasticsearch_http

Apparently my Logstash OnDemand account does not work, so I could not post this as an issue there.

Anyway, I have a Logstash setup with Redis, Elasticsearch, and Kibana. My Logstash shippers are collecting logs from several files and putting them into Redis just fine.

Logstash version 1.3.3, Elasticsearch version 1.0.1.

The only setting I have in the elasticsearch_http output for Logstash is the host name. The whole setup seems to glue together just fine.

The problem is that elasticsearch_http is not consuming the Redis entries as they come. Running it in debug mode, I have seen that it flushes about 100 entries every 1 minute (the default values of flush_size and idle_flush_time). The documentation, as I understand it, states that it will force a flush when the flush_size of 100 is not reached (for example, if we only had 10 messages in the last minute). But it seems to work the other way around: it only flushes about 100 messages every minute. I changed the size to 2000 and it flushed about 2000 every minute or so.
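
For reference, this is roughly what the output block looked like during that test; only flush_size was added on top of the config below (2000 was just the value I happened to try):

output {
 elasticsearch_http {
  host => "1xx.xxx.xxx.93"
  flush_size => 2000   # raised from the default of 100; it then flushed ~2000 entries per minute
 }
}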

Here is my logstash-indexer.conf

input {
 redis {
  host => "1xx.xxx.xxx.93"
  data_type => "list"
  key => "testlogs"
  codec => json
 }
}
output {
 elasticsearch_http {
  host => "1xx.xxx.xxx.93"
 }
}

Here is my elasticsearch.yml

cluster.name: logger
node.name: "logstash"
transport.tcp.port: 9300
http.port: 9200
discovery.zen.ping.unicast.hosts: ["1xx.xxx.xxx.93:9300"]
discovery.zen.ping.multicast.enabled: false
#discovery.zen.ping.unicast.enabled: true
network.bind_host: 1xx.xxx.xxx.93
network.publish_host: 1xx.xxx.xxx.93

The indexer, Elasticsearch, Redis, and Kibana are on the same server. The log collection from files is done on another server.
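
For completeness, the shipper on the other server is essentially just a file input feeding the same Redis list (the paths here are placeholders, not my real ones):

input {
 file {
  path => ["/var/log/myapp/*.log"]   # placeholder path; I read several files like this
 }
}
output {
 redis {
  host => "1xx.xxx.xxx.93"
  data_type => "list"
  key => "testlogs"
 }
}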

Upvotes: 0

Views: 2789

Answers (1)

John Petrone

Reputation: 27497

So I'm going to suggest a couple of different approaches to solve your problem. Logstash, as you are discovering, can be a bit quirky, so I've found these approaches useful in dealing with unexpected behavior from it.

  1. Use the elasticsearch output instead of elasticsearch_http. You can get the same functionality by using the elasticsearch output with protocol set to http. The elasticsearch output is more mature (milestone 2 vs milestone 3) and I've seen this change make a difference before (see the config sketch after this list).
  2. Set idle_flush_time and flush_size explicitly instead of relying on the defaults. There have been issues with Logstash defaults previously, and I've found it a lot safer to set them explicitly. idle_flush_time is in seconds; flush_size is the number of records to flush.
  3. Upgrade to a more recent version of logstash. There is enough of a change in how logstash is deployed with version 1.4.X (http://logstash.net/docs/1.4.1/release-notes) that I'd bite the bullet and upgrade. It's also significantly easier to get attention if you still have a problem with the most recent stable major release.
  4. Make certain your Redis version matches those supported by your Logstash version.
  5. Experiment with setting the batch, batch_events, and batch_timeout values for the Redis output. You are using the list data_type, which supports various batch options, and as with some other parameters it's best not to assume the defaults are always being set correctly (these options are also covered in the sketch after this list).
  6. Do all of the above. In addition to trying the first set of suggestions, I'd try all of them together in various combinations.
  7. Keep careful records of each test run. It seems obvious, but with all the variations above it's easy to lose track, so I'd keep careful records and try to change only one variable at a time.
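
To make points 1, 2 and 5 concrete, here is a rough, untested sketch of what those changes could look like (assuming Logstash 1.4.x per point 3; the flush and batch values are illustrative, not recommendations):

# Indexer: elasticsearch output over HTTP with explicit flush settings (points 1 and 2)
output {
 elasticsearch {
  host => "1xx.xxx.xxx.93"
  protocol => "http"
  flush_size => 100        # flush once this many events are queued...
  idle_flush_time => 1     # ...or after this many seconds, whichever comes first
 }
}

# Shipper: redis output with explicit batch settings (point 5)
output {
 redis {
  host => "1xx.xxx.xxx.93"
  data_type => "list"
  key => "testlogs"
  batch => true            # push events to the list in batches
  batch_events => 50       # events per batch
  batch_timeout => 5       # seconds to wait before flushing a partial batch
 }
}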

Upvotes: 1
