Reputation: 20090
As part of my tuning, I've been adjusting the maxSpoutPending parameter. However, it would be nice to know how many tuples are in the topology at any given time, so I can tell how much of an impact this parameter is having on my topologies' performance.
I dug around in the source but didn't find anything. Is this a value I can find in the Storm UI? Or possibly I can override something somewhere to log this value?
Upvotes: 2
Views: 592
Reputation: 3172
You said you're looking for insight into the effectiveness of the maxSpoutPending setting.
Working with the KafkaSpout provided by Storm (I've modified the source code to add more logging to see what's happening), I see the next() method getting called all the time (<1ms). So I've always seen a relatively fast turnaround (<1ms) between a Tuple getting ack'd or failed (which reduces the pending count) and a new tuple getting sent into the topology (which hits the max pending limit again). Here are logs from today showing the timestamps from when a Tuple gets ack'd and then another one gets sent out.
2015-10-16T12:20:15.162-0500 s.k.PartitionManager [INFO] PM! 6 - ack
2015-10-16T12:20:15.163-0500 s.k.PartitionManager [INFO] PM! 177 - next
2015-10-16T12:20:15.400-0500 s.k.PartitionManager [INFO] PM! 10 - ack
2015-10-16T12:20:15.401-0500 s.k.PartitionManager [INFO] PM! 178 - next
2015-10-16T12:20:15.649-0500 s.k.PartitionManager [INFO] PM! 22 - ack
2015-10-16T12:20:15.649-0500 s.k.PartitionManager [INFO] PM! 180 - next
2015-10-16T12:20:16.511-0500 s.k.PartitionManager [INFO] PM! 27 - ack
2015-10-16T12:20:16.512-0500 s.k.PartitionManager [INFO] PM! 182 - next
This shows a nearly instantaneous turnaround. So for my use case there are pretty much always as many tuples in my topology as the max pending limit allows.
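If you want to measure the ack-to-next gap rather than eyeball it, the log lines can be parsed directly. This is only a rough sketch in plain Java: AckToNextGaps and gapsMillis are names I made up, and it assumes every line starts with a timestamp in the format shown above and ends in "- ack" or "- next" (which is just how my added logging happens to look).

```java
import java.time.Duration;
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Storm: pairs each "ack" line with the
// next "next" line and reports the gap between them in milliseconds.
class AckToNextGaps {
    // Matches timestamps such as 2015-10-16T12:20:15.162-0500
    private static final DateTimeFormatter TS =
            DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ss.SSSZ");

    static List<Long> gapsMillis(List<String> lines) {
        List<Long> gaps = new ArrayList<>();
        OffsetDateTime lastAck = null;
        for (String line : lines) {
            // The timestamp is everything up to the first space.
            OffsetDateTime ts = OffsetDateTime.parse(line.substring(0, line.indexOf(' ')), TS);
            if (line.endsWith("- ack")) {
                lastAck = ts;
            } else if (line.endsWith("- next") && lastAck != null) {
                gaps.add(Duration.between(lastAck, ts).toMillis());
                lastAck = null; // each ack is matched with at most one next
            }
        }
        return gaps;
    }
}
```

Fed the eight lines above, this gives gaps of 1, 1, 0 and 1 ms.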
My tuples also don't get processed very quickly (~1 sec each), so I couldn't say how this holds for tuples that get processed much faster or for different types of Spouts.
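If you'd rather log the in-flight count directly than infer it from timestamps, one option is to keep a counter in the spout: bump it whenever a tuple is emitted with a message ID, and decrement it in both ack() and fail(). Below is a minimal sketch in plain Java. InFlightCounter and its methods are made up for illustration and are not part of Storm; you would have to wire the calls into your own spout (or a modified KafkaSpout) yourself.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper, not a Storm class: tracks tuples that have been
// emitted but not yet acked or failed.
class InFlightCounter {
    private final AtomicLong inFlight = new AtomicLong();
    private final AtomicLong peak = new AtomicLong();

    // Call right after emitting a tuple with a message ID.
    long onEmit() {
        long now = inFlight.incrementAndGet();
        peak.accumulateAndGet(now, Math::max); // remember the highest value seen
        return now;
    }

    // Call from both ack() and fail(): either one ends the tuple's time in flight.
    long onAckOrFail() {
        return inFlight.decrementAndGet();
    }

    long current() {
        return inFlight.get();
    }

    long peak() {
        return peak.get();
    }
}
```

Logging current() every so often (or peak() at the end of a run) would show how close each spout task sits to the limit. Keep in mind that the max spout pending limit applies per spout task, so the counter is per task as well.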
Upvotes: 1
Reputation: 62330
It depends on what you mean by "how many tuples are in the topology".
The TopologyInfo returned by client.getTopologyInfo("topologyName") (with client = NimbusClient.getConfiguredClient(...)) might still be helpful, but I am not sure if/how to compute the value you want to know.
Upvotes: 1
Reputation: 2647
Given that you have enough messages in your spout, you can force the spout to read from the beginning and see how many tuples you can process in 10 minutes (and with basic math you can obtain the number of tuples per second).
For example, with a Kafka spout you can do the following:
SpoutConfig spoutConfig = new SpoutConfig(
// your spout config
);
spoutConfig.forceFromStart = true; // this is how you tell the spout to read from the oldest kafka offset
KafkaSpout kafkaSpout = new KafkaSpout(spoutConfig);
And then let the topology run for 15 minutes and see how many tuples the topology processed in the last 10 minutes.
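The "basic math" is just the difference between two readings of the processed-tuple count, divided by the seconds between them. A small sketch follows; Throughput and tuplesPerSecond are names I made up, and the counts in the usage note are placeholders, not measurements.

```java
// Hypothetical helper: average throughput between two readings of a
// cumulative processed-tuple counter (for example, the acked count).
class Throughput {
    static double tuplesPerSecond(long countAtStart, long countAtEnd, long elapsedSeconds) {
        if (elapsedSeconds <= 0) {
            throw new IllegalArgumentException("elapsedSeconds must be positive");
        }
        // Cast to double so the division is not truncated to a whole number.
        return (countAtEnd - countAtStart) / (double) elapsedSeconds;
    }
}
```

So if the count reads 90,000 at the 5 minute mark and 450,000 at the 15 minute mark, tuplesPerSecond(90_000, 450_000, 600) comes out to 600.0 tuples per second. Skipping the first 5 minutes keeps startup effects out of the average.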
Upvotes: 0