padawan

Reputation: 62

Storm 2.0.0 going Out Of Memory

I upgraded my code base from Storm 1.1.1 to 2.0.0. Now, when I run the topology in local mode, it runs out of memory after a few minutes.

[THREAD ID=AsyncLocalizer Executor - 2-EventThread] Dev-APC180-local o.a.s.s.o.a.z.ClientCnxn Error while calling watcher
java.lang.OutOfMemoryError: unable to create new native thread
    at java.lang.Thread.start0(Native Method)
    at java.lang.Thread.start(Thread.java:717)
    at org.apache.storm.shade.org.apache.zookeeper.ClientCnxn.start(ClientCnxn.java:421)
    at org.apache.storm.shade.org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:454)
    at org.apache.storm.shade.org.apache.curator.utils.DefaultZookeeperFactory.newZooKeeper(DefaultZookeeperFactory.java:29)
    at org.apache.storm.shade.org.apache.curator.framework.imps.CuratorFrameworkImpl$2.newZooKeeper(CuratorFrameworkImpl.java:213)
    at org.apache.storm.shade.org.apache.curator.HandleHolder$1.getZooKeeper(HandleHolder.java:101)
    at org.apache.storm.shade.org.apache.curator.HandleHolder.getZooKeeper(HandleHolder.java:57)
    at org.apache.storm.shade.org.apache.curator.ConnectionState.reset(ConnectionState.java:204)
    at org.apache.storm.shade.org.apache.curator.ConnectionState.handleExpiredSession(ConnectionState.java:380)
    at org.apache.storm.shade.org.apache.curator.ConnectionState.checkState(ConnectionState.java:315)
    at org.apache.storm.shade.org.apache.curator.ConnectionState.process(ConnectionState.java:169)
    at org.apache.storm.shade.org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:533)
    at org.apache.storm.shade.org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:508)

Above is the stack trace of the error. On deeper analysis I found around 5000 threads named 'AsyncLocalizer EventThread' and 'AsyncLocalizer SendThread'. They are spawned by AsyncLocalizer.updateBlobs.

AsyncLocalizer.updateBlobs is a scheduled task that runs every 30 seconds. Please point me in the right direction. I have no idea what I missed.
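
For anyone who wants to confirm the same thread growth, here is a minimal sketch of how the count can be taken from inside the JVM. It assumes the topology runs in local mode in the same process as the code that calls it. The class name ThreadCensus and its helper methods are hypothetical and not part of Storm; the only library call used is the standard Thread.getAllStackTraces(). Running jstack against the process and counting matching lines should give comparable numbers.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Storm: counts live JVM threads by name fragment.
class ThreadCensus {

    // Number of thread names that contain the given fragment, e.g. "EventThread".
    static long countMatching(List<String> threadNames, String needle) {
        return threadNames.stream().filter(n -> n.contains(needle)).count();
    }

    // Names of all threads currently alive in this JVM.
    static List<String> liveThreadNames() {
        List<String> names = new ArrayList<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            names.add(t.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> names = liveThreadNames();
        System.out.println("total live threads: " + names.size());
        System.out.println("EventThread: " + countMatching(names, "EventThread"));
        System.out.println("SendThread:  " + countMatching(names, "SendThread"));
    }
}
```

Printing these counts every 30 seconds or so while the local cluster is running should show whether the EventThread and SendThread numbers keep climbing instead of levelling off.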

Upvotes: 0

Views: 220

Answers (1)

Stig Rohde Døssing

Reputation: 3651

This is very likely due to https://issues.apache.org/jira/browse/STORM-3501. Blob cleanup is broken in 2.0.0, so the supervisor keeps trying to download blobs that are actually deleted. I think this also causes it to start a huge number of Curator instances.

Upvotes: 1
