Reputation: 11
I have Storm running locally in a single-machine setup. I want to submit a topology with an alternative YAML configuration for the crawler, but the topology fails because it cannot load a property that is defined in that configuration file.
I am trying to submit the topology to the Storm cluster using this command:
storm jar stormcrawler.jar topology.BasicTopology -conf conf/crawler-conf.yaml
The crawler-conf.yaml contains the following properties:
config:
  topology.workers: 1
  topology.es.spouts: 1
When I run the script I get this error:
Exception in thread "main" java.lang.NullPointerException
at topology.BasicTopology.run(BasicTopology.java:78)
This is the bit of code in the BasicTopology class:
@Override
protected int run(String[] args) {
    log.info(this.conf.keySet().toString());
    // NPE on the next line: the key is missing, so get() returns null and the (int) cast fails
    int nbWorkers = (int) this.conf.get("topology.workers");
As far as I have been able to investigate, the problem is that the storm.py script treats "-conf" as its "-c" flag (used for common config options) with "onf" attached as the value. In other words, it thinks we are trying to set "onf" as a Storm option, so it runs Storm with:
-Dstorm.options=onf
Once "onf" has been consumed as a Storm option, the only argument passed on to the topology is "conf/crawler-conf.yaml". Since that argument is no longer preceded by "-conf", the YAML file is never parsed for its properties.
This didn't happen in 1.2.2 but does happen in 2.3.0 (argparse was introduced in the storm.py script).
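To illustrate the behaviour described above, here is a minimal sketch in plain Python. It assumes a parser that only declares a "-c" option; the parser, the `split_args` helper and the `storm_opts` name are hypothetical stand-ins, not the actual storm.py code. argparse accepts a value attached directly to a short option, so "-conf" is read as "-c" with the value "onf". It also treats everything after a bare "--" as positional, which keeps "-conf" intact.

```python
import argparse

# Hypothetical stand-in for the launcher's parser: a single "-c" option that
# takes one value and may be repeated. The real storm.py is more elaborate.
parser = argparse.ArgumentParser(prog="storm-sketch")
parser.add_argument("-c", action="append", dest="storm_opts")


def split_args(argv):
    """Return (values captured by -c, arguments left over for the topology)."""
    namespace, leftover = parser.parse_known_args(argv)
    return (namespace.storm_opts or []), leftover


# "-conf" is parsed as "-c" plus the attached value "onf".
opts, rest = split_args(
    ["topology.BasicTopology", "-conf", "conf/crawler-conf.yaml"]
)
print(opts)  # ['onf']
print(rest)  # ['topology.BasicTopology', 'conf/crawler-conf.yaml']

# With a bare "--", everything after it is treated as positional,
# so "-c" captures nothing and "-conf" survives for the topology.
opts_sep, rest_sep = split_args(
    ["topology.BasicTopology", "--", "-conf", "conf/crawler-conf.yaml"]
)
print(opts_sep)
print(rest_sep)
```

In the first call the topology would only receive the bare file path, which matches the symptom: nothing tells it to load the YAML file, so "topology.workers" is never set.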
Upvotes: 0
Views: 63
Reputation: 4864
Try
storm local target/stormcrawler.jar --local-ttl 3600 topology.BasicTopology -- -conf conf/crawler-conf.yaml
See https://github.com/DigitalPebble/storm-crawler/tree/master/archetype/src/main/resources/archetype-resources for an example of what is generated from the archetype.
If you are porting an existing SC 1.x project to 2.x, it might be a good idea to start clean from the archetype and add the bits that are specific to your application. Quite a few things have changed since 1.x, and this is a good way of making sure that you haven't forgotten anything.
I would also recommend that you consider using Flux files as they are more flexible than a hard-coded topology. Most users, including myself, use them.
Upvotes: 0