Reputation: 2032
My Java EE app uses crawler4j which starts the crawl using the following code:
CrawlConfig config = new CrawlConfig();
config.setCrawlStorageFolder("C:/crawler4j_storage");
PageFetcher pageFetcher = new PageFetcher(config);
RobotstxtConfig robotstxtConfig = new RobotstxtConfig();
RobotstxtServer robotstxtServer = new RobotstxtServer(robotstxtConfig, pageFetcher);
CrawlController controller = new CrawlController(config, pageFetcher, robotstxtServer);
controller.start(Crawler.class, 1);
The EJB is injected in Crawler.class:
@Stateless
@LocalBean
public class Crawler extends WebCrawler {
    @Inject
    private SeedFacadeLocal seedEJB;

    public void doSomething() {
        seedEJB.findAll(); // throws the NullPointerException
    }
}
My guess is that it has something to do with the way Crawler.class is passed as an argument. SeedFacadeLocal is a @Local bean interface which has a @Stateless bean implementation. I inject this bean in numerous other places and it works fine.
I think that starting the crawl with "controller.start(Crawler.class, 1)" results in Crawler being instantiated as a POJO instead of an EJB. Therefore the annotations in Crawler.class are ignored.
Upvotes: 0
Views: 618
Reputation: 47163
CrawlController creates instances of crawlers with a simple newInstance call.
That won't do any sort of injection, so your crawler's injected fields will be null.
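To illustrate why the field stays null, here is a minimal, container-free sketch. The nested Inject annotation and SeedFacade interface are hypothetical stand-ins for javax.inject.Inject and your SeedFacadeLocal, and createLikeTheController only approximates what the controller does; it is not crawler4j's actual code.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class NewInstanceDemo {

    // Hypothetical stand-in for javax.inject.Inject, so the sketch runs without a container.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    public @interface Inject {}

    // Hypothetical stand-in for the SeedFacadeLocal interface.
    public interface SeedFacade {}

    public static class Crawler {
        @Inject
        private SeedFacade seedEJB;

        public boolean isInjected() {
            return seedEJB != null;
        }
    }

    // Plain reflective instantiation: no container is involved,
    // so nothing ever reads the @Inject annotation.
    public static Crawler createLikeTheController(Class<Crawler> type)
            throws ReflectiveOperationException {
        return type.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        Crawler crawler = createLikeTheController(Crawler.class);
        System.out.println("injected: " + crawler.isInjected()); // prints "injected: false"
    }
}
```

An annotation is only metadata. Unless the container creates the object, nobody acts on it.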
If you want to use an injected crawler, then you will need to take control of the way that the CrawlController creates crawlers. However, there is no obvious way to do that; it is rather badly designed from that point of view.
What you will probably have to do is to separate out your domain logic (the stuff you write in your EJB) from the crawler class, and write a simple, newInstance-able, crawler class which calls the EJB when appropriate. The EJB would not itself be a crawler. You can get hold of a reference to an EJB using JNDI.
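A rough sketch of the JNDI lookup part, using only javax.naming from the JDK. The EjbLocator class and the JNDI name are hypothetical; the real name depends on your application and module names, and your server normally logs the portable names at deployment. Whether the lookup works from the threads crawler4j starts may also depend on the container.

```java
import javax.naming.InitialContext;
import javax.naming.NamingException;

public class EjbLocator {

    // Hypothetical portable JNDI name; replace it with the name your server reports.
    public static final String SEED_FACADE_JNDI = "java:global/myapp/SeedFacade";

    // Looks up a bean by JNDI name and casts it to the expected interface.
    // Throws NamingException if the name cannot be resolved, for example
    // when the code runs outside the application server.
    public static <T> T lookup(String jndiName, Class<T> type) throws NamingException {
        InitialContext ctx = new InitialContext();
        try {
            return type.cast(ctx.lookup(jndiName));
        } finally {
            ctx.close();
        }
    }

    // Inside the plain (non-EJB) crawler class you would then write something like:
    //   SeedFacadeLocal seedEJB =
    //       EjbLocator.lookup(EjbLocator.SEED_FACADE_JNDI, SeedFacadeLocal.class);
    //   seedEJB.findAll();
}
```

The crawler then stays a plain class that crawler4j can instantiate itself, while the EJB keeps its container-managed lifecycle.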
Upvotes: 1