Reputation: 180
i am using Crawler4j Example code but i found that i got an exception.
Here is my exception :
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/http/conn/scheme/SchemeSocketFactory
at LocalDataCollectorController.main(LocalDataCollectorController.java:24)
Caused by: java.lang.ClassNotFoundException: org.apache.http.conn.scheme.SchemeSocketFactory
Here is my code :
public static void main(String[] args) throws Exception {
String root Folder = "D:\\";
int numberOfCrawlers = 5;
System.out.println("numberOfCrawlers"+numberOfCrawlers);
System.out.println(rootFolder);
CrawlConfig config = new CrawlConfig();
config.setCrawlStorageFolder(rootFolder);
config.setMaxPagesToFetch(10);
config.setPolitenessDelay(1000);
PageFetcher pageFetcher = new PageFetcher(config);
RobotstxtConfig robotstxtConfig = new RobotstxtConfig();
RobotstxtServer robotstxtServer = new RobotstxtServer(robotstxtConfig, pageFetcher);
CrawlController controller = new CrawlController(config, pageFetcher, robotstxtServer);
controller.addSeed("http://www.ohloh.net/p/crawler4j");
controller.start(LocalDataCollectorCrawler.class, numberOfCrawlers);
List<Object> crawlersLocalData = controller.getCrawlersLocalData();
long totalLinks = 0;
long totalTextSize = 0;
int totalProcessedPages = 0;
for (Object localData : crawlersLocalData) {
CrawlStat stat = (CrawlStat) localData;
totalLinks += stat.getTotalLinks();
totalTextSize += stat.getTotalTextSize();
totalProcessedPages += stat.getTotalProcessedPages();
}
System.out.println("Aggregated Statistics:");
System.out.println(" Processed Pages: " + totalProcessedPages);
System.out.println(" Total Links found: " + totalLinks);
System.out.println(" Total Text Size: " + totalTextSize);
}
}
Upvotes: 2
Views: 19328
Reputation: 39641
Download the HttpClient
and add it to your build path.
There is also a package which contains all crawler4j dependencies in the download section. You should use this to avoid further problems.
Upvotes: 5
Reputation: 5811
The reason for NoClassDefFoundError
is always the same: you didn't supply the dependency class during runtime execution. In other words, when you ran your example, you didn't put HttpClient's JAR-file on the classpath. Do this and the problem will go away.
Upvotes: 2