Reputation: 33
I am working on a dynamic web project. I want to write a servlet class that responds to a form submit request and performs some cluster-computing tasks using Apache Spark (for example, calculating pi). The doGet method of the servlet (named Hello) is as follows:
public void doGet(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException {
    String[] args = new String[2];
    args[0] = "local";
    args[1] = "4";

    double count = performSpark.cpi(args);
    // double count = 3.14;

    String text1 = String.valueOf(count);
    response.sendRedirect("wresultjsp.jsp?text1=" + text1);
}
The performSpark class is as follows:
import java.util.ArrayList;
import java.util.List;

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.api.java.function.Function2;

public class performSpark {
    static double cpi(String[] input) {
        JavaSparkContext jsc = new JavaSparkContext(input[0], "performspark",
                System.getenv("SPARK_HOME"), JavaSparkContext.jarOfClass(performSpark.class));

        int slices = (input.length == 2) ? Integer.parseInt(input[1]) : 2;
        int n = 1000000 * slices;
        List<Integer> l = new ArrayList<Integer>(n);
        for (int i = 0; i < n; i++) {
            l.add(1);
        }
        JavaRDD<Integer> dataSet = jsc.parallelize(l);

        // Monte Carlo estimate of pi: count the random points that fall
        // inside the unit circle, then scale by 4.
        int count = dataSet.map(new Function<Integer, Integer>() {
            @Override
            public Integer call(Integer integer) {
                double x = Math.random() * 2 - 1;
                double y = Math.random() * 2 - 1;
                return (x * x + y * y < 1) ? 1 : 0;
            }
        }).reduce(new Function2<Integer, Integer, Integer>() {
            @Override
            public Integer call(Integer integer, Integer integer2) {
                return integer + integer2;
            }
        });

        double result = 4.0 * count / n;
        jsc.stop(); // release the context so a later request does not collide with it
        return result;
    }
}
The spark-assemply-2.10-0.9.1-hadoop2.2.0.jar is copied to WEB-INF/lib.
The build succeeds, but when I run the servlet in a Tomcat 7 server, a java.lang.ClassNotFoundException is reported when the JavaSparkContext is created:
Servlet.service() for servlet [Hello] in context with path [/sparkdemo] threw exception
[Servlet execution threw an exception] with root cause
java.lang.ClassNotFoundException: org.apache.spark.api.java.function.Function
    at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1720)
    at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1571)
    at Hello.doGet(Hello.java:54)
    at Hello.doPost(Hello.java:74)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:646)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
Does anyone know how to correct this problem?
Upvotes: 2
Views: 2396
Reputation: 534
We had a similar use case in our project, where we wanted to submit user queries to Spark interactively from a web application. We achieved this by first creating a SparkSession and then attaching our custom servlet to it via attachHandler(). In our attachHandler method we attached the servlet class to Spark's ServletContextHandler:
ServletContextHandler handler = new ServletContextHandler();
HttpServlet servlet = new <CustomServlet>(spark);
ServletHolder sh = new ServletHolder(servlet);
handler.setContextPath(<root context>);
handler.addServlet(sh, <path>);
spark.sparkContext().ui().get().attachHandler(handler);
Now that the servlet is attached to the Spark UI, say on port 4040, you can submit requests to it directly. We overrode the doGet method of our servlet to accept a JSON payload containing the SQL to be run, and submitted the SQL using
ds = this.spark.sql(query);
We then iterated over the returned dataset and added it to the response object.
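Roughly, such a servlet could look like the sketch below. The class name QueryServlet, the query request parameter (simplified here to a plain parameter instead of the JSON body we used), and the comma-separated plain-text output are placeholders, not our exact code:

import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

// Hypothetical sketch: the SparkSession is handed in through the constructor,
// exactly as in the attachHandler snippet above.
public class QueryServlet extends HttpServlet {
    private final SparkSession spark;

    public QueryServlet(SparkSession spark) {
        this.spark = spark;
    }

    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        String query = request.getParameter("query"); // parameter name assumed
        Dataset<Row> ds = spark.sql(query);           // run the submitted SQL
        response.setContentType("text/plain");
        PrintWriter out = response.getWriter();
        for (Row row : ds.collectAsList()) {          // pull the results to the driver
            out.println(row.mkString(","));           // one comma-separated line per row
        }
    }
}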
Another way to do this is to leverage Apache Livy.
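If you go the Livy route, the web application talks to Spark over REST instead of holding an in-process session. A minimal sketch, assuming a Livy server on its default port 8998 on localhost (an interactive session is created with POST /sessions; statements are then submitted with POST /sessions/{id}/statements):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class LivySessionExample {
    public static void main(String[] args) throws Exception {
        // Ask Livy to start an interactive Spark session.
        URL url = new URL("http://localhost:8998/sessions"); // host/port assumed
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        try (OutputStream os = conn.getOutputStream()) {
            os.write("{\"kind\": \"spark\"}".getBytes(StandardCharsets.UTF_8));
        }
        // Livy answers with a JSON description of the new session (id, state, ...).
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}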
Hope this helps.
Upvotes: 0
Reputation: 33
Finally, I have found the solution.

When the Tomcat server starts, it loads spark-assemply-2.10-0.9.1-hadoop2.2.0.jar and reports an error:

validateJarFile (.....) - jar not loaded. See Servlet Spec3.0 ......

which indicates that some JAR dependency overlaps with classes the container provides. I then opened spark-assemply-2.10-0.9.1-hadoop2.2.0.jar and found the overlap: the assembly bundles a javax/servlet folder, i.e. the Servlet API classes that Tomcat itself already supplies. After deleting the servlet folder, spark-assemply-2.10-0.9.1-hadoop2.2.0.jar is loaded successfully in Tomcat and the ClassNotFoundException is gone.
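For anyone hitting the same conflict: the offending entries can also be stripped without unpacking the whole assembly, e.g. with `zip -d spark-assemply-2.10-0.9.1-hadoop2.2.0.jar "javax/servlet/*"` (zip's -d flag deletes entries from an archive).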
Upvotes: 1