I'm trying to use MLeap on an EMR cluster, but as soon as I use it I get the following error:
self._java_obj = _jvm().ml.combust.mleap.spark.SimpleSparkSerializer()
TypeError: 'JavaPackage' object is not callable
I load the JAR files from within the code.
Script to start the job:
CLUSTER_ID=XXXXX
JOB_NAME=mleap_sample
SCRIPT_PATH=s3://XXXX/mleap_model.py
aws emr --profile xxx add-steps --cluster-id $CLUSTER_ID \
--steps Name=$JOB_NAME,Jar=command-runner.jar,\
Args=[spark-submit,--deploy-mode,client,\
--conf,spark.yarn.submit.waitAppCompletion=true,\
$SCRIPT_PATH],ActionOnFailure=CONTINUE
Inside the code I have:
sc._jsc.addJar(
"s3a://xxxx/mleap_script/mleap-base_2.11-0.16.0-sources.jar")
sc._jsc.addJar(
"s3a://xxxx/mleap_script/mleap-core_2.11-0.16.0-sources.jar")
sc._jsc.addJar(
"s3a://xxxx/mleap_script/mleap-runtime_2.11-0.16.0-sources.jar")
sc._jsc.addJar(
"s3a://xxxx/mleap_script/mleap-spark-base_2.11-0.16.0-sources.jar")
sc._jsc.addJar(
"s3a://xxxx/mleap_script/mleap-spark-extension_2.11-0.16.0-sources.jar")
sc._jsc.addJar(
"s3a://xxxx/mleap_script/mleap-tensor_2.11-0.16.0-sources.jar")
In the EMR logs I see the following while the JARs are loading:
20/07/07 14:01:47 INFO SparkContext: Added JAR s3a://xxxx/mleap_script/mleap-base_2.11-0.16.0-sources.jar at s3a://mds-user-data-new/mleap_script/mleap-base_2.11-0.16.0-sources.jar with timestamp 1594130507784
20/07/07 14:01:47 INFO SparkContext: Added JAR s3a://xxxx/mleap_script/mleap-core_2.11-0.16.0-sources.jar at s3a://mds-user-data-new/mleap_script/mleap-core_2.11-0.16.0-sources.jar with timestamp 1594130507786
20/07/07 14:01:47 INFO SparkContext: Added JAR s3a://xxxx/mleap_script/mleap-runtime_2.11-0.16.0-sources.jar at s3a://mds-user-data-new/mleap_script/mleap-runtime_2.11-0.16.0-sources.jar with timestamp 1594130507788
20/07/07 14:01:47 INFO SparkContext: Added JAR s3a://xxxx/mleap_script/mleap-spark-base_2.11-0.16.0-sources.jar at s3a://mds-user-data-new/mleap_script/mleap-spark-base_2.11-0.16.0-sources.jar with timestamp 1594130507790
20/07/07 14:01:47 INFO SparkContext: Added JAR s3a://xxxx/mleap_script/mleap-spark-extension_2.11-0.16.0-sources.jar at s3a://mds-user-data-new/mleap_script/mleap-spark-extension_2.11-0.16.0-sources.jar with timestamp 1594130507793
20/07/07 14:01:47 INFO SparkContext: Added JAR s3a://xxxx/mleap_script/mleap-tensor_2.11-0.16.0-sources.jar at s3a://mds-user-data-new/mleap_script/mleap-tensor_2.11-0.16.0-sources.jar with timestamp 1594130507796
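One thing that may be worth checking, offered as a guess rather than a confirmed diagnosis: every JAR listed above ends in `-sources.jar`. By Maven convention a `sources` artifact holds source files, not compiled classes, and py4j raises `'JavaPackage' object is not callable` when the requested class is not on the JVM classpath. A minimal sketch of such a check follows; the helper name `find_sources_jars` and the bucket placeholder are hypothetical, not part of MLeap or Spark.

```python
import posixpath


def find_sources_jars(jar_uris):
    """Return the URIs whose file name ends in '-sources.jar'.

    Hypothetical helper: it only inspects file names, it does not open the JARs.
    """
    return [
        uri for uri in jar_uris
        if posixpath.basename(uri).endswith("-sources.jar")
    ]


# JAR names copied from the addJar calls above (bucket elided as in the post).
jars = [
    "s3a://xxxx/mleap_script/mleap-base_2.11-0.16.0-sources.jar",
    "s3a://xxxx/mleap_script/mleap-core_2.11-0.16.0-sources.jar",
    "s3a://xxxx/mleap_script/mleap-runtime_2.11-0.16.0-sources.jar",
    "s3a://xxxx/mleap_script/mleap-spark-base_2.11-0.16.0-sources.jar",
    "s3a://xxxx/mleap_script/mleap-spark-extension_2.11-0.16.0-sources.jar",
    "s3a://xxxx/mleap_script/mleap-tensor_2.11-0.16.0-sources.jar",
]

for uri in find_sources_jars(jars):
    print("source-only artifact, probably no compiled classes:", uri)
```

If the check flags the files, the "Added JAR" log lines would still appear, because they only report that a path was registered and do not show that a class could be loaded from it.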
I'm using Spark version 2.4.5.
Any idea why I'm facing this issue?
PS: I get the same error message if I use PySpark in SageMaker notebooks.
Thanks