Reputation: 173
I am trying to read data from Cassandra into DataFrames using spark-cassandra-connector,
but I am getting the exception below.
Note: the connection to Cassandra is successful.
Spark version: 2.4.1
spark-cassandra-connector version: 2.5.1
Error starting ApplicationContext. To display the conditions report re-run your application with 'debug' enabled.
2021-10-01 11:32:01.649 ERROR 17404 --- [ main] o.s.boot.SpringApplication : Application run failed
java.lang.InstantiationError: com.datastax.oss.driver.internal.core.util.collection.QueryPlan
at com.datastax.spark.connector.cql.LocalNodeFirstLoadBalancingPolicy.newQueryPlan(LocalNodeFirstLoadBalancingPolicy.scala:122) ~[spark-cassandra-connector-driver_2.11-2.5.1.jar:2.5.1]
at com.datastax.oss.driver.internal.core.metadata.LoadBalancingPolicyWrapper.newQueryPlan(LoadBalancingPolicyWrapper.java:155) ~[java-driver-core-shaded-4.11.3.jar:na]
at com.datastax.oss.driver.internal.core.cql.CqlRequestHandler.onThrottleReady(CqlRequestHandler.java:193) ~[java-driver-core-shaded-4.11.3.jar:na]
at com.datastax.oss.driver.internal.core.session.throttling.PassThroughRequestThrottler.register(PassThroughRequestThrottler.java:52) ~[java-driver-core-shaded-4.11.3.jar:na]
at com.datastax.oss.driver.internal.core.cql.CqlRequestHandler.<init>(CqlRequestHandler.java:171) ~[java-driver-core-shaded-4.11.3.jar:na]
at com.datastax.oss.driver.internal.core.cql.CqlRequestAsyncProcessor.process(CqlRequestAsyncProcessor.java:44) ~[java-driver-core-shaded-4.11.3.jar:na]
at com.datastax.oss.driver.internal.core.cql.CqlRequestSyncProcessor.process(CqlRequestSyncProcessor.java:54) ~[java-driver-core-shaded-4.11.3.jar:na]
at com.datastax.oss.driver.internal.core.cql.CqlRequestSyncProcessor.process(CqlRequestSyncProcessor.java:30) ~[java-driver-core-shaded-4.11.3.jar:na]
at com.datastax.oss.driver.internal.core.session.DefaultSession.execute(DefaultSession.java:230) ~[java-driver-core-shaded-4.11.3.jar:na]
at com.datastax.oss.driver.api.core.cql.SyncCqlSession.execute(SyncCqlSession.java:54) ~[java-driver-core-shaded-4.11.3.jar:na]
at com.datastax.oss.driver.api.core.cql.SyncCqlSession.execute(SyncCqlSession.java:78) ~[java-driver-core-shaded-4.11.3.jar:na]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_271]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_271]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_271]
at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_271]
at com.datastax.spark.connector.cql.SessionProxy.invoke(SessionProxy.scala:43) ~[spark-cassandra-connector_2.11-2.5.1.jar:2.5.1]
at com.sun.proxy.$Proxy81.execute(Unknown Source) ~[na:na]
at com.datastax.spark.connector.rdd.partitioner.dht.TokenFactory$$anonfun$1.apply(TokenFactory.scala:99) ~[spark-cassandra-connector_2.11-2.5.1.jar:2.5.1]
at com.datastax.spark.connector.rdd.partitioner.dht.TokenFactory$$anonfun$1.apply(TokenFactory.scala:98) ~[spark-cassandra-connector_2.11-2.5.1.jar:2.5.1]
at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$withSessionDo$1.apply(CassandraConnector.scala:112) ~[spark-cassandra-connector_2.11-2.5.1.jar:2.5.1]
at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$withSessionDo$1.apply(CassandraConnector.scala:111) ~[spark-cassandra-connector_2.11-2.5.1.jar:2.5.1]
at com.datastax.spark.connector.cql.CassandraConnector.closeResourceAfterUse(CassandraConnector.scala:129) ~[spark-cassandra-connector_2.11-2.5.1.jar:2.5.1]
at com.datastax.spark.connector.cql.CassandraConnector.withSessionDo(CassandraConnector.scala:111) ~[spark-cassandra-connector_2.11-2.5.1.jar:2.5.1]
at com.datastax.spark.connector.rdd.partitioner.dht.TokenFactory$.forSystemLocalPartitioner(TokenFactory.scala:98) ~[spark-cassandra-connector_2.11-2.5.1.jar:2.5.1]
at com.datastax.spark.connector.rdd.partitioner.SplitSizeEstimator$class.tokenFactory(SplitSizeEstimator.scala:9) ~[spark-cassandra-connector_2.11-2.5.1.jar:2.5.1]
at com.datastax.spark.connector.rdd.CassandraTableScanRDD.tokenFactory$lzycompute(CassandraTableScanRDD.scala:64) ~[spark-cassandra-connector_2.11-2.5.1.jar:2.5.1]
at com.datastax.spark.connector.rdd.CassandraTableScanRDD.tokenFactory(CassandraTableScanRDD.scala:64) ~[spark-cassandra-connector_2.11-2.5.1.jar:2.5.1]
at com.datastax.spark.connector.rdd.partitioner.SplitSizeEstimator$class.estimateDataSize(SplitSizeEstimator.scala:12) ~[spark-cassandra-connector_2.11-2.5.1.jar:2.5.1]
at com.datastax.spark.connector.rdd.partitioner.SplitSizeEstimator$class.estimateSplitCount(SplitSizeEstimator.scala:21) ~[spark-cassandra-connector_2.11-2.5.1.jar:2.5.1]
at com.datastax.spark.connector.rdd.CassandraTableScanRDD.estimateSplitCount(CassandraTableScanRDD.scala:64) ~[spark-cassandra-connector_2.11-2.5.1.jar:2.5.1]
at com.datastax.spark.connector.rdd.CassandraTableScanRDD$$anonfun$1.apply$mcI$sp(CassandraTableScanRDD.scala:228) ~[spark-cassandra-connector_2.11-2.5.1.jar:2.5.1]
at com.datastax.spark.connector.rdd.CassandraTableScanRDD$$anonfun$1.apply(CassandraTableScanRDD.scala:228) ~[spark-cassandra-connector_2.11-2.5.1.jar:2.5.1]
at com.datastax.spark.connector.rdd.CassandraTableScanRDD$$anonfun$1.apply(CassandraTableScanRDD.scala:228) ~[spark-cassandra-connector_2.11-2.5.1.jar:2.5.1]
at scala.Option.getOrElse(Option.scala:121) ~[scala-library-2.11.12.jar:na]
at com.datastax.spark.connector.rdd.CassandraTableScanRDD.partitionGenerator$lzycompute(CassandraTableScanRDD.scala:228) ~[spark-cassandra-connector_2.11-2.5.1.jar:2.5.1]
at com.datastax.spark.connector.rdd.CassandraTableScanRDD.partitionGenerator(CassandraTableScanRDD.scala:224) ~[spark-cassandra-connector_2.11-2.5.1.jar:2.5.1]
at com.datastax.spark.connector.rdd.CassandraTableScanRDD.getPartitions(CassandraTableScanRDD.scala:273) ~[spark-cassandra-connector_2.11-2.5.1.jar:2.5.1]
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253) ~[spark-core_2.11-2.4.1.jar:2.4.1]
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251) ~[spark-core_2.11-2.4.1.jar:2.4.1]
at scala.Option.getOrElse(Option.scala:121) ~[scala-library-2.11.12.jar:na]
at org.apache.spark.rdd.RDD.partitions(RDD.scala:251) ~[spark-core_2.11-2.4.1.jar:2.4.1]
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2126) ~[spark-core_2.11-2.4.1.jar:2.4.1]
at org.apache.spark.rdd.RDD.count(RDD.scala:1168) ~[spark-core_2.11-2.4.1.jar:2.4.1]
at org.apache.spark.api.java.JavaRDDLike$class.count(JavaRDDLike.scala:455) ~[spark-core_2.11-2.4.1.jar:2.4.1]
at org.apache.spark.api.java.AbstractJavaRDDLike.count(JavaRDDLike.scala:45) ~[spark-core_2.11-2.4.1.jar:2.4.1]
Upvotes: 2
Views: 1155
Reputation: 16313
The error you posted indicates that the embedded Java driver is not able to generate a query plan, that is, the list of Cassandra nodes to connect to as coordinators. There is possibly an issue with how you've defined the contact points.
You normally need to specify a contact point with the spark.cassandra.connection.host parameter. Here's an example of how you would start a Spark shell using the connector:
$ spark-shell \
    --packages com.datastax.spark:spark-cassandra-connector_2.11:2.5.1 \
    --conf spark.cassandra.connection.host=cassandra_ip \
    --conf spark.sql.extensions=com.datastax.spark.connector.CassandraSparkExtensions
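In your case, it looks like you're creating a connection from Spring Boot and you are probably running into conflicts with dependencies. Your stack trace shows spark-cassandra-connector 2.5.1 calling into java-driver-core-shaded 4.11.3, which may not be the driver version the connector was built against.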
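One way to check for such a conflict is to ask the JVM which jar a class was actually loaded from and what kind of type it is. Java throws InstantiationError when code uses `new` on an abstract class or an interface, so the kind of the QueryPlan type on your classpath is worth knowing. The following is only a rough sketch using the JDK alone: the class name `ClasspathCheck` and the `describe` helper are made up for illustration, and it needs to run with the same classpath as your application to tell you anything useful.

```java
import java.lang.reflect.Modifier;
import java.security.CodeSource;

public class ClasspathCheck {

    /** Reports what kind of type a class is and which jar or directory it came from. */
    static String describe(String className) {
        try {
            // initialize=false: load the class without running its static initializers
            Class<?> cls = Class.forName(className, false, ClasspathCheck.class.getClassLoader());

            // Classes loaded by the bootstrap loader (core JDK) have no code source
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            String location = (source == null || source.getLocation() == null)
                    ? "JDK runtime"
                    : source.getLocation().toString();

            // Interfaces also carry the abstract modifier, so test isInterface() first
            String kind = cls.isInterface() ? "interface"
                    : Modifier.isAbstract(cls.getModifiers()) ? "abstract class"
                    : "concrete class";

            return className + " -> " + kind + ", loaded from " + location;
        } catch (ClassNotFoundException | LinkageError e) {
            return className + " -> not loadable: " + e;
        }
    }

    public static void main(String[] args) {
        // Default to the class named in the InstantiationError from the question
        String[] defaults = { "com.datastax.oss.driver.internal.core.util.collection.QueryPlan" };
        for (String name : (args.length > 0 ? args : defaults)) {
            System.out.println(describe(name));
        }
    }
}
```

If this reports QueryPlan as an interface or abstract class loaded from the 4.11.3 driver jar, that would be consistent with the connector having been compiled against a driver version where it was a concrete class. In that case, comparing the driver version in your build's dependency tree with the one the connector declares would be the next step.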
You will need to update your original question with details of your configuration including details of the dependencies plus what command you're running to connect to Spark so those answering your question have a better idea of what the problem is. Cheers!
Upvotes: 1