chrisTina

Reputation: 2368

Spark - convert Scala to Java

I am using Apache Spark to process data.

Given the following Scala code:

val rdd1 = sc.cassandraTable("player", "playerinfo").select("key1", "value")
val rdd2 = rdd1.map(row => (row.getString("key1"), row.getLong("value")))

Basically, it converts the RDD 'rdd1' into another RDD 'rdd2', which stores the data in key-value pair form.

Note that the source data comes from Cassandra, 'key1' is part of a composite key, and 'value' is the value column.

How can I convert this into Java so that I get a JavaPairRDD<String, Long> using the Spark Java API? I have already generated a cassandraRowsRDD successfully from the Java code below:

    // Read the table and turn each CassandraRow into its String representation
    JavaRDD<String> cassandraRowsRDD = javaFunctions(sc).cassandraTable("player", "playerinfo")
            .map(new Function<CassandraRow, String>() {
                @Override
                public String call(CassandraRow cassandraRow) throws Exception {
                    return cassandraRow.toString();
                }
            });

Upvotes: 0

Views: 1170

Answers (1)

suztomo

Reputation: 5202

CassandraJavaRDD inherits the mapToPair method, so you can call it to get a key-value pair RDD in Java.

    // key1 becomes the key, the Long column "value" becomes the value
    JavaPairRDD<String, Long> cassandraKeyValuePairs = javaFunctions(sc).cassandraTable("player", "playerinfo").mapToPair(
            new PairFunction<CassandraRow, String, Long>() {
                @Override
                public Tuple2<String, Long> call(CassandraRow row) throws Exception {
                    return new Tuple2<>(row.getString("key1"), row.getLong("value"));
                }
            }
    );
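
If you are on Java 8 or later, the same call can be written more compactly with a lambda, since PairFunction has a single abstract method (a minimal sketch of the equivalent expression):

    JavaPairRDD<String, Long> cassandraKeyValuePairs = javaFunctions(sc)
            .cassandraTable("player", "playerinfo")
            .mapToPair(row -> new Tuple2<>(row.getString("key1"), row.getLong("value")));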

You can also call mapToPair on an RDD you have already created instead of chaining it directly onto cassandraTable(); just do so before the rows are mapped to plain Strings, because your cassandraRowsRDD no longer exposes the column getters.
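
For example (a sketch; the variable names rows and pairs are hypothetical and assume Java 8):

    // Keep the CassandraRow RDD in a variable first
    CassandraJavaRDD<CassandraRow> rows =
            javaFunctions(sc).cassandraTable("player", "playerinfo");

    // The String RDD from the question can be derived from it...
    JavaRDD<String> cassandraRowsRDD = rows.map(CassandraRow::toString);

    // ...and so can the pair RDD you are after
    JavaPairRDD<String, Long> pairs =
            rows.mapToPair(row -> new Tuple2<>(row.getString("key1"), row.getLong("value")));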

Upvotes: 2
