Reputation: 2368
I am using apache-spark to process data. Given this Scala code:
val rdd1 = sc.cassandraTable("player", "playerinfo").select("key1", "value")
val rdd2 = rdd1.map(row => (row.getString("key1"), row.getLong("value")))
Basically, it converts the RDD 'rdd1' into another RDD 'rdd2', storing the contents of 'rdd1' in key-value pair form. Note that the source data comes from Cassandra, 'key1' is part of a composite key, and 'value' is the value column.
How can I convert this into Java so that I end up with a JavaPairRDD<String, Long> using the Spark Java API? I already have a cassandraRowsRDD generated successfully from the Java code below:
JavaRDD<String> cassandraRowsRDD = javaFunctions(sc).cassandraTable("player", "playerinfo")
        .map(new Function<CassandraRow, String>() {
            @Override
            public String call(CassandraRow cassandraRow) throws Exception {
                return cassandraRow.toString();
            }
        });
Upvotes: 0
Views: 1170
Reputation: 5202
CassandraJavaRDD inherits the mapToPair method, so you can call it to get a key-value pair RDD in Java.
JavaPairRDD<String, Long> cassandraKeyValuePairs = javaFunctions(sc).cassandraTable("player", "playerinfo").mapToPair(
        new PairFunction<CassandraRow, String, Long>() {
            @Override
            public Tuple2<String, Long> call(CassandraRow row) throws Exception {
                return new Tuple2<>(row.getString("key1"), row.getLong("value"));
            }
        }
);
You can also call the function on your cassandraRowsRDD.
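A minimal sketch of that variant (assuming the spark-cassandra-connector Java API and the same "player"/"playerinfo" table; since the cassandraRowsRDD in the question was already mapped to String, this keeps the rows as CassandraRow so the columns can still be read by name):
import static com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions;

import com.datastax.spark.connector.japi.CassandraRow;
import com.datastax.spark.connector.japi.rdd.CassandraJavaRDD;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.function.PairFunction;
import scala.Tuple2;

// sc is the SparkContext already set up in the question.
// Keep the rows as CassandraRow instead of mapping them to String,
// so the key and value columns can still be read by name.
CassandraJavaRDD<CassandraRow> rows =
        javaFunctions(sc).cassandraTable("player", "playerinfo");

// mapToPair is available on the RDD itself and yields the JavaPairRDD<String, Long> directly.
JavaPairRDD<String, Long> pairs = rows.mapToPair(
        new PairFunction<CassandraRow, String, Long>() {
            @Override
            public Tuple2<String, Long> call(CassandraRow row) throws Exception {
                return new Tuple2<>(row.getString("key1"), row.getLong("value"));
            }
        });
Either way, the result is the JavaPairRDD<String, Long> you asked for.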
Upvotes: 2