Reputation: 852
In my Spark Streaming application I receive records of the following shape:
{
"timestamp": 1479740400000,
"key": "power",
"value": 50
}
I want to group by timestamp and key and aggregate the value field.
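For reference, the class I deserialize each record into looks roughly like this (a minimal sketch; the field names just mirror the JSON above):

public class DataObject implements java.io.Serializable {
    public long timestamp; // e.g. 1479740400000
    public String key;     // e.g. "power"
    public int value;      // e.g. 50
}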
Is there any way of keying by an object rather than a string? I want to do something like the following:
JavaPairDStream<AggregationKey, Integer> aggregation = data.mapToPair(
    (PairFunction<DataObject, AggregationKey, Integer>) d ->
        new Tuple2<>(new AggregationKey(d), d.value)
).reduceByKey(
    (Function2<Integer, Integer, Integer>) (value1, value2) -> value1 + value2
);
But grouping this way doesn't work in Spark: records with the same timestamp and key are never combined.
To get around this in the interim I'm keying by new AggregationKey(data).toString() instead. I don't know whether that is an acceptable solution or not.
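Concretely, the interim version keys by the string form (this assumes toString() encodes both the timestamp and the key uniquely):

JavaPairDStream<String, Integer> aggregation = data.mapToPair(
    d -> new Tuple2<>(new AggregationKey(d).toString(), d.value)
).reduceByKey((value1, value2) -> value1 + value2);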
Upvotes: 2
Views: 297
Reputation:
Any object can be used with byKey methods as long as:

- it overrides equals so that two keys built from the same timestamp and key compare as equal,
- it overrides hashCode consistently with equals, and
- it is serializable, since Spark ships keys between executors.
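Here is a minimal sketch of what AggregationKey could look like, assuming DataObject exposes the timestamp and key fields shown in the question:

import java.io.Serializable;
import java.util.Objects;

public class AggregationKey implements Serializable {
    private final long timestamp;
    private final String key;

    public AggregationKey(DataObject data) {
        this.timestamp = data.timestamp;
        this.key = data.key;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof AggregationKey)) return false;
        AggregationKey other = (AggregationKey) o;
        return timestamp == other.timestamp && Objects.equals(key, other.key);
    }

    @Override
    public int hashCode() {
        // Must be consistent with equals: equal keys hash to the same value.
        return Objects.hash(timestamp, key);
    }
}

With equals and hashCode defined this way, reduceByKey merges every record that shares the same timestamp and key, and the toString() workaround becomes unnecessary.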
Upvotes: 2