Pratik Khadloya

Reputation: 12869

PairRDD from SQL

Is it possible to get a pair RDD from the SQL query below?
Each pair should have the form ((item_id, flight_id), metric1),
where item_id and flight_id are the GROUP BY columns.

SELECT
  item_id,
  flight_id,
  SUM(metric1) AS metric1
FROM mytable
GROUP BY
  item_id,
  flight_id

Upvotes: 0

Views: 48

Answers (1)

zero323

Reputation: 330093

As mentioned by eliasah, you can simply map over the query result (optionally calling rdd between the query and the map) as follows:

import org.apache.spark.sql.Row

sqlContext.sql(query).map { case Row(item_id: U, flight_id: V, metric1: T) =>
  ((item_id, flight_id), metric1) }

Here U, V and T are the data types of item_id, flight_id and metric1 respectively, sqlContext is a SQLContext instance, and query is the query from your question.
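To make the placeholder types concrete, here is a minimal sketch of the same pattern. It assumes Spark 2.x or later (where the SQLContext is obtained from a SparkSession), that item_id and flight_id are BIGINT columns, and that metric1 is a DOUBLE. The sample rows, the object name PairRddFromSql and the helper pairsFromQuery are hypothetical and only there for illustration. On these Spark versions the explicit rdd call is what yields an RDD rather than a Dataset.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Row, SQLContext, SparkSession}

object PairRddFromSql {
  // The query from the question.
  val query: String =
    """SELECT item_id, flight_id, SUM(metric1) AS metric1
      |FROM mytable
      |GROUP BY item_id, flight_id""".stripMargin

  // Hypothetical helper: runs the query and maps each Row to ((item_id, flight_id), metric1).
  // Assumes BIGINT keys and a DOUBLE metric; a null value or another column type
  // would not match the pattern and would raise a MatchError.
  def pairsFromQuery(sqlContext: SQLContext, query: String): RDD[((Long, Long), Double)] =
    sqlContext.sql(query).rdd.map {
      case Row(itemId: Long, flightId: Long, metric1: Double) =>
        ((itemId, flightId), metric1)
    }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("pair-rdd-from-sql")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample rows standing in for mytable.
    Seq((1L, 10L, 2.0), (1L, 10L, 3.0), (2L, 20L, 5.0))
      .toDF("item_id", "flight_id", "metric1")
      .createOrReplaceTempView("mytable")

    val pairs = pairsFromQuery(spark.sqlContext, query)
    pairs.collect().foreach(println)

    spark.stop()
  }
}
```

Because the result is an RDD of key/value tuples, pair operations such as reduceByKey or join are available on it directly. If the columns may contain nulls, adding a fallback case to the pattern match avoids the MatchError.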

Upvotes: 1
