A.HADDAD

Reputation: 1886

Write JavaPairRDD to CSV

JavaPairRDD has a saveAsTextFile function, which saves the data in text format.

However, what I need is to save the data as a CSV file, so I can use it later with Neo4j.

My question is:

How can I save the JavaPairRDD's data in CSV format? Or is there a way to transform the RDD from:

Key   Value
Jack  [a,b,c]

to:

Key   Value
Jack  a
Jack  b
Jack  c

Upvotes: -3

Views: 581

Answers (1)

Arthur PICHOT UTRERA

Reputation: 336

You should use the flatMapValues function on your JavaPairRDD. It passes each value in the key-value pair RDD through a flatMap function without changing the keys, and it also retains the original RDD's partitioning.

If the function simply returns the value (the list) unchanged, the result contains one record per element of each input list, and every record keeps its original key.

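  // In Java (Spark 1.x/2.x, where flatMapValues takes Function<V, Iterable<U>>)
  JavaPairRDD<Object, List<String>> input = ...;
  // A List is already an Iterable, so the identity lambda is enough; casting Guava's
  // Functions.identity() to Spark's Function would fail with a ClassCastException.
  // On Spark 3.x, where an Iterator is expected, use list -> list.iterator() instead.
  JavaPairRDD<Object, String> output = input.flatMapValues(list -> list);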
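
The flattening above still leaves the CSV part of the question open. As a rough sketch of what the combined result could look like, the plain-Java class below (no Spark dependency, so it can be run on its own) mimics the flattening on the sample data and formats each pair as a "key,value" line. The class name PairCsv, the helper names toCsvLines and quote, and the use of a Map as a stand-in for the pair RDD are all assumptions made for illustration; they are not part of the Spark API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PairCsv {
    // Plain-Java stand-in for flatMapValues followed by CSV formatting:
    // emits one "key,value" line per element of each key's list.
    static List<String> toCsvLines(Map<String, List<String>> pairs) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : pairs.entrySet()) {
            for (String value : entry.getValue()) {
                lines.add(quote(entry.getKey()) + "," + quote(value));
            }
        }
        return lines;
    }

    // Minimal CSV escaping (hypothetical helper): wrap a field in double quotes
    // when it contains a comma, a double quote, or a newline.
    static String quote(String field) {
        if (field.contains(",") || field.contains("\"") || field.contains("\n")) {
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field;
    }

    public static void main(String[] args) {
        Map<String, List<String>> pairs = new LinkedHashMap<>();
        pairs.put("Jack", Arrays.asList("a", "b", "c"));
        // Prints one line per list element: Jack,a then Jack,b then Jack,c
        toCsvLines(pairs).forEach(System.out::println);
    }
}
```

In Spark itself, the same formatting could probably be applied to the flattened RDD with something like output.map(t -> t._1() + "," + t._2()).saveAsTextFile(path), where path is a placeholder. Note that saveAsTextFile writes a directory of part files rather than a single .csv file, so the parts may need to be merged before importing into Neo4j.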

Upvotes: 1
