prakash

Reputation: 111

Upload Spark RDD to REST webservice POST method

Frankly, I'm not sure whether this feature exists, so apologies if it doesn't.

My requirement is to send Spark-analysed data to a file server on a daily basis. The file server supports file transfer through SFTP and through a REST web service POST call.

My initial thought was to save the Spark RDD to HDFS and transfer it to the file server through SFTP. I would like to know whether it is possible to upload the RDD directly by calling the REST service from the Spark driver class, without saving to HDFS. The size of the data is less than 2 MB.

Sorry for my bad English!

Upvotes: 3

Views: 2366

Answers (2)

sgvd

Reputation: 3939

There is no specific way to do that with Spark. With that kind of data size it will not be worth it to go through HDFS or another type of storage. You can collect that data in your driver's memory and send it directly. For a POST call you can just use plain old java.net.URL, which would look something like this:

import java.net.{URL, HttpURLConnection}

// The RDD you want to send
val rdd = ???

// Gather data and turn into string with newlines
val body = rdd.collect.mkString("\n")

// Open a connection
val url = new URL("http://www.example.com/resource")
val conn = url.openConnection.asInstanceOf[HttpURLConnection]

// Configure for POST request
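conn.setDoOutput(true)
conn.setRequestMethod("POST")

// Write the collected data as the request body
val os = conn.getOutputStream
os.write(body.getBytes("UTF-8"))
os.close()

// Asking for the response code waits for the server's reply
val status = conn.getResponseCode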

A much more complete discussion of using java.net.URL can be found at this question. You could also use a Scala library to handle the ugly Java stuff for you, like akka-http or Dispatch.

Upvotes: 2

Jakob Odersky

Reputation: 1461

Spark itself does not provide this functionality (it is not a general-purpose HTTP client). You might consider using an existing REST client library such as akka-http or spray, or some other Java/Scala client library.

That said, you are by no means obliged to save your data to disk before operating on it. You could for example use collect() or foreach methods on your RDD in combination with your REST client library.
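As a rough illustration of the collect() route, here is a minimal sketch that uses only the JDK's HttpURLConnection instead of a client library. The names RestUpload and postLines, the timeout values and the text/plain content type are assumptions made up for this example, not anything Spark or your file server prescribes.

```scala
import java.net.{HttpURLConnection, URL}
import java.nio.charset.StandardCharsets

// Hypothetical helper, not part of Spark or any library.
object RestUpload {
  // POSTs the lines as a newline-separated UTF-8 body and returns the HTTP status code.
  def postLines(endpoint: String, lines: Seq[String]): Int = {
    val conn = new URL(endpoint).openConnection().asInstanceOf[HttpURLConnection]
    try {
      conn.setRequestMethod("POST")
      conn.setDoOutput(true)
      conn.setConnectTimeout(10000) // milliseconds, assumed value
      conn.setReadTimeout(30000)    // milliseconds, assumed value
      conn.setRequestProperty("Content-Type", "text/plain; charset=utf-8")

      val out = conn.getOutputStream
      try out.write(lines.mkString("\n").getBytes(StandardCharsets.UTF_8))
      finally out.close()

      // Waits for the server's reply and returns its status code
      conn.getResponseCode
    } finally conn.disconnect()
  }
}
```

On the driver you would call it with the collected data, for example RestUpload.postLines("http://www.example.com/resource", rdd.collect().map(_.toString).toSeq), and check that the returned status is in the 2xx range. This only makes sense because the data is small enough to fit in driver memory; foreach would instead run on the executors, so each executor would need network access to the web service.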

Upvotes: 0
