John

Reputation: 3996

How do I connect to a remote Databricks (Apache Spark) instance and write a CSV file to it from Java?

I'm trying to connect to a remote Databricks instance and write a CSV file to a specific folder in DBFS. I've found bits and pieces here and there, but I can't see how to put them together. How do I add a file to DBFS on a remote Databricks instance from a Java program running on my local machine?

I'm currently using a community instance I created from here: https://databricks.com/try-databricks

This is the URL for my instance (I'm guessing the "o=7823909094774610" identifies my instance):
https://community.cloud.databricks.com/?o=7823909094774610

Here are some of the resources I've been looking at while trying to resolve this, but I'm still not able to get off the ground:

Upvotes: 0

Views: 806

Answers (1)

Bram

Reputation: 406

You could take a look at the DBFS REST API and call it from your Java application.
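As a rough illustration, here is a minimal sketch of calling the DBFS REST API from plain Java. It uses the `POST /api/2.0/dbfs/put` endpoint, which takes the file contents base64-encoded in a JSON body and is suitable for small files (larger uploads go through the `create`/`add-block`/`close` streaming endpoints). The host, token, and target path below are placeholders you'd replace with your own workspace URL and a personal access token; the class and method names are my own, not from any Databricks SDK.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class DbfsUpload {

    // Builds the JSON request body for POST /api/2.0/dbfs/put.
    // The API requires the file contents to be base64-encoded.
    static String buildPutBody(String dbfsPath, byte[] data) {
        String encoded = Base64.getEncoder().encodeToString(data);
        return "{\"path\":\"" + dbfsPath + "\",\"contents\":\""
                + encoded + "\",\"overwrite\":true}";
    }

    // Sends the upload request. Not called from main here, since it
    // needs a live workspace and a valid personal access token.
    static int upload(String host, String token, String dbfsPath, byte[] data)
            throws IOException {
        HttpURLConnection conn = (HttpURLConnection)
                new URL(host + "/api/2.0/dbfs/put").openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Authorization", "Bearer " + token);
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        try (OutputStream os = conn.getOutputStream()) {
            os.write(buildPutBody(dbfsPath, data).getBytes(StandardCharsets.UTF_8));
        }
        return conn.getResponseCode(); // 200 on success
    }

    public static void main(String[] args) {
        // Hypothetical CSV payload; just show the request body that would be sent.
        byte[] csv = "id,name\n1,alice\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(buildPutBody("/FileStore/tables/test.csv", csv));
        // To actually upload:
        // upload("https://community.cloud.databricks.com", "<personal-access-token>",
        //        "/FileStore/tables/test.csv", csv);
    }
}
```

Note that on the Community Edition you may not be able to generate a personal access token, in which case the CLI approach below (which can also use username/password auth) is easier.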

If a Java solution is not required, then you could also take a look at the databricks-cli. After installing it with pip (pip install databricks-cli) you simply have to:

  1. Configure the CLI by running: databricks configure
  2. Copy the file to DBFS by running: databricks fs cp <source> dbfs:/<target>

Upvotes: 2
