Saswat Ray
Saswat Ray

Reputation: 197

Creating REST api in Databricks which would display delta tables information to the user?

I would like to create an API endpoint in Azure Databricks that would publish delta table data to a user. All I found by searching there is no REST API to expose delta table content. Is it true?

Upvotes: 4

Views: 6377

Answers (2)

Alex Ott
Alex Ott

Reputation: 87279

You can just leverage the Databricks SQL Statement Execution API to access data on Azure Databricks. Follow the tutorial that shows how to work with that API using curl

Upvotes: 3

JayashankarGS
JayashankarGS

Reputation: 8160

In Databricks there is no REST Api to get data in delta table, but we can get data of delta table using Databricks jobs.

Follow below steps,

First, create a notebook in databricks which gets data in delta table and returns the data in Json format. Add below code in your notebook, alter accordingly to your table.

df = spark.sql("SELECT  *  FROM test")
j_son = df.toJSON()

res=""
for i in j_son.collect():
    res = res + "," + i

dbutils.notebook.exit("["+res[1:]+"]")

Then create a job using this notebook.

enter image description here

In the next page create the task name, select the path to notebook and your cluster on which it needs to be run. Then hit on create.

enter image description here

After creating you will get details as below, copy the job id.

enter image description here

This is the job in databricks provides API's for running job, getting results of that particular job.

For more information follow this documentation.`

Next create an backed server where the API request raised to databricks job. I've used javascript and expressJs framework.

In visual code open new folder and execute below command in terminal. Make sure you have nodeJs installed in your system.

npm install express axios cors

You will get the project structure as below in your folder.

enter image description here

Next, create .js file, mine it's btst.js. After creating, add below code in that file.

const  express = require('express');
const  axios = require('axios');
const  cors = require('cors');
const  app = express();

app.use(cors({
origin:'*'
}));

const  port = 5000;

// Set the access token for authentication
const  token = 'your databricks token';

// Set the base URL for the Databricks REST API
const  baseUrl = 'https://your databricks host/api/2.1/';

app.get('/runjob', async  function(req, res) {
try {
const  headers = {
'Authorization':  `Bearer ${token}`,
'Content-Type':  'application/json'
}

function  sleep(ms) {
return  new  Promise(resolve  =>  setTimeout(resolve, ms));
}

const  requestBody = {
job_id:  "1121949406005982"
};
const  response = await  axios.post(`${baseUrl}jobs/run-now/`, requestBody, { headers });
console.log(`run id : ${response.data.run_id}`)
console.log("job is running...")
const  t_res = await  axios.get(`${baseUrl}jobs/runs/get?run_id=${response.data.run_id}`,{ headers })
task_id = t_res.data.tasks[0].run_id
console.log(`task id : ${task_id}`)
await  sleep(15000)
const  d_res = await  axios.get(`${baseUrl}jobs/runs/get-output?run_id=${task_id}`,{ headers })
j_res = JSON.parse(d_res.data.notebook_output.result)
console.log(j_res)
res.send(j_res);
} catch (error) {
console.log(error)
res.status(500).send(`Error1.......${error}`);
}
});

app.listen(port, () => {
console.log(`Server listening at http://localhost:${port}`);
});

Here you paste the job id you copied earlier in databricks.

enter image description here Then execute below command in terminal.

node yourfilename.js

In my case it is

node btst.js

Then your server will be running at http://localhost:5000.

You send GET request to this url in browser or using postman at endpoint. http://localhost:5000/runjob, you will get the data in json format so you can use it anywhere. This is the below result, it takes 15 seconds to display because after sending post request to jobs it needs to run the task, it takes about 10-15 seconds. enter image description here

and data in databricks table is enter image description here

Upvotes: 3

Related Questions