Gargoyle

Reputation: 10375

Databricks and following ExternalLinks

I'm running a query statement with external links like so:

var statement = SqlStatement.Create(processor.Sql, warehouseId);
statement.Disposition = SqlStatementDisposition.EXTERNAL_LINKS;

_client = DatabricksClient.CreateClient("https://....azuredatabricks.net/", personalAccessToken);

var result = await _client.SQL.StatementExecution.Execute(statement, cancellationToken);

foreach (var link in result.Result.ExternalLinks) {
   await foreach (var row in httpClient.GetFromJsonAsAsyncEnumerable<List<string?>>(link.ExternalLink, cancellationToken)) {
      // process row
   }
}

I'm stuck now getting the "next" chunk. link.NextChunkInternalLink gives me a path like /api/2.0/sql/statements/.... What do I do with that value?

Upvotes: 0

Views: 34

Answers (1)

To fetch the next chunk of results, make another API call to the path in NextChunkInternalLink. That path is relative to your workspace URL, so the request needs the same authentication as the original statement call.

With the Databricks CLI, the call looks like this:

databricks api get "${NEXT_CHUNK_INTERNAL_LINK}" \
-o 'sql-execution-response.json' \
&& jq . 'sql-execution-response.json' \
&& export NEXT_CHUNK_INTERNAL_LINK=$(jq -r .next_chunk_internal_link 'sql-execution-response.json') \
&& echo NEXT_CHUNK_INTERNAL_LINK=$NEXT_CHUNK_INTERNAL_LINK

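Next, in your C# code, send an HTTP GET request to the NextChunkInternalLink to retrieve the next chunk of results. Because the link is a workspace-relative path, the HttpClient you use for it needs the workspace URL as its base address and your access token in the Authorization header.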

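var nextChunkLink = link.NextChunkInternalLink;
while (!string.IsNullOrEmpty(nextChunkLink))
{
    // YourResponseType is a placeholder for a class that matches the chunk JSON.
    var nextChunkResponse = await httpClient.GetFromJsonAsync<YourResponseType>(nextChunkLink, cancellationToken);
    if (nextChunkResponse is null) break;
    foreach (var row in nextChunkResponse.Result)
    {
        // Process each row here.
    }
    // A missing or empty link means this was the last chunk.
    nextChunkLink = nextChunkResponse.NextChunkInternalLink;
}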

The loop will continue fetching and processing chunks until there are no more chunks to fetch.
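
For a fuller picture, here is a minimal sketch of how the chunk chain could be followed with plain HttpClient calls. It is not taken from Microsoft.Azure.Databricks.Client: ChunkLink, ChunkResponse, and ChunkReader are hypothetical names, and the JSON field names (external_links, external_link, next_chunk_internal_link) are my assumption about the shape of the chunk response, so check them against what your workspace actually returns. It also assumes two clients: workspaceClient with BaseAddress set to the workspace URL and a Bearer token, and storageClient with no Authorization header, since the external links are presigned URLs.

```csharp
using System.Collections.Generic;
using System.Net.Http;
using System.Net.Http.Json;
using System.Runtime.CompilerServices;
using System.Text.Json.Serialization;
using System.Threading;

// Hypothetical DTOs; the JSON field names are assumptions about the chunk response.
public sealed class ChunkLink
{
    [JsonPropertyName("external_link")]
    public string? ExternalLink { get; set; }

    [JsonPropertyName("next_chunk_internal_link")]
    public string? NextChunkInternalLink { get; set; }
}

public sealed class ChunkResponse
{
    [JsonPropertyName("external_links")]
    public List<ChunkLink>? ExternalLinks { get; set; }
}

public static class ChunkReader
{
    // Yields the rows of every chunk reachable from nextInternalLink.
    // workspaceClient: BaseAddress = workspace URL, Authorization = Bearer token (assumed).
    // storageClient: no Authorization header, used only for presigned external links.
    public static async IAsyncEnumerable<List<string?>> ReadRemainingRowsAsync(
        HttpClient workspaceClient,
        HttpClient storageClient,
        string? nextInternalLink,
        [EnumeratorCancellation] CancellationToken ct = default)
    {
        while (!string.IsNullOrEmpty(nextInternalLink))
        {
            // The internal link is a workspace-relative path such as /api/2.0/sql/statements/...
            var page = await workspaceClient.GetFromJsonAsync<ChunkResponse>(nextInternalLink, ct);
            nextInternalLink = null;

            foreach (var link in page?.ExternalLinks ?? new List<ChunkLink>())
            {
                if (!string.IsNullOrEmpty(link.ExternalLink))
                {
                    // Each external link is expected to return a JSON array of rows.
                    await foreach (var row in storageClient
                        .GetFromJsonAsAsyncEnumerable<List<string?>>(link.ExternalLink, ct))
                    {
                        if (row != null) yield return row;
                    }
                }
                // Remember where the following chunk lives, if there is one.
                nextInternalLink = link.NextChunkInternalLink ?? nextInternalLink;
            }
        }
    }
}
```

You would process the first chunk as in the question, then pass link.NextChunkInternalLink to ReadRemainingRowsAsync and iterate the result with await foreach. Keeping the two clients separate avoids sending the workspace token to the storage URL.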

Reference: Using external links with Microsoft.Azure.Databricks.Client does not return all results

Upvotes: 0
