Faizal

Reputation: 363

Read Parquet file data from Azure Data Lake to Excel stored in SharePoint Online

We have a requirement to ingest data from a Parquet file in Azure Data Lake into an Excel file stored in SharePoint Online.

What would be the optimal approach to implement this? Would it be better to implement server-side CSOM, or to use other Azure tools?

Your help is much appreciated.

Upvotes: 0

Views: 2592

Answers (1)

Adam Marczak

Reputation: 2351

I would probably approach this as follows (a few components, but the least amount of custom code):

Flow

  1. Use a Logic App to copy the Parquet file to Azure Data Lake (or Blob) Storage and trigger a Data Factory pipeline
  2. The Data Factory pipeline executes a Databricks Python or Scala notebook (both can transform the data and save it to Excel as an output; see the sketch after this list)
  3. Use a Logic App to transfer the Excel file back to SharePoint
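For step 2, a minimal PySpark notebook could look like the sketch below. The mount paths and file names are placeholder assumptions, and it assumes pandas and openpyxl are installed on the cluster so the data frame can be written as .xlsx:

```python
# `spark` is predefined in a Databricks notebook session.
source = "/mnt/datalake/input/data.parquet"     # hypothetical input path
target = "/dbfs/mnt/datalake/output/data.xlsx"  # hypothetical output path

df = spark.read.parquet(source)                # read the Parquet data
df.toPandas().to_excel(target, index=False)    # write .xlsx (needs openpyxl)
```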

Because

  • Logic Apps are amazing for SharePoint operations and task orchestration.
  • Data Factory nicely orchestrates Databricks notebook execution.
  • Databricks can easily read Parquet files and just as easily save the data as Excel files, so it seems like the perfect choice.

While there are three components to wire together, in the best-case scenario it will literally take under 10 lines of code to do what you need. The rest is just 'clicking'.

You could potentially skip Data Factory and call Databricks via its REST API directly from a Logic App (a sketch of such a call is below), but why bother when there are out-of-the-box components to do that.
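If you did go that route, the Logic App's HTTP action would make a call like this hypothetical Python sketch of the Databricks Jobs "run-now" endpoint; the workspace URL, token, and job id are all placeholders:

```python
import requests

WORKSPACE = "https://<workspace>.azuredatabricks.net"  # placeholder URL
TOKEN = "<personal-access-token>"                      # placeholder token

# Trigger a pre-created job that wraps the notebook (Jobs API 2.1).
resp = requests.post(
    f"{WORKSPACE}/api/2.1/jobs/run-now",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"job_id": 123},  # placeholder job id
)
resp.raise_for_status()
print(resp.json()["run_id"])  # id you can poll to track the run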

Upvotes: 1
