Myriam22
Myriam22

Reputation: 1

Optimizing Incremental Data Refresh in PowerBI with a Non-Foldable M Query

I am currently working on a PowerBI project where I have a source CSV file stored in SharePoint. This file contains a history of events related to employees, with each event having a start date and an end date, along with other columns that define the employee's status (I have multiple rows for a single personID).

Problem: In PowerQuery, I performed transformations to extract the status at the end of each month, resulting in one row per end of the month for each employee. The issue is that the source file contains approximately 200,000 rows with around 45,000 distinct personIDs. Since 2021, this results in roughly 45,000 * 12 * 4 rows, making this operation very resource-intensive in PowerQuery.

Solution Attempt:

I had to change my approach. I decided to generate a file at the end of each month that contains the status of each employee as of that date and load it incrementally. For the historical data, I plan to handle it in SQL by requesting a SQL Server environment where I will process the data. I subsequently chose to implement an incremental refresh in PowerBI because the CSV file on SharePoint is generated each month with the same name and thus gets overwritten. Therefore, I reload my data each month starting from the previous month’s date.

Issue:

However, this approach does not work in PowerBI. When I import my data source from the web (CSV file on SharePoint), I encounter an error stating that the M query cannot be folded.

What can I do in this situation? If this is not feasible, what alternative solution could I propose?

Upvotes: 0

Views: 77

Answers (1)

Amira Bedhiafi
Amira Bedhiafi

Reputation: 1

I would go for using Power BI Dataflows since they support incremental refresh and can handle the transformations you want to apply. You create a dataflow that connects to your SharePoint CSV file, do what you need to do as transformations there, and enable incremental refresh in the dataflow settings. This way, the "heavy lifting" is done in the Power BI service rather than in the Power BI Desktop.

Introduction to dataflows and self-service data prep

Using incremental refresh with dataflows

Otherwise, preprocess your data with an external ETL tool (SSIS, ADF ...)

Upvotes: 0

Related Questions