Sambo
Sambo

Reputation: 125

Power BI: Make a line chart continuous when source contains null values (handle missing values)

This question spinned off a question I posted earlier; Custom x-axis values in Power BI

Suppose the following data set:

enter image description here

Focus on the second and third row. How can I make the line in corresponding graph hereunder be continuous and not stop in the middle?

enter image description here

In Excel I used to solve this problem by applying NA() in a formula generating data for the graph. Is there a similar solution using DAX perhaps?

Upvotes: 3

Views: 20614

Answers (1)

vestland
vestland

Reputation: 61074

The short version:


What you should do:

Unleash python and after following the steps there, insert this script (dont't worry, I've added some details here as well):

import pandas as pd

# retrieve index column
index = dataset[['YearWeek_txt']]

# subset input data for further calculations
dataset2 = dataset[['1', '2']]

# handle missing values
dataset2 = dataset2.fillna(method='ffill')

Then you will be able to set it up like this:

enter image description here

Why you should do it:

For built-in options, these are your choices as far as I know:

1. Categorical YearMonth and categorical x-axis does not work

enter image description here

2. Categorical YearMonth and continuous x-axis does not work

enter image description here

3. Numerical YearMonth and categorical x-axis does not work

enter image description here

4. Numerical YearMonth and continuous x-axis does not work

enter image description here


The details, starting with why built-in approaches fail:


1. Categorical YearMonth and categorical x-axis

I've used the following dataset that resembles the screenshot of your table:

YearWeek   1     2
201603     2.37  2.83
201606           2.55
201607     2.98 
201611     2.33  2.47
201615     2.14  2.97
201619     2.15  2.02
201623     2.33  2.8
201627     3.04  2.57
201631     2.95  2.98
201635     3.08  2.16
201639     2.50  2.42
201643     3.07  3.02
201647     2.19  2.37
201651     2.38  2.65
201703     2.50  3.02
201711     2.10  2
201715     2.76  3.04

And NO, I didn't bother manually copying ALL your data. Just your YearWeek series. The rest are random numbers between 2 and 3.

Then I set the data up as numerical for 1 and 2, and YearWeek aa type text in the Power Query Editor:

enter image description here

So this is the original setup with a table and chart like yours:

enter image description here

The data is sorted descending by YearWeek_txt:

enter image description here

And the x-axis is set up asCategorical:

enter image description here

Conclusion: Fail


2. Categorical YearMonth and numercial x-axis

With the same setup as above, you can try to change the x-axis type to Continuous:

enter image description here

But as you'll see, it just flips right back to 'Categorical', presumably because the type of YearWeek is text.

Conclusion: Fail


3. Numerical YearMonth and categorical x-axis

I've duplicated the original setup so that I've got two tables Categorical and Numerical where the type of YearWeek are text and integer, respectively:

enter image description here

So numerical YearMonth and categorical x-axis still gives you this:

enter image description here

Conclusion: Fail


4. Numerical YearMonth and continuous x-axis does not work

But now, with the same setup as above, you are able to change the x-axis type to Continuous:

enter image description here

And you'll end up with this:

enter image description here

Conclusion: LOL


And now, Python:


In the Power Query Editor, activate the Categorical table, select Transform > Run Python Script and insert the following snippet in the Run Python Script Editor:

# 'dataset' holds the input data for this script

import pandas as pd

# retrieve index column
index = dataset[['YearWeek_txt']]

# subset input data for further calculations
dataset2 = dataset[['1', '2']]

# handle missing values
dataset2 = dataset2.fillna(method='ffill')

Click OK and click Table next to dataset2 here:

enter image description here

And you'll get this (make sure that the column data types are correct):

enter image description here

As you can see, no more missing values. dataset2 = dataset2.fillna(method='ffill') has replaced all missing values with the preceding value in both columns.

Click Close&Apply to get back to the Desktop, and enjoy your table and chart with no more missing values:

enter image description here

Conclusion: Python is cool


End note:

There are a lot of details that can go wrong here with decimal points, data types and so on. Let me know how things work out for you and I'll have a look at it again if it doesn't work on your end.

Upvotes: 3

Related Questions