Reputation: 9
I am very new to Tableau and I have an requirement to build an Data Quality Dashboard. I was wondering I can get some assistance in developing this.
Details:
1st requirement - To show a visual of all fields in the table with their appropriate counts of nulls/valid values/invalid values
Table name: Customers
Fields: - Name - Address - CustomerCategory - DOB - CustomerType
So I want three checks with counts to be done, Nulls, Valid Values, Invalid Values
The end result I would like is a stacked bar chart (x axis - count/percentage, y axis - field name). Each bar on the chart will represent one field e.g. Name or DOB...so on and will comprise of three counts of the three checks I have mentioned above and will total up to 100%. And it will also have two filters - activity_month, provider_code
What is the best way in doing this? And if I need to create a calculated fields what's the optimum way?
Option 1
Do I need to create a calculated field for each field for each check?
e.g. Name_null - Total count null values Name_invalid - Total count invalid values Name_valid - Total count valid values
CustomerCategory - Total count null values
CustomerCategory - Total count invalid values
CustomerCategory - Total count valid values
In this option this would create many calculated fields.
Option 2
Do I create three fields with their total respective counts per check for all fields:
e.g. Total_Null Total_Invalid Total_Valid
I coded this previously in SQL and brought it to Tableau data source and reliased I need row level data which is a requirement for the next visual Requirement 2
SQL example
SELECT
'CustomerCategory' AS 'field', SUM(CASE WHEN CustomerCategory IS NULL THEN 1 ELSE 0 END) AS 'null', SUM(CASE WHEN CustomerCategory NOT IN ('01', '1','02','2') THEN 1 ELSE 0 END) AS 'invalid', SUM(CASE WHEN CustomerCategory IN ('01', '1','02','2') THEN 1 ELSE 0 END) AS 'valid', der_activity_month, der_provider_code
FROM Customers
GROUP BY activity_month, provider_code
UNION
SELECT
'DOB' AS 'field', SUM(CASE WHEN DOBIS NULL THEN 1 ELSE 0 END) AS 'null', SUM(CASE WHEN DOB = '1900/01/01' THEN 1 ELSE 0 END) AS 'invalid', SUM(CASE WHEN DOB <> '1900/01/01' OR aea_initial_assessment_date IS NOT NULL THEN 1 ELSE 0 END) AS 'valid', der_activity_month, der_provider_code
FROM Customers
GROUP BY activity_month, provider_code
If I were to go with the second option how do I code to get the Total count of nulls and Total count of Invalid values and Total count of valid values in Tableau?
2nd requirement:
Show a dynamic drill down per field visual on a bar chart to show the total count of invalid and null values.
E.g.
CustomerCategory
Valid values 01, 1, 02, 2
So on the bar chart I would like to show invalid values which are not in the above valid value list. And also count of nulls
Reading the above, I would appreciate if anyone can give me the guidance how to build the dashboard correctly.
Much appreciated and thank you.
Upvotes: 0
Views: 183
Reputation: 777
1st requirement:
You can go with Option 2. Drag the Field column on to Columns shelf and Measure Values on to Rows shelf then remove the SUM(Number of Records) from Measure Values pane. Then drag the Measure Names on to Color Marks (if Measure Names is already present with Detail Marks then remove it). For filters you can drag der_activity_month and der_provider_code fields on to Filters shelf and then Select Show Filter. You can choose the Colors for Measure Names(Null, Invalid, Valid).
2nd requirement:
Make one more sheet and drag the Field column on to Columns shelf and Create Calculated Field with an expression Invalid+Null and drag this Calculated Field on to Rows shelf, make sure that it is SUM(Calculated Field) and it should be Continuous(Green Color).
For drill down, go to Worksheet on Top Menu, Actions > Add Action(choose Filter). Then under Source Sheets: Select The main sheet(1st sheet) and under Target Sheets: Select the 2nd sheet and Choose Show all values on right side then click OK.
At last to create Dashboard simply drag the 1st sheet on to Dashboard then select the filters(if they are not already selected).
Upvotes: 0