Reputation: 1623
I'm currently working in Azure synapse DWH and I have some theoretical questions:
How I can create relationships between tables (Dim's and Fact's) and what implications I would have If I want to create those relationships.
I read that To create a primary key, I would need to set a nonclustered table, but what that means?
Upvotes: 2
Views: 1939
Reputation: 21
Two solutions I'm aware of thus far:
Import the serverless DBO's to PowerBI; Model and Create the Dataset there. you can do massive ETL, including Foreign Key creation, and Filtering of NULL values to create primary keys for Dimensions. Aggregate data and create Fact tables, etc... Its far easier then using the Synapse GUI. Drawbacks are PBI licensing related.
Create a "Lake Database" (map as you go, great for 5 or less entities.tables.) ETL is low-code. But I'm skeptical that after 40 hours of training; I should have just learned how to scrip this in Workbook/Spark.
Do BOTH; use PowerBI to develop your model and test it. Then go back to synapse and deploy the working model as a pipeline or lake database.
Points of Clarity from the top posted solution:
Do not trust the auto-relationship of PowerBI; stay away from pre-made REFID relationships in PBI unless you know for sure this is what you want. (step 6: original poster; if ETL is correct its a 1:M)
Publishing with .PBIX has its limitations with sharing and other issues the OP mentioned. Lake Database might be the workaround if you have Tabelau, Python, or Qlik as your solution.
DataVerse is coming; and PBI Analytics as well as predictive analysis with HD Insights will be embedded into D365. You will also be able to create drag and drop dashboards. As of 08-05-2022 this is already working in its infancy; even thought they want you to go modular; with hybrid serverless setup you can STILL Pull the aggregate measures from D365 into synapse and Reverse engineer them.
Upvotes: 0
Reputation: 21
having a similar issue:
I cannot find an automated solution thus far;
I'm importing 'entities' from D365 to datalake; and it does NOT come with the relationships.
it will also NOT suggest the "Related Tables"
Introduce; ETL of 'entities' using T-SQL and Spark. Governance of:: py.spark, notebooks, Schema, linting T-SQL. orchestration of activities and pipelines, workflows. Etc...
OR For small datasets and projects:
at some point 'dataverse' may allow D365 data making this easier.
depending on your 'cost/spend' cloning all of D365 still doesn't solve your relationship needs.
Upvotes: 0
Reputation: 14379
Azure Synapse Analytics (ASA) has three engines:
None of these currently support database relationships, as at today. I suspect you mean dedicated SQL pools and just to confirm it does not support the FOREIGN KEY
syntax. Relationships is more of an OLTP concept and not common in big data platforms, which ASA is.
Therefore your options are to enforce these relationships downstream or on import to your warehouse. A common method is to identify unknown values and substitute them with a -1 / Unknown value on import. This will ensure there are no NULLs in your key columns.
Additionally, enforce your relationships downstream eg in an Azure Analysis Services tabular model or Power BI model.
If you really need relationships then depending on your data volumes you might consider Azure SQL Database which supports data volumes up to 4TB alongside columnstore indexes which give great compression.
Upvotes: 3