vestland
vestland

Reputation: 61134

What is happening under the hood when a relationship is established between tables?

This question is not limited to Power BI, but it will help me explain my problem.

If you have more than one table in Power BI, you can establish a relationship between them by dragging a column from one table to the other like this:

enter image description here

And you can edit that relationship by clicking the occuring line:

enter image description here

And by the way, here are the structures of the two tables:

# Table1
A,B
1,abc
2,def
3,ghi
4,jkl

# Table2
A,C
1,abc
1,def
2,ghi
3,ghit

This works fine since column A in Table1 consists of unique values and can work as a primary key. And now you can head over to the Report tab, set up two tables, and slice and dice at your hearts desire either by clicking directly under A in Table1, or by introducing a slicer:

enter image description here

But the thing is that you can do that without having established a relationship between the tables. Delete the relationshiop under Relationships and go back to Report and select Home > Manage Relationships to see what I mean:

enter image description here

As the dialog box says 'There are no relationships defined yet.' But you can still subset one table by making selections in the other just like before (EDIT: This statement has been proven wrong in the answer from RADO) . I do know that you can highlight the slicer and select Format > Edit Interactions and deselect the tables associated with the slicer. But I'm still puzzled by the whole thing.

So is there something happening under the hood here that I'm not aware of? Or is the relationship between tables really defined by the very contents of the tables - in the sence that the existence of related values accross tables with the existence of a potential primary key (be it natural or synthetic) makes it possible to query them using SQL, dplyr verbs or any other form of querying techniques. And that you really do not need an explicitly defined relationship?

Or put in another way, does the establishment of a Power BI table relationship have a SQL equivalent? Perhaps like the following:

CREATE TABLE Persons (
    ID int NOT NULL,
    LastName varchar(255) NOT NULL,
    FirstName varchar(255),
    Age int,
    PRIMARY KEY (ID)
);

I'm sorry If I'm rambling a bit here, but I'm just very confused. And googling has so far only added to the confusion. So thank you for any insights!

Upvotes: 1

Views: 1499

Answers (3)

Ken Evans
Ken Evans

Reputation: 392

You asked "What is happening under the hood?" The simple answer is "Statements about relationships."

Many well meaning people draw ER diagrams and seem to either forget or be unaware of the fact that their ER diagrams are really "pictures of statements in language."

The problem is ambiguity.

Many well meaning people jump straight to ER diagrams without also expressing the logical statements on which their ER diagrams are based. In effect, this means that the person who draws the ER diagram seems to expect that the "reader" of the ER diagram will be able reconstruct the statements from which the ER diagram was drawn.

Here is an example to illustrate what I mean. My purpose is to show the linguistic basis of the "under the covers" relationship between Students and their Addresses.

So, what's under the covers is language!

A simple diagram ER Diagram 1

The statements from which the diagram is derived. The statements

A more complex diagram

Student lives at Address

The statements from which the diagram is derived.

enter image description here

Upvotes: 1

philipxy
philipxy

Reputation: 15158

In the RM (relational model) & ERM (entity-relationship model) tables represent relation(ship)s/association. Hence, relational in "RM" & relationship in "ERM".

FKs (foreign keys) get erroneously called "relationships" in pseudo-ERM methods. A SQL FK constraint says subrows appear elsewhere as PK (primary key) or UNIQUE. A DBMS uses them to disallow invalid updates & to optimize queries.

Power BI "relationships" are not FKs. They are instructions on how to build queries.

When there is a FK we do often want to join on it. So we often want a Power BI relationship when there is a FK.

Create and manage relationships in Power BI Desktop
(See also its Download PDF link for Developer.)

PS We do not need constraints to hold or be declared or be known to query. The constraints (Including PKs, FKs, UNIQUE & cardinalities) are determined by the table meanings--(characteristic) predicates--& what business situations can arise. If constraints hold then we just sometimes get fewer rows than otherwise & some query pairs always return the same results when otherwise they wouldn't.

Foreign keys are not needed to join tables!
Is there any rule of thumb to construct SQL query from a human-readable description?

PS Cross join is inner join with a TRUE condition (or no condition in some DBMSs), period. Whether there is a "relationship" aka FK is irrelevant. If the condition is FK=PK or anything else other than TRUE then it's not a cross join; otherwise it is a cross join whether or not there is a FK between the tables. It's just that we frequently want PK=FK in a condition & tools can & do use the presence of a FK towards a default condition.

CROSS JOIN vs INNER JOIN in SQL Server 2008

Upvotes: 1

RADO
RADO

Reputation: 8148

Your statement "But you can still subset one table by making selections in the other just like before" is not correct. This is a key issue here.

Relations enable propagation of filter context in Power BI. That's a very loaded phrase, and you will have to learn what it means if you plan to use Power BI. It's the most important concept to understand.

To see what I mean, you will need to write DAX measures and try to manipulate them using your tables. You will immediately see the difference when you have or don't have relations.

How the whole system works (simplified): PowerBI contains a language called "DAX". You will create measures in DAX, and PowerBI will then translate them into its internal language called xmSQL, which is a special flavor of SQL. In xmSQL, regular connection is translated into LEFT OUTER JOIN, like this:

SELECT SUM(Sales.Amount)
FROM Sales
LEFT OUTER JOIN Customer
ON Sales.Customer_Key = Customer.Customer_Key

By-directional relations are a bit more complex, but conceptually similar.

Overall, when you create relations between tables, you are telling PowerBI engine how to join the tables. The engine then also adds some optimizations to speed up the queries. Every time you execute a DAX measure, click a slicer or a visual, PowerBI generates multiple xmSQL statements in the background, executes them, and then renders their results as visuals. You can see these SQL queries with some tools such as DAX Studio.

Note that it's not strictly necessary to establish relations between tables in PowerBI. You can imitate the same behavior using DAX (programmatically), but such "virtual" relations are more complex and can be substantially slower.

Upvotes: 1

Related Questions