jhodgson4
jhodgson4

Reputation: 1656

Selecting specific attributes in a Many to Many relationship

Consider I have this scenario:

users table

| id | name |
|----|------|
| 1  | Joe  |

attributes table

| id | name   |
|----|--------|
| 1  | Age    |
| 2  | Height |
| 3  | Gender |

client_attribute table

| id | client_id | attribute_id | value |
|----|-----------|--------------|-------|
| 1  | 1         | 1            | 45    |

Attributes could extend to several hundred variations. When querying the user and selecting a 'common' attribute column, for example 'age':

Should my code assume the name or id of the attribute? If not, should a commonly accessed attribute be placed onto the users table directly? What if these common attributes keep extending?

My queries may pick out two or three attributes in different areas and I'm finding this table design to make things difficult ( but I can see the arguments for it ).

Upvotes: 0

Views: 666

Answers (3)

spencer7593
spencer7593

Reputation: 108480

The illustration given is not really a "many-to-many" relationship between entities.

The tables shown are an illustration of a typical EAV (Entity-Attribute-Value) implementation.

Q: Should my code assume the name or id of the attribute?

A: Typically, the SQL would reference the attribute rows by id. But we can also make use of the name column. Typically, the application code would handle the translation of the "selected" attributes. (We would need to dive into some examples to illustrate.)

Q: If not, should a commonly accessed attribute be placed onto the users table directly?

A: The traditional relational model would have a single row for the entity (e.g. a row in the user table), with a separate column for each attribute: first name, last name, gender, date of birth, height, etc.

Q: What if these common attributes keep extending?

In the traditional relational model, we would add additional columns (DDL ALTER TABLE operations. With EAV model, we don't need to add columns, we'd be adding rows to the attribute table (DML INSERT operation).

The EAV model is more complex, in that attribute values are not stored in the entity row, but as rows in a separate table, much like we would for multi-valued attributes. EAV has some advantages, as well as some significant drawbacks, as you note ...


Q: My queries may pick out two or three attributes in different areas and I'm finding this table design to make things difficult

A: The real complexity (and performance issues) come when trying to wrangle that EAV model to return a row like we'd get back in a traditional relational model; if we're trying to write a query that returns a row that looks like it came from the user table with separate column for each attribute.

If we go EAV, then we should go full EAV, and not try to write SQL that returns resultsets that look like they are from a relational model. It is possible, but the queries get complicated.


Also, I would not store "age" attribute, since that will vary with time; a persons age is the difference between the current date and date of birth.

The EAV model holds out a promise of great flexibility and ease of use, to the uninitiated. That flexibility comes at a cost. Consider how you are going to handle the various domains of the attributes. First and last name strings can be stored as VARCHAR, but some attributes may be dates, decimal, integer. Are we going to store all attributes as VARCHAR, or should we have multiple columns, and then some kind of discriminator that tells us which datatype column the attribute should be pulled from.

Upvotes: 1

Rami
Rami

Reputation: 67

Should my code assume the name or id of the attribute?

Preferably I would opt to use the PK/Index for the join vs the name but basically both will work there is just a minor performance implication.

If not, should a commonly accessed attribute be placed onto the users table directly? What if these common attributes keep extending?

I am not seeing any issue with the design you provided or any complication in retrieving any subset of data you want.

1) Retrieve all user and attribute information:

SELECT A.name, C.name, B.value FROM users A
JOIN client_attribute B ON A.id = B.client_id
JOIN attributes C ON B.attribute_id = C.id

Output:

Joe         Age     45
Joe         Height  5
Joe         Gender  Male
Michelle    Age     23
Michelle    Height  4
Michelle    Gender  Female

2) Retrieve specific attribute ex. Age:

SELECT A.name, C.name, B.value FROM users A
JOIN client_attribute B ON A.id = B.client_id
JOIN attributes C ON B.attribute_id = C.id
where C.name = 'Age';

Preferably using id:

SELECT A.name, C.name, B.value FROM users A
JOIN client_attribute B ON A.id = B.client_id
JOIN attributes C ON B.attribute_id = C.id
where C.id = 1;

Output:

Joe         Age   45
Michelle    Age   23

3) Retrieve information for a specific user:

SELECT A.name, C.name, B.value FROM users A
JOIN client_attribute B ON A.id = B.client_id
JOIN attributes C ON B.attribute_id = C.id
where C.id = 1 AND A.name = 'Joe';

Or using id:

SELECT A.name, C.name, B.value FROM users A
JOIN client_attribute B ON A.id = B.client_id
JOIN attributes C ON B.attribute_id = C.id
where C.id = 1 AND A.id = 1;

Output:

Joe   Age   45

My queries may pick out two or three attributes in different areas and I'm finding this table design to make things difficult ( but I can see the arguments for it ).

Not really sure what you mean by your queries may pick out two or three attributes in different areas you can basically write the query to specify any targeted subset of data based on your needs. Hope the above helped, if I missed something let me know.

Upvotes: 0

Simone Rossaini
Simone Rossaini

Reputation: 8162

You can use a checkbox for select "attributes", the value of that will be ID and in the query you will use "WHERE attributes in ("$variable") $variable will cointain all id of what you want select.

Upvotes: 0

Related Questions