Reputation: 1656
Consider I have this scenario:
users
table
| id | name |
|----|------|
| 1 | Joe |
attributes
table
| id | name |
|----|--------|
| 1 | Age |
| 2 | Height |
| 3 | Gender |
client_attribute
table
| id | client_id | attribute_id | value |
|----|-----------|--------------|-------|
| 1 | 1 | 1 | 45 |
Attributes could extend to several hundred variations. When querying the user and selecting a 'common' attribute column, for example 'age':
Should my code assume the name or id of the attribute? If not, should a commonly accessed attribute be placed onto the users
table directly? What if these common attributes keep extending?
My queries may pick out two or three attributes in different areas and I'm finding this table design to make things difficult ( but I can see the arguments for it ).
Upvotes: 0
Views: 666
Reputation: 108480
The illustration given is not really a "many-to-many" relationship between entities.
The tables shown are an illustration of a typical EAV (Entity-Attribute-Value) implementation.
Q: Should my code assume the name
or id
of the attribute?
A: Typically, the SQL would reference the attribute
rows by id
. But we can also make use of the name
column. Typically, the application code would handle the translation of the "selected" attributes. (We would need to dive into some examples to illustrate.)
Q: If not, should a commonly accessed attribute be placed onto the users table directly?
A: The traditional relational model would have a single row for the entity (e.g. a row in the user
table), with a separate column for each attribute: first name, last name, gender, date of birth, height, etc.
Q: What if these common attributes keep extending?
In the traditional relational model, we would add additional columns (DDL ALTER TABLE operations. With EAV model, we don't need to add columns, we'd be adding rows to the attribute
table (DML INSERT operation).
The EAV model is more complex, in that attribute values are not stored in the entity row, but as rows in a separate table, much like we would for multi-valued attributes. EAV has some advantages, as well as some significant drawbacks, as you note ...
Q: My queries may pick out two or three attributes in different areas and I'm finding this table design to make things difficult
A: The real complexity (and performance issues) come when trying to wrangle that EAV model to return a row like we'd get back in a traditional relational model; if we're trying to write a query that returns a row that looks like it came from the user
table with separate column for each attribute.
If we go EAV, then we should go full EAV, and not try to write SQL that returns resultsets that look like they are from a relational model. It is possible, but the queries get complicated.
Also, I would not store "age" attribute, since that will vary with time; a persons age is the difference between the current date and date of birth.
The EAV model holds out a promise of great flexibility and ease of use, to the uninitiated. That flexibility comes at a cost. Consider how you are going to handle the various domains of the attributes. First and last name strings can be stored as VARCHAR, but some attributes may be dates, decimal, integer. Are we going to store all attributes as VARCHAR, or should we have multiple columns, and then some kind of discriminator that tells us which datatype column the attribute should be pulled from.
Upvotes: 1
Reputation: 67
Should my code assume the name or id of the attribute?
Preferably I would opt to use the PK/Index for the join vs the name but basically both will work there is just a minor performance implication.
If not, should a commonly accessed attribute be placed onto the users table directly? What if these common attributes keep extending?
I am not seeing any issue with the design you provided or any complication in retrieving any subset of data you want.
1) Retrieve all user and attribute information:
SELECT A.name, C.name, B.value FROM users A
JOIN client_attribute B ON A.id = B.client_id
JOIN attributes C ON B.attribute_id = C.id
Output:
Joe Age 45
Joe Height 5
Joe Gender Male
Michelle Age 23
Michelle Height 4
Michelle Gender Female
2) Retrieve specific attribute ex. Age:
SELECT A.name, C.name, B.value FROM users A
JOIN client_attribute B ON A.id = B.client_id
JOIN attributes C ON B.attribute_id = C.id
where C.name = 'Age';
Preferably using id:
SELECT A.name, C.name, B.value FROM users A
JOIN client_attribute B ON A.id = B.client_id
JOIN attributes C ON B.attribute_id = C.id
where C.id = 1;
Output:
Joe Age 45
Michelle Age 23
3) Retrieve information for a specific user:
SELECT A.name, C.name, B.value FROM users A
JOIN client_attribute B ON A.id = B.client_id
JOIN attributes C ON B.attribute_id = C.id
where C.id = 1 AND A.name = 'Joe';
Or using id:
SELECT A.name, C.name, B.value FROM users A
JOIN client_attribute B ON A.id = B.client_id
JOIN attributes C ON B.attribute_id = C.id
where C.id = 1 AND A.id = 1;
Output:
Joe Age 45
My queries may pick out two or three attributes in different areas and I'm finding this table design to make things difficult ( but I can see the arguments for it ).
Not really sure what you mean by your queries may pick out two or three attributes in different areas you can basically write the query to specify any targeted subset of data based on your needs. Hope the above helped, if I missed something let me know.
Upvotes: 0
Reputation: 8162
You can use a checkbox for select "attributes", the value of that will be ID and in the query you will use "WHERE attributes in ("$variable")
$variable will cointain all id of what you want select.
Upvotes: 0