nobody
nobody

Reputation: 2759

Need a tip on simple MySQL db design

I am trying to make a simple item database using MySQL for a game. Here is what my 3 tables would look like

     items
     itemId | itemName 
    -------------------
     0001   | chest piece
     0002   | sword
     0003   | helmet
     attributes (attribute lookup table)
     attributeId       | attributeName
     ---------------------------------
     01             | strength
     02             | agility
     03             | intellect
     04             | defense
     05             | damage
     06             | mana
     07             | stamina
     08             | description
     09             | type
     item_attributes (junction table)
     itemId  | attributeId   | value (mixed type, bad?)
     ------------------------------------
     0001   | 01            | 35
     0001   | 03            | 14
     0001   | 09            | armor
     0001   | 08            | crafted by awesome elves

     0002   | 09            | weapon
     0002   | 05            | 200
     0002   | 02            | 15
     0002   | 08            | your average sword

     0003   | 04            | 9000
     0003   | 09            | armor
     0003   | 06            | 250

My problem with this design is that value column in item_attributes table needs to use varchar data type, since the value's data can be int, char, varchar. I think this is a bad approach because I would not be able to quickly sort my items based on particular attributes. It would also suffer performance hit when a query such as get items with attribute strength that has value between 15 and 35 is processed.

Here is my potential fix. I simply added a data_type column to the attributes table. So it would look something like this

     attributes (attribute lookup table)
     attributeId       | attributeName | data_type
     ---------------------------------------------------
     01             | strength      | int
     09             | type          | char
     08             | intellect     | varchar

Then I would add 3 more columns to item_attributes table, int, char, varchar. Here is how the new item_attributes table would look like.

     item_attributes (junction table)
     itemId  | attributeId   | value                 | int  | char | varchar
     ------------------------------------------------------------------------
     0002   | 09            | weapon                | null |weapon| null
     0002   | 05            | 200                   | 200  | null | null
     0002   | 02            | 15                    | 15   | null | null
     0002   | 08            | your average sword    | null | null | your average sword

So now if I were to sort items based on its strength attribute, I would use int column. Or search for an item based on its description, I would search the varchar column.

I still, however, believe my design is a bit weird. Now I would have to look up the data_type column in attribute table and dynamically determine which column in item_attributes table is relevant to what I am looking for.

Any inputs would be greatly appreciated.

Thanks in advance.

EDIT 11/29/2010 Here is a detailed list of my items

--------------------------------------
http://wow.allakhazam.com/ihtml?27718
Aldor Defender's Legplates
Binds when picked up
LegsPlate
802 Armor
+21 Strength
+14 Agility
+21 Stamina
Item Level 99
Equip: Improves hit rating by 14.
--------------------------------------
http://wow.allakhazam.com/ihtml?17967
Refined Scale of Onyxia
Leather
Item Level 60
--------------------------------------
http://wow.allakhazam.com/ihtml?27719
Aldor Leggings of Puissance
Binds when picked up
LegsLeather
202 Armor
+15 Agility
+21 Stamina
Item Level 99
Equip: Increases attack power by 28. 
Equip: Improves hit rating by 20.
--------------------------------------
http://wow.allakhazam.com/ihtml?5005
Emberspark Pendant
Binds when equipped
NeckMiscellaneous
+2 Stamina
+7 Spirit
Requires Level 30
Item Level 35
--------------------------------------
http://wow.allakhazam.com/ihtml?23234
Blue Bryanite of Agility
Gems
Requires Level 2
Item Level 10
+8 Agility
--------------------------------------
http://wow.allakhazam.com/ihtml?32972
Beer Goggles
Binds when picked up
Unique
HeadMiscellaneous
Item Level 10
Equip: Guaranteed by Belbi Quikswitch to make EVERYONE look attractive! 
--------------------------------------
http://wow.allakhazam.com/ihtml?41118
Gadgetzan Present
Binds when picked up
Unique
Item Level 5
"Please return to a Season Organizer"
--------------------------------------
http://wow.allakhazam.com/ihtml?6649
Searing Totem Scroll
Unique
Quest Item
Requires Level 10
Item Level 10
Use:  
--------------------------------------
http://wow.allakhazam.com/ihtml?6648
Stoneskin Totem Scroll
Unique
Quest Item
Requires Level 4
Item Level 4
Use:  
--------------------------------------
http://wow.allakhazam.com/ihtml?27864
Brian's Bryanite of Extended Cost Copying
Gems
Item Level 10
gem test enchantment
--------------------------------------

EDIT #2

  1. These 10 examples are not representative of all 35316 items data that I have collected.
    • NeckMiscellaneous means that item is in both categories of `Neck` and `Misc`.
    • Unique means the only one item can be used on character.
    • Don’t read too much into the “Action”, they are just quest description
    • When an item says `Equip: increase attack power by 28` it just means +28 attack power on the player character. It is the same as +15 agility.
  2. There are a total of 241884 one-to-many item-attribute records, so that comes about to 241884/35316 ~= 8 average attributes per item. Also the data is mined from the website into a gigantic text file. There is NO “well formed” information to identify an item’s type or category. So if the word “sword” appears on either 3rd or 4th line, it is automatically categorized as sword.
  3. The item might get changed on each new update of the game.
  4. There is no universal attribute shared amongst the item besides `name`
  5. The item data is accessible through a web app. Unclear about what you mean by bits and vectors?
  6. The regular expression is used during data mining stage to clean up the special character and search for specific keyword in order to categorize the items. Also to extract attribute name and value. For example, +15 agility would have string agility extracted as attribute name and 15 as value. (I don’t understand much about question 6 and 6.1. Slog stands for server log here? Translate regexes to SQL?)

Model Diagram

Here is an example how a query looks like

select *
from itemattributestat
where item_itemId=251

item_itemId | attribute_attributeId | value | listOrder
=======================================================
'251', '9', '0', '1'
'251', '558', '0', '2'
'251', '569', '0', '3'
'251', '4', '802', '4'
'251', '583', '21', '5'
'251', '1', '14', '6'
'251', '582', '21', '7'
'251', '556', '99', '8'
'251', '227', '14', '9'

The list order is here to keep track of which attribute should be listed first. For formatting purpose

create view itemDetail as
select Item_itemId as id, i.name as item, a.name as attribute, value
from ((itemattributestat join item as i on Item_itemId=i.itemId) 
    join attribute as a on Attribute_attributeId=a.attributeId)
order by Item_itemId asc, listOrder asc;

The above view produces the following with

select *
from itemdetail
where id=251;
id | item | attribute | value
'251', 'Aldor Defender''s Legplates', 'Binds when picked up', '0'
'251', 'Aldor Defender''s Legplates', 'Legs', '0'
'251', 'Aldor Defender''s Legplates', 'Plate', '0'
'251', 'Aldor Defender''s Legplates', 'Armor', '802'
'251', 'Aldor Defender''s Legplates', 'Strength', '21'
'251', 'Aldor Defender''s Legplates', 'Agility', '14'
'251', 'Aldor Defender''s Legplates', 'Stamina', '21'
'251', 'Aldor Defender''s Legplates', 'Item Level', '99'
'251', 'Aldor Defender''s Legplates', 'Equip: Improves hit rating by @@.', '14'

An attribute with value 0 means the attribute name represents the item type. 'Equip: Improves hit rating by @@.', '14' @@ is place holder here, a processed output on a browser will be 'Equip: Improves hit rating by 14.'

Upvotes: 2

Views: 1190

Answers (3)

PerformanceDBA
PerformanceDBA

Reputation: 33708

Why do you have an attribute table ?

Attributes are columns, not tables.

The website link tells us nothing.

The whole idea of a database is that you join the many small tables, as required, for each query, so you need to get used to that. Sure, it gives you a grid, but a short and sweet one, without Nulls. What you are trying to do is avoid tables; go with just one massive grid, which is full of Nulls.

(snip)

Do not prefix your attribute names (column names) with the table name, that is redundant. This will become clear to you when you start writing SQL which uses more than one table: then you can use the table name or an alias to prefix any column names that are ambiguous.

The exception is the PK, which is rendered fully, and used in that form wherever it is an FK.

Browse the site, and read some SQL questions.

After doing that, later on, you can think about if you wantstrength and defense to be attributes (columns) of type; or not. Et cetera.

Responses to Comments 30 Nov 10

.
Excellent, you understand your data. Right. Now I understand why you had an Attribute table.

  1. Please make sure those 10 examples are representative, I am looking at them closely.

    • Type:Gem Name:Emberspark Pendant ... Or, is NeckMiscellaneous a type ?
    • Is Unique a true ItemType ? I think Not
    • Action.Display "Please return to a Season Organizer"
    • Where are the Attrinutes for AttackPower and HitRating ?
      .
  2. How many different types of items (of 35,000) are there, ala my Product Cluster example. Another way of stating that question is, how many variations are there. I mean, meaningfully, not 3500 Items ÷ 8 Attributes ?

  3. Will the item_attributes change without a release of s/w (eg. a new Inner Strength attribute) ?

  4. Per Item, what Attributes are repeating (more than one); so far I see only Action ?

  5. It is a game, so you need a db that is tight and very fast, maybe fully memory resident, right. No Nulls. No VAR Anything. Shortest Datatypes. Never Duplicate Anything (Don't Repeat Yourself). Are you happy with bits (booleans) and vectors ?

  6. Do you need to easily translate those regexes into SQL, or are you happy with a serious slog for each (ie. once you get them working in SQL they are pretty stable and then you don't mess with it, unless you find a bug) (no sarcasm, serious question) ?

    6.1 Or maybe it is the other way round: the db is disk-resident; you load it into memory once; you run the regexes on that during gameplay; occasionally writing to disk. Therefore there is no need to translate the regexes to SQL ?

Here's a Data Model of where I am heading, this not at all certain; it will be modulated by your answers. To be clear:

  • Sixth Normal Form is The Row consists of the Primary Key and, at most, one Attribute.

  • I have drawn (6.1) not (6), because your data reinforces my belief that you need a pure 6NF Relational database

  • My Product Cluster Data Model, the better-than-EAV example, is 6NF, then Normalised again (Not in the Normal Form sense) by DataType, to reduce no of tables, which you have already seen. (EAV people usually go for one or a few gigantic tables.)

  • This is straight 5NF, with only the 2 tables on the right in 6NF.

Link to Game Data Model

Link to IDEF1X Notation for those who are unfamiliar with the Relational Modelling Standard.

Response to Edit #2 05 Dec 10

1.1. Ok, corrected.

1.2. Then IsUnique is an Indicator (boolean) for Item.

1.3. Action. I understand. So Where are you going to store it ?

1.4. NeckMiscellaneous means that item is in both categories of Neck and Misc. That means two separate Item.Name=Emberspark Pendant, each with a different Category.

.
2. and 5. So you do need fast fast memory-resident db. That's why I am trying to get you across the line, away from GridLand, into RelationalLand.
.
3. Ok, we stay with Fifth Normal Form, no need for 6NF or the Product Cluster (tables per Datatype). Sofar the Values are all Integers.
.
4. I can see additionally: Level, RequiredLevel, IsUnique, BindsPickedUp, BindsEquipped.
.
5. Bits are booleans { 0 | 1 }. Vectors are required for (Relational) projections. We will get to them later.
.
6. Ok, you've explained, You are not translating regular expressions to SQL. (Slog means hard labour). .
7. What is Category.ParentId ? Parent Category ? That has not come up before.
.
8. Attribute.GeneratedId ?

Please evaluate the Data Model (Updated). I have a few more columns, in addition to what you have in yours. If there is anything you do not understand in the Data Model, ask a specific question. You've read the Notation document, right ?

I have Action as a table, with ItemAction holding the Value:
Equip: increase attack power by 28 is Action.Name=Increase attack power by and ItemAction.Value=28.

Upvotes: 3

Larry Lustig
Larry Lustig

Reputation: 50970

You are dealing with a two common problems:

  • Entities that are similar to each other but not identical (all items have a name and description, but not necessarily an intellect).
  • A design in which you need to add attributes once the database is in production (you can pretty easily predict that at some point you'll need to add, for instance, a magic-resistance attribute to some items).

You've solved your problem by reinventing the EAV system in which you store both attribute names and values as data. And you've rediscovered some of the problems with this system (type checking, relational integrity).

In this case, I'd personally go with a solution midway between the relational one and the EAV one. I'd take the common columns and add them, as columns, to either the items table or, if items represents kinds of items, not individual ones, to the items_owners table. Those columns would include description and possibly type and in the example you gave, pretty much match up with the text columns. I'd then keep the existing layout for those attributes that are numerical ratings, making the value type int. That gives you type-checking and proper normalization across the integer attributes (you won't be storing lots of NULLs) at the expense of the occasional NULL type or description.

Upvotes: 0

David
David

Reputation: 218818

I think having the data_type column just further complicates the design. Why not simply have type and description be columns on the items table? It stands to reason that every item would have each of those values, and if it doesn't then a null would do just fine in a text column.

You can even further normalize the type by having an item_types table and the type column in items would be a numeric foreign key to that table. Might not be necessary, but might make it easier to key off of the type on the items table.

Edit: Thinking about this further, it seems like you may be trying to have your data tables match a domain model. Your items would have a series of attributes on them in the logic of the application. This is fine. Keep in mind that your application logic and your database persistence layout can be different. In fact, they should not rely on each other at all at a design level. In many small applications they will likely be the same. But there are exceptions. Code (presumably object-oriented, but not necessarily) and relational data have different designs and different limitations. De-coupling them from one another allows the developer to take advantage of their designs rather than be hindered by their limitations.

Upvotes: 2

Related Questions