What is the best way to model an entity that can have different attributes?

Question

Let's say I am programming a MMORPG. I modeled a Entity character that can have a multitude of attributes like coating, strength, color and so on. Because I do not know these attributes in advance (what and how many of them), I thought I create an extra table for it, like so:

CREATE TABLE character (INTEGER id, VARCHAR name, INTEGER player_id);

and

CREATE TABLE attributes (INTEGER character_id, VARCHAR key, VARCHAR value);

I would then be able to introduce a multitude of new attributes. However, how would I query this construct? The query

SELECT * FROM character JOIN attributes ON character.id=attributes.character_id;

will obvioulsy only work for a single attribute. Do I have to JOIN the attributes table more than once or is there another solution?

Is there a way to have different types for the attribute.value Part? Doing it the way I am doing now would limit me to a VARCHAR representation.

user731136 · Accepted Answer

Another possibility is to use hstore instead of EAV model.

CREATE TABLE character (id INTEGER, name VARCHAR,
                        player_id INTEGER, attributes hstore);

That way you can store the attributes as a map (key - value).

insert into character (id, name, player_id, attributes)
values (1, 'test', 1, '"attribute1"=>"value1","attribute2"=>"value2"')
      ,(2, 'test', 1, '"attribute1"=>"value1","attribute3"=>"value3"');

select (each(attributes)).key, (each(attributes)).value 
from character where id = 1;

  key text    value text
  --------------------------
   attribute1   value1
   attribute2   value2

select id, attributes->'attribute3' as value 
from character WHERE exist(attributes,'attribute3');

  id    value
  ---------------
   2   "value3"

Hope this helps.

UPDATE

I made a small benchmark to compare hstore vs two tables.

CREATE OR REPLACE FUNCTION create_dummy_data()
RETURNS integer AS
$BODY$
DECLARE
   cont1       INTEGER;
   cont2       INTEGER;
   sqlInsert   VARCHAR;

BEGIN
   CREATE TABLE character (id INTEGER PRIMARY KEY
                          ,name VARCHAR
                          ,player_id INTEGER);

   CREATE TABLE attributes (character_id INTEGER
                           ,key VARCHAR
                           ,value VARCHAR
                           ,FOREIGN KEY (character_id) REFERENCES character);

   cont1 := 1;
   WHILE cont1 < 10000 LOOP
      sqlInsert := 'INSERT INTO character (id, name, player_id) VALUES (' || cont1 || ', ''character' || cont1 || ''', ' || cont1 || ');';
      EXECUTE sqlInsert;
      cont1 := cont1 + 1;
   END LOOP;

   cont1 := 1;
   WHILE cont1 < 10000 LOOP
      cont2 := 1;
      WHILE cont2 < 10 LOOP   
         sqlInsert := 'INSERT INTO attributes (character_id, key, value) VALUES (' || cont1 || ', ''key' || cont2 || ''', ' || cont2 || ');';
         EXECUTE sqlInsert;
         cont2 := cont2 + 1;
      END LOOP;       
      cont1 := cont1 + 1;
    END LOOP;

    CREATE TABLE character_hstore (id INTEGER
                                  ,name VARCHAR
                                  ,player_id INTEGER
                                  ,attributes hstore);
    cont1 := 1;
    WHILE cont1 < 10000 LOOP
       sqlInsert := 'INSERT INTO character_hstore (id, name, player_id, attributes) VALUES (' || cont1 || ', ''character' || cont1 || ''', ' || cont1 || ', ''"key1"=>"1","key2"=>"2","key3"=>"3","key4"=>"4","key5"=>"5"'');';
       EXECUTE sqlInsert;
       cont1 := cont1 + 1;
    END LOOP;   

    RETURN 1;
 END;
 $BODY$
 LANGUAGE plpgsql;

 select * from create_dummy_data();

 DROP FUNCTION create_dummy_data();

And I've got the following results:

explain analyze
SELECT ca.* 
FROM character ca
JOIN attributes at ON ca.id = at.character_id
WHERE at.value = '1';
"Hash Join  (cost=288.98..2152.77 rows=10076 width=21) (actual time=2.788..23.186 rows=9999 loops=1)"

CREATE INDEX ON attributes (value);

explain analyze
SELECT ca.* 
FROM character ca
JOIN attributes at ON ca.id = at.character_id
WHERE at.value = '1';
"Hash Join  (cost=479.33..1344.18 rows=10076 width=21) (actual time=4.330..13.537 rows=9999 loops=1)"

And using hstore:

explain analyze
SELECT * 
FROM character_hstore
WHERE attributes @> 'key1=>1';
"Seq Scan on character_hstore  (cost=0.00..278.99 rows=10 width=91) (actual time=0.012..3.530 rows=9999 loops=1)"

explain analyze
SELECT * 
FROM character_hstore
WHERE attributes->'key1' = '1';
"Seq Scan on character_hstore  (cost=0.00..303.99 rows=50 width=91) (actual time=0.016..4.806 rows=9999 loops=1)"

What is the best way to model an entity that can have different attributes?

Answers (2)

Related Questions