kidcapt
kidcapt

Reputation:

Hierarchical Data in MySQL

I've got a sort of tree like thing going on in my MySQL database.

I have a database that has categories, and each category has a subcat. I'm keeping all the categories in one table, so the columns are like this:

*categories table*
id | name  | parent_id
1  | Toys  | 0
2  | Dolls | 1
3  | Bikes | 1

Each item in my database is assigned to one of those categories:

*items table*
item   | category_id
barbie | 2
schwinn| 3

The problem is if someone wants to see all TOYS (the parent category) what is the best way to fetch the info from the items database? The only way I know how is to do something like

SELECT * 
FROM items 
WHERE category_id = 2 
JOIN SELECT * 
     FROM items 
     WHERE category_id = 3
     etc... 

But if I had like 10 categories under Toys, then I'd have to do this join and query 10 times.

Is there a better way to handle this?

Upvotes: 9

Views: 13015

Answers (8)

kta
kta

Reputation: 20110

I would like to share my idea with you.

Limitation of Adjacency Model: Follow the Managing Hierarchical Data in MySQL As you already described adjacency model has limitation that you have to know the level before you retrieve path.

Use Nested model: But if you convert the data structure your data structure to Nested set model then you can still use the self join to get the tree.

Converting Hierarchical model to Nested model: Now we need a tree travel algorithm to index the nested model. This can be implemented in mysql function (Sorry the conversion needs some algorithm implementation : A tree traversal algorithm. Not sure which one is best fitted).

Thanks :)

Upvotes: 0

Joshscorp
Joshscorp

Reputation: 1832

I am not familiar with MySQL, but here is the way I would do it in TSQL (SQL SERVER), maybe try to find an equivalent way of doing it in MySQL?

1) Loop through all categories to get the children for the specific item, in this case categorie id = 1

2) Filter the items to that relating to the children in the Hierarchy CTE(Common Table Expression).

With Hierarchy As
(

SELECT    id, name, parent_id
from         categories
where        id = 1
UNION ALL
SELECT      child.id, child.name, child.parent_id
from           categories child
inner join  Hierarchy parent on child.parent_id = parent.id
)
SELECT * FROM items
WHERE category_id IN 
(
   Select id
   from Hierarchy 
 )

Upvotes: 0

Jeff Meatball Yang
Jeff Meatball Yang

Reputation: 39017

You want to be given the parent ID:

So assume you are given

set @parentId = 1 /*toys*/

select 
  *
from
  Items i
inner join Categories c on c.id = i.categoryId
where
  c.parentId = @parentId

This will give you the items you want - with one major design flaw: it doesn't handle multiple levels of hierarchical categories.

Let's say you had this Categories table:

*Categories table*
id | name    | parentId
1  | Toys    | 0
2  | Dolls   | 1
3  | Bikes   | 1
4  | Models  | 2
5  | Act.Fig.| 2
6  | Mountain| 3
7  | BMX     | 3

And Items:

*items table*
item   | category_id
Barbie | 4
GIJoe  | 5
Schwinn| 6
Huffy  | 7

The only way to get all the relevant Items is do a self join:

select 
  *
from
  Items i 
inner join Categories c on c.id = i.categoryId
inner join Categories c2 on c.parentId = c2.id
where
  c2.parentId = @parentId

This pattern is not scalable - since you can have MULTIPLE levels of hierarchy.

One common way to deal with hierarchies is to build a "flattened" table: a row that links each node to ALL it's descendants.

In addition to a Categories table, you build a second table:

*CategoriesFlat table*  The Name column is here only for readability
id | name    | parentId
1  | Toys    | 1
-----------------
2  | Dolls   | 1
2  | Dolls   | 2
-----------------
4  | Models  | 1
4  | Models  | 2
4  | Models  | 4
5  | Act.Fig.| 1
5  | Act.Fig.| 2
5  | Act.Fig.| 5
-----------------
3  | Bikes   | 1
3  | Bikes   | 3
-----------------
6  | Mountain| 1
6  | Mountain| 3
6  | Mountain| 6
7  | BMX     | 1
7  | BMX     | 3
7  | BMX     | 7

So you can write:

select 
  *
from
  Items i
inner join CategoriesFlat c on c.id = i.categoryId
where
  c.parentId = @parentId

And get ALL the relevant Categories and Items.

Here's a great slideshow about SQL anti-patterns and solutions to them. (Hierarchical data in SQL is an anti-pattern, but don't be disheartened - we all run into this one)

Upvotes: 21

Reputation:

This thread may help: http://forums.mysql.com/read.php?10,32818,32818#msg-32818

What you really want is START WITH and CONNECT BY syntax, but this is only supported in Oracle, not MySQL.

Upvotes: 0

ChrisW
ChrisW

Reputation: 56083

But if I had like 10 categories under Toys, then I'd have to do this join and query 10 times. Is there a better way to handle this?

Yes, there's a way of storing data called "nested sets". It's a bit harder to insert the data, but simple to select an entire, multi-level branch using a single select statement.

Also, Celko has written a book about this subject, with a chapter about nested sets and other chapters about other methods.

Upvotes: 4

lc.
lc.

Reputation: 116438

I'm assuming you know how to get the ID number and that's not the point of the question. Also, parent_id should also be a FK referencing id, and I would use NULL for the topmost layer, not 0.

If your top-most categories have at most one level of sub-category, you can use this query to get all Toys:

SELECT *
FROM items
WHERE items.category_id IN (SELECT id FROM categories
                            WHERE categories.parent_id = 1
                            OR categories.id = 1);

If your categories can have nested sub-categories, you'll have to use a stored procedure and call it recursively. Pseudocode:

Procedure getItemsInCategory
Input: @category_id integer
Output: items rows
{
    For each item in (SELECT *
                      FROM items
                      WHERE items.category_id = @category_id):
        return the row;

    For each id in (SELECT id 
                    FROM categories
                    WHERE categories.parent_id = @category_id):
        return the rows in getItemsInCategory(id);
}

Upvotes: 3

aberrant80
aberrant80

Reputation: 12997

  1. Use the IN operator.
  2. Use a stored procedure.
  3. Optimise your tables to better reflect how they tend to get used.

Upvotes: 0

Terry G Lorber
Terry G Lorber

Reputation: 2962

Assuming you know the id of the Toys category, and nothing is in the top-level Toys category:

SELECT * FROM items WHERE category_id IN (SELECT id FROM categories WHERE parent_id = 1)

Upvotes: 0

Related Questions