PDA

View Full Version : Recursive Database Queries


ast3r3x
2008-03-08, 21:23
So I have a question about how to query a tree like structure using MySQL. I am trying to figure out how to find all the child nodes give a single node. The diagram below is a sample of how table is structured. The format is node# : parent#. I've been trying to figure out how to find all the children of 3.

http://codingalchemy.com/files/chart.png

I found something similar to what I want to do here (http://www.webinade.com/web-development/creating-recursive-sql-calls-for-tables-with-parent-child-relationships) but in MS SQL, I guess you can't have a table reference itself before it is created because…well it doesn't exist.

Does anyone have any ideas how you'd do this in MySQL without using a procedure?


If you'd like my test table to play around with…
CREATE TABLE category_hierarchy (
category INT,
parent_cat INT,
PRIMARY KEY(category, parent_cat)
);

INSERT INTO category_hierarchy VALUES (1, 0);
INSERT INTO category_hierarchy VALUES (2, 1);
INSERT INTO category_hierarchy VALUES (3, 1);
#INSERT INTO category_hierarchy VALUES (3, 10);
INSERT INTO category_hierarchy VALUES (4, 1);
INSERT INTO category_hierarchy VALUES (5, 3);
INSERT INTO category_hierarchy VALUES (5, 2);
INSERT INTO category_hierarchy VALUES (6, 3);
INSERT INTO category_hierarchy VALUES (7, 2);
INSERT INTO category_hierarchy VALUES (8, 2);
INSERT INTO category_hierarchy VALUES (9, 6);
INSERT INTO category_hierarchy VALUES (10, 6);

Brad
2008-03-08, 22:03
I'm pretty sure you can't do that without a procedure unless you have a predetermined maximum depth to traverse.

Brad
2008-03-08, 22:19
That said, here's the first non-function approach that came to mind for me. I'll see if I can think of something a little more elegant, though.

If you have a maximum depth of 1, you could do:

SELECT
DISTINCT c2.category
FROM
category_hierarchy c1
INNER JOIN category_hierarchy c2 ON (c1.category = c2.parent_cat)
WHERE
c1.category = 3

If you have a maximum depth of 2:

SELECT
DISTINCT category
FROM (
SELECT
c2.category
FROM
category_hierarchy c1
INNER JOIN category_hierarchy c2 ON (c1.category = c2.parent_cat)
WHERE
c1.category = 3
UNION
SELECT
c3.category
FROM
category_hierarchy c1
INNER JOIN category_hierarchy c2 ON (c1.category = c2.parent_cat)
INNER JOIN category_hierarchy c3 ON (c2.category = c3.parent_cat)
WHERE
c1.category = 3
) spluh

If you have a maximum depth of 3:

SELECT
DISTINCT category
FROM (
SELECT
c2.category
FROM
category_hierarchy c1
INNER JOIN category_hierarchy c2 ON (c1.category = c2.parent_cat)
WHERE
c1.category = 3
UNION
SELECT
c3.category
FROM
category_hierarchy c1
INNER JOIN category_hierarchy c2 ON (c1.category = c2.parent_cat)
INNER JOIN category_hierarchy c3 ON (c2.category = c3.parent_cat)
WHERE
c1.category = 3
UNION
SELECT
c4.category
FROM
category_hierarchy c1
INNER JOIN category_hierarchy c2 ON (c1.category = c2.parent_cat)
INNER JOIN category_hierarchy c3 ON (c2.category = c3.parent_cat)
INNER JOIN category_hierarchy c4 ON (c3.category = c4.parent_cat)
WHERE
c1.category = 3
) spluh

and so on. Keep unioning selects for deeper levels until you reach your maximum depth. In the worse case scenario, the maximum depth would be n-1 for n categories.

ast3r3x
2008-03-09, 11:55
Thanks Brad.

Blerg. Is this actually a reason to like MS SQL over MySQL?

chucker
2008-03-09, 12:09
CTEs (Common Table Expressions, i.e. what a WITH clause does) in T-SQL* are really just temporary tables, and you can simulate them by creating and filling the table manually.

They're syntactical sugar. I enjoy using them, but if that's all you're missing, you should be fine in MySQL.

*) that's the SQL dialect used by MS SQL Server.

Brad
2008-06-17, 16:56
Gah!

Today at work we were dealing with a similar situation of storing a tree structure in the database and my team's resident SQL master suggested a solution proposed by Joe Celko in "Trees and Hierarchies in SQL for Smarties" (http://www.amazon.com/Hierarchies-Smarties-Kaufmann-Management-Systems/dp/1558609202). It's brilliantly simple, but it denormalizes the data and requires you to update potentially the entire table when inserting or moving nodes.

He calls it the "Nested Set Model". http://www.intelligententerprise.com/001020/celko.jhtml

I remembered this thread and figured you might like to see this solution, ast3r3x. :)

We ended up not using this model because we realized that we only need to traverse the tree from the root node and so we could simply load the whole tree into memory (which is okay because it will likely have only ~200 nodes).