parent/child relationship in the same table.
Author |
Message |
Jason Diamo #1 / 24
|
 parent/child relationship in the same table.
Hi. I'm trying to create a table where each row can be the "child" of another row in the same table. Or in other words, a row could be the "parent" of multiple "child" rows. This sounded like a one-to-many relationship to me so I added a parent column to the table. Here's a simple example table: create table test (id int, parent int, description varchar(256)) I thought that I could assume that a row is not a child if it's parent value was 0. Here's some sample data: insert into test (id, parent, description) values (1, 0, 'one') insert into test (id, parent, description) values (2, 0, 'two') insert into test (id, parent, description) values (3, 1, 'child of one') insert into test (id, parent, description) values (4, 0, 'four') insert into test (id, parent, description) values (5, 3, 'child of child of one') When I execute a simple select on the table I get my data back in the order I entered it: select * from test returns: 1, 0, 'one' 2, 0, 'two' 3, 1, 'child of one' 4, 0, 'four' 5, 3, 'child of child of one' But what I really want is to get back each row with it's children immediately following it like this: 1, 0, 'one' 3, 1, 'child of one' 5, 3, 'child of child of one' 2, 0, 'two' 4, 0, 'four' I can't think of any way to do this with a simple order by clause--there's obviously no columns that are ordered in my desired results. Is there a better way to model this so that I can get the ordering I want with a single query? Thanks, Jason.
|
Tue, 04 May 2004 12:11:34 GMT |
|
 |
Jame #2 / 24
|
 parent/child relationship in the same table.
Quote: > But what I really want is to get back each row with it's children > immediately following it like this: > 1, 0, 'one' > 3, 1, 'child of one' > 5, 3, 'child of child of one' > 2, 0, 'two' > 4, 0, 'four'
T_ParChild Id ParId Data Here is the psuedo code: Call A(0) Function A(par) rs = db.execute("Select * from T_ParChild Where ParId = " & par) While (!rs.EOF) child = rs("Child") Print par, child, rs("Data") Call A(child) Loop End An oodb can perform the above much faster than relational dbs. In a relational db, the time to resolve the relationships increases with the number of records. In an oodb, the resolution time is not only quicker but constant regardless of number of "records". (http://www.xdb1.com)
|
Thu, 06 May 2004 01:46:50 GMT |
|
 |
D.. #3 / 24
|
 parent/child relationship in the same table.
Quote: >Here is the psuedo code: >Call A(0) >Function A(par) > rs = db.execute("Select * from T_ParChild Where ParId = " & par) > While (!rs.EOF) > child = rs("Child") > Print par, child, rs("Data") > Call A(child) > Loop >End >An oodb can perform the above much faster than relational dbs. In a >relational db, the time to resolve the relationships increases with >the number of records. In an oodb, the resolution time is not only >quicker but constant regardless of number of "records".
Not quite. A relational database has a concept of "pipelining": it could answer the query above in a single pass through the table. Object folks thinking is frozen in terms of nested loops that user has to write.
|
Thu, 06 May 2004 04:25:48 GMT |
|
 |
Jason Diamo #4 / 24
|
 parent/child relationship in the same table.
Quote: > Not quite. A relational database has a concept of "pipelining": it could answer > the query above in a single pass through the table. Object folks thinking is > frozen in terms of nested loops that user has to write.
Any pointers on pipelining? Thanks, Jason.
|
Thu, 06 May 2004 11:57:02 GMT |
|
 |
Jame #5 / 24
|
 parent/child relationship in the same table.
Quote: > >Here is the psuedo code: > >Call A(0) > >Function A(par) > > rs = db.execute("Select * from T_ParChild Where ParId = " & par) > > While (!rs.EOF) > > child = rs("Child") > > Print par, child, rs("Data") > > Call A(child) > > Loop > >End > >An oodb can perform the above much faster than relational dbs. In a > >relational db, the time to resolve the relationships increases with > >the number of records. In an oodb, the resolution time is not only > >quicker but constant regardless of number of "records". > Object folks thinking is frozen in terms of nested loops > that user has to write.
If you pulled a marble out of a bag and it was black, you would say that the remaining marbles in the bag are black. What are your chances of being correct? Quote: > Object folks thinking is frozen in terms of nested loops > that user has to write.
The solution was presented using standard SQL and programming techniques, which I believe will require nested loops? Can you provide an alternate practical solution using standard SQL and programming techniques? Quote: > Not quite.
Note quite what? 1. oodb is faster at resolving parent/child relationships? 2. resolution time in rdb increases with # of records? 3. resolution time in oodb is constant regardless of # of "recs"? Quote: > A relational database has a concept of "pipelining": > it could answer the query above in a single pass through the table.
What exactly is "pipelining"? Which databases is "pipelining" available in? Does "pipelining" internally not use any nested loops? Does "pipelining" disproves any of the 3 points above? How do I use "pipelining" to create the desired solution? Would you like to benchmark "pipelining" with XDb which resolves parent/child relationships at a rate of 100 mil/sec on a 800Mhz PC?
|
Thu, 06 May 2004 13:57:24 GMT |
|
 |
Kendal #6 / 24
|
 parent/child relationship in the same table.
Quote:
> Hi. > But what I really want is to get back each row with it's children > immediately following it like this: > 1, 0, 'one' > 3, 1, 'child of one' > 5, 3, 'child of child of one' > 2, 0, 'two' > 4, 0, 'four' > I can't think of any way to do this with a simple order by > clause--there's obviously no columns that are ordered in my desired > results.
There are really only two orders in SQL - numeric and lexicographic (char, or nchar, varchar, varbinary, etc.). You're going to have to come up with a number or a character expression which reflects the order you wish. A simple way is to materialize the paths for each element. For instance, with the above id's, you can construct a "path" column: 0 0.1 0.1.3 0.1.3.5 0.2 0.4 This is an example of lexicographic order. Please note that lexicographic order is the same as the order in which one would encounter the nodes in a depth-first search. Paths are fairly easy to build. One way is to start with the root node and iteratively set each encountered node's path to its parent's path concatenated with the node id, eg update table set path = '0.' where node_id = 0 update c set path = p.path + c.node_id from table p, table c where c.parent_id = p.node_id and p.path is not null. and c.path is null Iterate the second expression until no more rows are updated (most db's let you check the "rows updated" after each statement). The number of iterations required is the number of levels in the hierarchy (i.e., not too many). Or, if you have DB2, write a recursive expression which will construct the same thing in one statement.
|
Thu, 06 May 2004 16:00:38 GMT |
|
 |
Steve Jone #7 / 24
|
 parent/child relationship in the same table.
Quote: > Would you like to benchmark "pipelining" with XDb which resolves > parent/child relationships at a rate of 100 mil/sec on a 800Mhz PC?
Not bad, one 'resolution' every 8 clock cyles. I wonder what instructions are being executed within those eight cycles to support this amazing performance.
|
Thu, 06 May 2004 22:35:25 GMT |
|
 |
Jame #8 / 24
|
 parent/child relationship in the same table.
Quote: > > Object folks thinking is frozen in terms of nested loops > > that user has to write. > If you pulled a marble out of a bag and it was black, you would say > that the remaining marbles in the bag are black. What are your chances > of being correct?
My apologies, I am guilty of this also :) An error on my part, the below claims are for resolving a child's parent and not a parent's children which was the original topic. 1. oodb is faster at resolving parent/child relationships? 2. resolution time in rdb increases with # of records? 3. resolution time in oodb is constant regardless of # of "recs"? But I still believe, in general, an oodb will resolve relationships faster. Quote: > ...resolves parent/child relationships > at a rate of 100 mil/sec on a 800Mhz PC?
Please ignore this specs as it is a number under ideal test condition and not representative of real world situations. Would you like to benchmark a rdb, with or without "pipelining", with XDb in a specific case similar to the original problem?
|
Thu, 06 May 2004 23:49:12 GMT |
|
 |
MSherr.. #9 / 24
|
 parent/child relationship in the same table.
Quote: >This sounded like a one-to-many >relationship to me so I added a parent column to the table.
Aren't one-to-many relationships defined as invoving two tables? -- Mike Sherrill Information Management Systems
|
Fri, 07 May 2004 01:20:48 GMT |
|
 |
Jame #10 / 24
|
 parent/child relationship in the same table.
Quote: > > Would you like to benchmark "pipelining" with XDb which resolves > > parent/child relationships at a rate of 100 mil/sec on a 800Mhz PC? > Not bad, one 'resolution' every 8 clock cyles. I wonder what instructions > are being executed within those eight cycles to support this amazing > performance.
Assembly, load this, load that, move this, move that, jump here, jump there :) Core XDb (without extensions for COLEDateTime support) is approximately 110 KB. With GUI and statically linked support DLLs, it is approximately 540 KB.
|
Fri, 07 May 2004 07:18:16 GMT |
|
 |
Jame #11 / 24
|
 parent/child relationship in the same table.
Quote: > >This sounded like a one-to-many > >relationship to me so I added a parent column to the table. > Aren't one-to-many relationships defined as invoving two tables?
Generally yes, but since the parent and child are nearly identical, in this special case, they can be collapsed into one table.
|
Fri, 07 May 2004 07:24:10 GMT |
|
 |
D.. #12 / 24
|
 parent/child relationship in the same table.
says... Quote: >> Not quite. A relational database has a concept of "pipelining": it could answer >> the query above in a single pass through the table. Object folks thinking is >> frozen in terms of nested loops that user has to write. >Any pointers on pipelining?
Well, pipelining is about answeing query in a single scan of the table. But before we are getting into this lets return to your question. Is it just plain hierarchy that is surfacing once in a while in this thread? For hierarchy you have plenty of solutions: 1. Recursive SQL (SQL 99 standard, implemented in DB2) 2. Oracle "connect by" 3. Incremental evaluation of transitive closure.
|
Fri, 07 May 2004 10:10:37 GMT |
|
 |
D.. #13 / 24
|
 parent/child relationship in the same table.
Quote: >> >Here is the psuedo code: >> >Call A(0) >> >Function A(par) >> > rs = db.execute("Select * from T_ParChild Where ParId = " & par) >> > While (!rs.EOF) >> > child = rs("Child") >> > Print par, child, rs("Data") >> > Call A(child) >> > Loop >> >End >> >An oodb can perform the above much faster than relational dbs. In a >> >relational db, the time to resolve the relationships increases with >> >the number of records. In an oodb, the resolution time is not only >> >quicker but constant regardless of number of "records". >> Object folks thinking is frozen in terms of nested loops >> that user has to write. >If you pulled a marble out of a bag and it was black, you would say >that the remaining marbles in the bag are black. What are your chances >of being correct?
Well, but if those folks spend most of their time arguing whether it is safe to inherit circle from ellipse, you wouldn't expect innovative query optimization technique coming from that camp, right? Quote: >The solution was presented using standard SQL and programming >techniques, which I believe will require nested loops? Can you provide >an alternate practical solution using standard SQL and programming >techniques?
You call SQL from a function that you invoke recursively. That's just one and not very efficient possibility: essentially, you coded some execution plan for recursive SQL. Note, that if some incremental evaluation system is in place (for example, materialized paths, Joe Celko nested sets, etc), the optimiser can choose a different plan. Quote: >> Not quite. >Note quite what? >1. oodb is faster at resolving parent/child relationships? >2. resolution time in rdb increases with # of records? >3. resolution time in oodb is constant regardless of # of "recs"?
What is "resolution time" may I ask? OODB is losely defined entity to draw any conclusion what database is faster today. Given that RDB has some flexibility in choosing the access path to the data, many people believe that RDBMS are faster. Even on unconventional topics like calculating transitive closure. Quote: >> A relational database has a concept of "pipelining": >> it could answer the query above in a single pass through the table. >What exactly is "pipelining"? >Which databases is "pipelining" available in? >Does "pipelining" internally not use any nested loops? >Does "pipelining" disproves any of the 3 points above? >How do I use "pipelining" to create the desired solution? >Would you like to benchmark "pipelining" with XDb which resolves >parent/child relationships at a rate of 100 mil/sec on a 800Mhz PC?
So what? If one can cook some one-user in-memory "database", no doubt, there would be a benchmark where it rocks.
|
Fri, 07 May 2004 10:51:39 GMT |
|
 |
Earl Lew #14 / 24
|
 parent/child relationship in the same table.
Quote:
> >This sounded like a one-to-many > >relationship to me so I added a parent column to the table. > Aren't one-to-many relationships defined as invoving two tables?
Yes and no. There is the concept of a one to many self join. Obviously one instance of the table needs to be aliased so you could say it is another table but it's obviously not.
|
Sat, 08 May 2004 00:39:21 GMT |
|
 |
Jason Diamo #15 / 24
|
 parent/child relationship in the same table.
Hi. Quote: > A simple way is to materialize the paths for each element. For instance, > with the above id's, you can construct a "path" column: > 0 > 0.1 > 0.1.3 > 0.1.3.5 > 0.2 > 0.4
I had exactly this same thought after asking my original question. It seems to suffer from two problems, though. The first is that I'd still like to maintain the original "numeric" ordering and to do that I would have to prefix each step in the path with zeroes which would make the paths pretty verbose which leads to my next problem: given enough descendants, I can eventually exceed the length limit on the path column. Quote: > Paths are fairly easy to build. One way is to start with the root node > and iteratively set each encountered node's path to its parent's path > concatenated with the node id, eg > update table set path = '0.' where node_id = 0 > update c set path = p.path + c.node_id > from table p, table c > where c.parent_id = p.node_id > and p.path is not null. > and c.path is null > Iterate the second expression until no more rows are updated (most db's let you > check the "rows updated" after each statement). The number of iterations > required is the number of levels in the hierarchy (i.e., not too many).
This is really interesting. I was thinking that I would just build the path when I do the first insert. Of course, I'd have to know the ID before that so couldn't use an IDENTITY column. Quote: > Or, if you have DB2, write a recursive expression which will construct > the same thing in one statement.
I'm using SQL Server but want to make sure that it's portable. Thanks, Jason.
|
Sat, 08 May 2004 00:52:55 GMT |
|
|
Page 1 of 2
|
[ 24 post ] |
|
Go to page:
[1]
[2] |
|