Inefficient Database 
 Inefficient Database

About a week back I asked if anyone knew of a Critical Path Database.
I got a response from Jens Teich who told me it could be done so I
developed a version using database definitions and calculation fields.
  Basically it has a CPTasks file and a CPLinks file that lists the
tasks that are predecessors and successors.

I sent Jens a copy and he noted my approach was inefficient and that I
should do it all in a script.  I think he underestimated the
inefficiency because while the program was slow with 10 nodes, it was
unbearable with 24 nodes.  The inefficiency seems to occur when I use
the MIN and MAX aggregate functions.  Jens notes:

"Indeed the Min and Max functions are the problem. In your version
filemaker
does not know which nodes are able to be calculated. In forward
calculation
only those nodes can be calculated whose preceding ones are all
finished.
Presently you try to find them by accident. This ends up with many
useless
steps."

My problems are:
1.  Is there any way, other than following Jens's recommendation to
script the whole algorithm, of helping FM do the calculations in a more
efficient manner?

2.  When I open a layout that has the calculation fields displayed, I
get an endless rotating clock.  The only way I can get out is
Command-Option-Esc, and then FM says all licences are in use and I have
to reboot to get going again.  Any solutions to resetting the licence
count?



Wed, 09 Feb 2005 23:46:59 GMT
 Inefficient Database

I for one have no idea what you mean by Critical Path Database, nor what is
meant by 'nodes'.

Min() and Max() are generally applied across relationships, which makes
them unstorable, which, in turn, forces them to re-calculate whenever
needed. They are evaluated wherever they appear on a layout. If there
are many child records, you can get the clock. Force quitting will in
turn elicit the license error message on a pre-OS X Mac, but not on OS
X or Windows (the force quit leaves shutdown incomplete, and FileMaker
thinks the license is already in use).

Without knowing the specifics of the project, it is unlikely any of us
can guide you to better performance. In addition, you might provide us
with the OS's on FMS and clients, if you are using FileMaker Server, as
well as a rough idea of your file structure.

--
John Weinshel
Datagrace
Vashon Island, WA
(206) 463-1634
Associate Member, Filemaker Solutions Alliance


Quote:
> About a week back I asked if anyone knew of a Critical Path Database.
> I got a response from Jens Teich who told me it could be done so I
> developed a version using database definitions and calculation fields.
>   Basically it has a CPTasks file and a CPLinks file that lists the
> tasks that are predecessors and successors.

> I sent Jens a copy and he noted my approach was inefficient and that I
> should do it all in a script.  I think he underestimated the
> inefficiency because while the program was slow with 10 nodes, it was
> unbearable with 24 nodes.  The inefficiency seems to occur when I use
> the MIN and MAX aggregate functions.  Jens notes:

> "Indeed the Min and Max functions are the problem. In your version
> FileMaker does not know which nodes are able to be calculated. In
> forward calculation only those nodes can be calculated whose preceding
> ones are all finished. Presently you try to find them by accident.
> This ends up with many useless steps."

> My problems are:
> 1.  Is there any way, other than following Jens's recommendation to
> script the whole algorithm, of helping FM do the calculations in a
> more efficient manner?

> 2.  When I open a layout that has the calculation fields displayed, I
> get an endless rotating clock.  The only way I can get out is
> Command-Option-Esc, and then FM says all licences are in use and I
> have to reboot to get going again.  Any solutions to resetting the
> licence count?



Thu, 10 Feb 2005 00:27:21 GMT
 Inefficient Database
Thanks for posting that, John; I thought it was just me ;-)

--
Michael Stout

http://www.fmpdev.com/

Quote:
> I for one have no idea what you mean by Critical Path Database, nor what is
> meant by 'nodes'.

> Min() and Max() are generally applied across relationships, which
> makes them unstorable, which, in turn, forces them to re-calculate
> whenever needed. They are evaluated wherever they appear on a layout.
> If there are many child records, you can get the clock. Force quitting
> will in turn elicit the license error message on a pre-OS X Mac, but
> not on OS X or Windows (the force quit leaves shutdown incomplete, and
> FileMaker thinks the license is already in use).

> Without knowing the specifics of the project, it is unlikely any of us
> can guide you to better performance. In addition, you might provide us
> with the OS's on FMS and clients, if you are using FileMaker Server,
> as well as a rough idea of your file structure.

> --
> John Weinshel
> Datagrace
> Vashon Island, WA
> (206) 463-1634
> Associate Member, Filemaker Solutions Alliance



>> About a week back I asked if anyone knew of a Critical Path Database.
>> I got a response from Jens Teich who told me it could be done so I
>> developed a version using database definitions and calculation fields.
>>   Basically it has a CPTasks file and a CPLinks file that lists the
>> tasks that are predecessors and successors.

>> I sent Jens a copy and he noted my approach was inefficient and that
>> I should do it all in a script.  I think he underestimated the
>> inefficiency because while the program was slow with 10 nodes, it was
>> unbearable with 24 nodes.  The inefficiency seems to occur when I use
>> the MIN and MAX aggregate functions.  Jens notes:

>> "Indeed the Min and Max functions are the problem. In your version
>> FileMaker does not know which nodes are able to be calculated. In
>> forward calculation only those nodes can be calculated whose
>> preceding ones are all finished. Presently you try to find them by
>> accident. This ends up with many useless steps."

>> My problems are:
>> 1.  Is there any way, other than following Jens's recommendation to
>> script the whole algorithm, of helping FM do the calculations in a
>> more efficient manner?

>> 2.  When I open a layout that has the calculation fields displayed, I
>> get an endless rotating clock.  The only way I can get out is
>> Command-Option-Esc, and then FM says all licences are in use and I
>> have to reboot to get going again.  Any solutions to resetting the
>> licence count?



Thu, 10 Feb 2005 00:36:43 GMT
 Inefficient Database
Critical path problems...

A good way to solve this is to represent it as a weighted directed
acyclic graph with a single source vertex. In database form that means
representing the problem as a table of vertices (the dependencies) and
a table of the weighted edges (the durations):

vertices (v_id, crit_path_distance, predecessor_edge_id, verts_in,
verts_out, sort_order)

verts_in = count(edges1::e_id) > 0, where edges1 is a relationship
defined from vertices to edges on v_id to destination_v_id
verts_out = count(edges2::e_id) > 0, where edges2 is a relationship
defined from vertices to edges on v_id to source_v_id
sort_order =
 if verts_out and not verts_in then 1
 if verts_in and verts_out then 2
 if verts_in and not verts_out then 3
 else 0 (shouldn't be necessary... every vertex should have at least
one edge :)

edges (e_id, source_v_id, destination_v_id, weight)

(This is all pretty analogous to the CPLinks/CPTasks tables you already
set up, I'm sure.)

1) Sort the vertices table so that the vertices with no inbound edges
are first, and those with only inbound edges are last.

2) Assign a -1 distance to every vertex (dist(v) = -1), and a 0
distance to the source.

3) For each vertex v in sorted order, for each outgoing edge e(v,u):
if dist(v) + weight(e) > dist(u), set dist(u) = dist(v) + weight(e)
and the predecessor of u to v.

Keep track of the highest distance, and the v_id of the associated
vertex, in a global as you go, to locate the solution you found when
you're done.

The time complexity is O(VE), which is about as good as it's going to
get, I think.
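
If it helps, here is the same idea as a rough sketch in Python (not
FileMaker, purely to illustrate the loop; the table and field names
follow the schema above). One caveat: the three-bucket sort_order
doesn't order the "middle" vertices among themselves, so the sketch
uses a full topological ordering (Kahn's algorithm) instead:

from collections import defaultdict

def critical_path(vertices, edges):
    # vertices: list of v_id values
    # edges: list of (e_id, source_v_id, destination_v_id, weight)
    out_edges = defaultdict(list)
    in_degree = {v: 0 for v in vertices}
    for e_id, src, dst, w in edges:
        out_edges[src].append((e_id, dst, w))
        in_degree[dst] += 1

    dist = {v: -1 for v in vertices}    # step 2: -1 everywhere...
    pred = {v: None for v in vertices}
    ready = [v for v in vertices if in_degree[v] == 0]
    for v in ready:
        dist[v] = 0                     # ...and 0 at the source(s)

    while ready:                        # Kahn's algorithm: a vertex is
        v = ready.pop()                 # "ready" once all of its inbound
        for e_id, u, w in out_edges[v]: # edges have been relaxed
            if dist[v] + w > dist[u]:   # step 3: relax the edge
                dist[u] = dist[v] + w
                pred[u] = e_id
            in_degree[u] -= 1
            if in_degree[u] == 0:
                ready.append(u)

    end = max(dist, key=dist.get)       # the "global" holding the answer
    return dist[end], end, pred         # walk pred backward for the path

Done this way every edge is relaxed exactly once, so the whole pass is
O(V+E).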
To answer your specific questions:
1) No. Calculations are too static, and if you try to do anything fancy
you'll probably run into recursive definitions that FileMaker will
reject as circular (it can't know that the recursions will bottom
out... even if you know they will).

2) Force-quitting FM causes it to not release the license properly.
Rebooting is the only solution.

-Dave



Thu, 10 Feb 2005 01:25:01 GMT
 Inefficient Database
Dave, wdf you talking about? Could you explain these terms to the rest of
us?

Thanks,

John


Quote:
> Critical path problems...

> A good way to solve this is to represent it as a weighted directed
> acyclic graph with a single source vertex. In database form that means
> representing the problem as a table of vertices (the dependencies) and
> a table of the weighted edges (the durations):

> vertices (v_id, crit_path_distance, predecessor_edge_id, verts_in,
> verts_out, sort_order)

> verts_in = count(edges1::e_id) > 0, where edges1 is a relationship
> defined from vertices to edges on v_id to destination_v_id
> verts_out = count(edges2::e_id) > 0, where edges2 is a relationship
> defined from vertices to edges on v_id to source_v_id
> sort_order =
>  if verts_out and not verts_in then 1
>  if verts_in and verts_out then 2
>  if verts_in and not verts_out then 3
>  else 0 (shouldn't be necessary... every vertex should have at least
> one edge :)

> edges (e_id, source_v_id, destination_v_id, weight)

> (This is all pretty analogous to the CPLinks/CPTasks tables you
> already set up, I'm sure.)

> 1) Sort the vertices table so that the vertices with no inbound edges
> are first, and those with only inbound edges are last.

> 2) Assign a -1 distance to every vertex (dist(v) = -1), and a 0
> distance to the source.

> 3) For each vertex v in sorted order, for each outgoing edge e(v,u):
> if dist(v) + weight(e) > dist(u), set dist(u) = dist(v) + weight(e)
> and the predecessor of u to v.

> Keep track of the highest distance, and the v_id of the associated
> vertex, in a global as you go, to locate the solution you found when
> you're done.

> The time complexity is O(VE), which is about as good as it's going to
> get, I think.

> To answer your specific questions:
> 1) No. Calculations are too static, and if you try to do anything
> fancy you'll probably run into recursive definitions that FileMaker
> will reject as circular (it can't know that the recursions will bottom
> out... even if you know they will).

> 2) Force-quitting FM causes it to not release the license properly.
> Rebooting is the only solution.

> -Dave



Thu, 10 Feb 2005 02:00:39 GMT
 Inefficient Database
You're already getting some good advice.

A) Don't have any calcs that show on a viewable layout that is
accessible in real life. If you're going to use calcs, keep them on a
developer layout that you only use with a few records for checking the
accuracy of your logic.

B) Your real-world display should only be brain-dead fields that are
populated via script. Frankly, I'd test the logic in a developer file
in calcs, then move the finalized calcs into scripts in the real-world
file.

B.i) Alternatively, create a "processing" file that has your calcs, but
capture the aggregate information from your main file (making sure to
avoid any record-locking scenarios) via script, and populate your
results screen, also via script. The calcs in the "processing" file, of
course, remain hidden from the user.

Basically, FMP gives you a robust set of tools to work with, but
unfortunately, when it comes to scalability, one often has to learn the
ropes by the seat of one's pants. A solution may hum in the lab with
50, then 500, then 5000 test records, but may crawl on the client's
network with 50,000 records, ultimately requiring rethinking the
original metaphor. The thresholds for scalability vary by solution,
including platform, hardware specs, network, etc. The real world is
fraught with compromises in database solution deployment. It sounds
like you ought to keep those thick-seated pants on for a while; they'll
get some wear and tear. Good luck.

Sincerely,

        Roger E. Grodin
        REDWING FINANCIAL GROUP
        "Where OUR business is the ultimate database for YOUR business"

        ==========================================================



Thu, 10 Feb 2005 02:45:38 GMT
 Inefficient Database
It's not really database stuff... it's graph theory and computational
complexity theory. We don't get much discussion of this sort of stuff
in here... it was a nice diversion. :)

A critical path is the weightiest (or "longest") route through a weighted
directed acyclic graph (weighted DAG)....

A weighted directed acyclic graph needs to be defined:

graph... pretty familiar... a collection of vertices and a collection
of edges connecting pairs of vertices. (Let's start a street analogy...
the edges are the streets and the vertices are the intersections.)

directed graph... same as above, except the edges represent a one-way
constraint (like one-way streets instead of two-way streets). A
directed graph has *only* directed edges (although there could be two
edges between a pair of vertices, one going each way... or two going
the same way between the same pair of vertices, for that matter...)

acyclic (no cycles)... refers to a directed graph that doesn't loop
back on itself anywhere (imagine a series of streets through which you
could only move forward, with no way to get back to any place you
previously visited, no matter what route you chose). This is important
in critical path problems because otherwise the longest route through
it would be arbitrarily large, as you could just loop as often as you
want.

weighted... each edge has a weight associated with it (to continue the
streets analogy, the length of the street would be the edge weight). In
some problems it's possible for edges to have negative weights
(obviously not in our street analogy), and it's worth noting that my
solution can fail if negative weights are used. It would be trivially
easy to fix... but the particular problem the OP alluded to would be
unlikely to have negative weights, so I skipped that extra complexity.

I provided the OP a database framework and an algorithm that could be
easily scripted for determining the critical path through a weighted
DAG, using a FileMaker representation of the graph augmented with some
utility fields for the algorithm's use.

The algorithm itself is a pretty standard variation of the more common
shortest-path problem (finding the shortest route through a weighted
DAG).

One common practical use for this particular problem is to determine
the minimum length of time a complex project will take, assuming each
sub-project (or task) takes some known (or estimated) duration, cannot
start until zero or more other tasks complete, and has zero or more
other tasks depending on it. (You're probably familiar with a Gantt
chart? That's what I'm talking about.) Anyway, the relationships
between the tasks form a weighted DAG, and the critical path represents
the minimum time it will take to complete the project.

Intuitively it makes sense: the minimum time is equal to the longest
chain of events that must take place. Determining that longest chain of
events is a little complicated, though.

The bit about O(VE) is a mathematical expression which describes the
worst-case time complexity (performance) of the algorithm as being
bounded by the product of vertices and edges as the problem grows
large.

I know that isn't the clearest explanation... let me know if it still
needs more...
-Dave


Quote:
> Dave, wdf you talking about? Could you explain these terms to the rest of
> us?

> Thanks,

> John



> > Critical path problems...

> > A good way to solve this is to represent it as a weighted directed
> > acyclic graph with a single source vertex. In database form that
> > means representing the problem as a table of vertices (the
> > dependencies) and a table of the weighted edges (the durations):

> > vertices (v_id, crit_path_distance, predecessor_edge_id, verts_in,
> > verts_out, sort_order)

> > verts_in = count(edges1::e_id) > 0, where edges1 is a relationship
> > defined from vertices to edges on v_id to destination_v_id
> > verts_out = count(edges2::e_id) > 0, where edges2 is a relationship
> > defined from vertices to edges on v_id to source_v_id
> > sort_order =
> >  if verts_out and not verts_in then 1
> >  if verts_in and verts_out then 2
> >  if verts_in and not verts_out then 3
> >  else 0 (shouldn't be necessary... every vertex should have at
> > least one edge :)

> > edges (e_id, source_v_id, destination_v_id, weight)

> > (This is all pretty analogous to the CPLinks/CPTasks tables you
> > already set up, I'm sure.)

> > 1) Sort the vertices table so that the vertices with no inbound
> > edges are first, and those with only inbound edges are last.

> > 2) Assign a -1 distance to every vertex (dist(v) = -1), and a 0
> > distance to the source.

> > 3) For each vertex v in sorted order, for each outgoing edge
> > e(v,u): if dist(v) + weight(e) > dist(u), set dist(u) = dist(v) +
> > weight(e) and the predecessor of u to v.

> > Keep track of the highest distance, and the v_id of the associated
> > vertex, in a global as you go, to locate the solution you found
> > when you're done.

> > The time complexity is O(VE), which is about as good as it's going
> > to get, I think.

> > To answer your specific questions:
> > 1) No. Calculations are too static, and if you try to do anything
> > fancy you'll probably run into recursive definitions that FileMaker
> > will reject as circular (it can't know that the recursions will
> > bottom out... even if you know they will).

> > 2) Force-quitting FM causes it to not release the license properly.
> > Rebooting is the only solution.

> > -Dave



Thu, 10 Feb 2005 03:18:21 GMT
 Inefficient Database
Glad I asked.

Among the many things I don't understand is:

vertices (v_id, crit_path_distance, predecessor_edge_id, verts_in,
verts_out, sort_order)

That looks a lot like a programming expression, wherein 'vertices' is being
defined as a function with the arguments enclosed by parens, but I get lost
in translating that to the functions available in FMP, which I understand to
be Tom's goal.

I don't want to start a long thread, asking you to explain what is obviously
a whole world in itself, but I would like to understand the idea just a bit
more. If I follow, 'weighted edges' implies the edges are actually vectors,
and the weight is a measure of their force? If so, are those forces
essentially representations of how well/poorly the edge (which must be a
task) leads to the goal?

--
John Weinshel
Datagrace
Vashon Island, WA
(206) 463-1634
Associate Member, Filemaker Solutions Alliance


Quote:
> It's not really database stuff... it's graph theory and computational
> complexity theory. We don't get much discussion of this sort of stuff
> in here... it was a nice diversion. :)

> A critical path is the weightiest (or "longest") route through a
> weighted directed acyclic graph (weighted DAG)....

> A weighted directed acyclic graph needs to be defined:

> graph... pretty familiar... a collection of vertices and a collection
> of edges connecting pairs of vertices. (Let's start a street
> analogy... the edges are the streets and the vertices are the
> intersections.)

> directed graph... same as above, except the edges represent a one-way
> constraint (like one-way streets instead of two-way streets). A
> directed graph has *only* directed edges (although there could be two
> edges between a pair of vertices, one going each way... or two going
> the same way between the same pair of vertices, for that matter...)

> acyclic (no cycles)... refers to a directed graph that doesn't loop
> back on itself anywhere (imagine a series of streets through which
> you could only move forward, with no way to get back to any place you
> previously visited, no matter what route you chose). This is
> important in critical path problems because otherwise the longest
> route through it would be arbitrarily large, as you could just loop
> as often as you want.

> weighted... each edge has a weight associated with it (to continue
> the streets analogy, the length of the street would be the edge
> weight). In some problems it's possible for edges to have negative
> weights (obviously not in our street analogy), and it's worth noting
> that my solution can fail if negative weights are used. It would be
> trivially easy to fix... but the particular problem the OP alluded to
> would be unlikely to have negative weights, so I skipped that extra
> complexity.

> I provided the OP a database framework and an algorithm that could be
> easily scripted for determining the critical path through a weighted
> DAG, using a FileMaker representation of the graph augmented with
> some utility fields for the algorithm's use.

> The algorithm itself is a pretty standard variation of the more
> common shortest-path problem (finding the shortest route through a
> weighted DAG).

> One common practical use for this particular problem is to determine
> the minimum length of time a complex project will take, assuming each
> sub-project (or task) takes some known (or estimated) duration,
> cannot start until zero or more other tasks complete, and has zero or
> more other tasks depending on it. (You're probably familiar with a
> Gantt chart? That's what I'm talking about.) Anyway, the
> relationships between the tasks form a weighted DAG, and the critical
> path represents the minimum time it will take to complete the
> project.

> Intuitively it makes sense: the minimum time is equal to the longest
> chain of events that must take place. Determining that longest chain
> of events is a little complicated, though.

> The bit about O(VE) is a mathematical expression which describes the
> worst-case time complexity (performance) of the algorithm as being
> bounded by the product of vertices and edges as the problem grows
> large.

> I know that isn't the clearest explanation... let me know if it still
> needs more...
> -Dave

> > Dave, wdf you talking about? Could you explain these terms to the
> > rest of us?

> > Thanks,

> > John

> > > Critical path problems...

> > > A good way to solve this is to represent it as a weighted
> > > directed acyclic graph with a single source vertex. In database
> > > form that means representing the problem as a table of vertices
> > > (the dependencies) and a table of the weighted edges (the
> > > durations):

> > > vertices (v_id, crit_path_distance, predecessor_edge_id,
> > > verts_in, verts_out, sort_order)

> > > verts_in = count(edges1::e_id) > 0, where edges1 is a
> > > relationship defined from vertices to edges on v_id to
> > > destination_v_id
> > > verts_out = count(edges2::e_id) > 0, where edges2 is a
> > > relationship defined from vertices to edges on v_id to
> > > source_v_id
> > > sort_order =
> > >  if verts_out and not verts_in then 1
> > >  if verts_in and verts_out then 2
> > >  if verts_in and not verts_out then 3
> > >  else 0 (shouldn't be necessary... every vertex should have at
> > > least one edge :)

> > > edges (e_id, source_v_id, destination_v_id, weight)

> > > (This is all pretty analogous to the CPLinks/CPTasks tables you
> > > already set up, I'm sure.)

> > > 1) Sort the vertices table so that the vertices with no inbound
> > > edges are first, and those with only inbound edges are last.

> > > 2) Assign a -1 distance to every vertex (dist(v) = -1), and a 0
> > > distance to the source.

> > > 3) For each vertex v in sorted order, for each outgoing edge
> > > e(v,u): if dist(v) + weight(e) > dist(u), set dist(u) = dist(v) +
> > > weight(e) and the predecessor of u to v.

> > > Keep track of the highest distance, and the v_id of the
> > > associated vertex, in a global as you go, to locate the solution
> > > you found when you're done.

> > > The time complexity is O(VE), which is about as good as it's
> > > going to get, I think.

> > > To answer your specific questions:
> > > 1) No. Calculations are too static, and if you try to do anything
> > > fancy you'll probably run into recursive definitions that
> > > FileMaker will reject as circular (it can't know that the
> > > recursions will bottom out... even if you know they will).

> > > 2) Force-quitting FM causes it to not release the license
> > > properly. Rebooting is the only solution.

> > > -Dave



Thu, 10 Feb 2005 04:14:02 GMT
 Inefficient Database
Quote:

> About a week back I asked if anyone knew of a Critical Path Database.  
> I got a response from Jens Teich who told me it could be done so I
> developed a version using database definitions and calculation fields.
>   Basically it has a CPTasks file and a CPLinks file that lists the
> tasks that are predecessors and successors.

.........

Sorry for not giving a better description of Critical Path.  I am
using it to calculate the early and late start and end dates for a
network of tasks that are required to complete a project.  In my use,
the nodes are the tasks, and each task has a duration (I use days).
Each task is "linked" to zero or more predecessor tasks (work that
must be done before the task in question can proceed) and zero or
more successor tasks.  Tasks with no predecessors start the project,
and tasks with zero successors end the project.

I start out by calculating the lengths of each path of tasks from the
first to the last task to see how long the project will take.

If there is more than one path through the network, one of the paths
will be critical - it will take the longest to complete.  The other
paths will have some slack time; that is, they can be delayed without
affecting the final completion date.  I calculate the slack time by
starting at the back with the largest completion date and working to
the front to see how much slack time is on the non-critical paths.
The critical path tasks will have zero slack time.
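
In case it helps anyone picture it, this is roughly what my
calculation fields are doing, written out in Python just to show the
logic (the names here are invented; my actual fields live in the
CPTasks and CPLinks files):

def slack_times(tasks, links):
    # tasks: list of (task_id, duration)
    # links: list of (predecessor_id, successor_id)
    dur = dict(tasks)
    preds = {t: [] for t in dur}
    succs = {t: [] for t in dur}
    for p, s in links:
        preds[s].append(p)
        succs[p].append(s)

    early_finish = {}
    def ef(t):                     # forward pass: a task can start only
        if t not in early_finish:  # when every predecessor has finished
            start = max((ef(p) for p in preds[t]), default=0)
            early_finish[t] = start + dur[t]
        return early_finish[t]

    project_end = max(ef(t) for t in dur)  # longest path = project length

    late_start = {}
    def ls(t):                     # backward pass: start at the largest
        if t not in late_start:    # completion date and work to the front
            finish = min((ls(s) for s in succs[t]), default=project_end)
            late_start[t] = finish - dur[t]
        return late_start[t]

    # slack = late start minus early start; the critical path has zero
    return {t: ls(t) - (ef(t) - dur[t]) for t in dur}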

FM allowed me to do all these calculations with data definitions only.
Unfortunately, in a real-world case, it is taking too long, hence my
questions.  The extra theory provided by D. Barreto is much
appreciated, as are the suggestions by Redwing and Jens Teich.  I am
planning to buy the latest copy of Scriptology, as suggested by Jens
and recommended in other posts.

FMPro allows me to do better estimates on the work and duration of the
tasks, as well as calculating the work loads each of the resources
has.  Programs like MS Project don't provide great support for getting
good estimates.  I still use the original MacProject, as it is not
overloaded with a thousand features that simple-to-medium projects do
not require.

I am using FMPro 5.0v1 on Mac OS 9.2 and am going to install OS X 10.2
this week, so my problem of having to restart when the database runs
away should stop.

Tom



Thu, 10 Feb 2005 23:06:36 GMT
 Inefficient Database


Quote:
> Glad I asked.

> Among the many things I don't understand is:

> vertices (v_id, crit_path_distance, predecessor_edge_id, verts_in,
> verts_out, sort_order)

> That looks a lot like a programming expression, wherein 'vertices' is
> being defined as a function with the arguments enclosed by parens, but
> I get lost in translating that to the functions available in FMP,
> which I understand to be Tom's goal.

It probably wouldn't have needed explanation in any other thread :)
It's just a proposed table schema for the vertices table.

Quote:
> I don't want to start a long thread, asking you to explain what is
> obviously a whole world in itself, but I would like to understand the
> idea just a bit more. If I follow, 'weighted edges' implies the edges
> are actually vectors, and the weight is a measure of their force? If
> so, are those forces essentially representations of how well/poorly
> the edge (which must be a task) leads to the goal?

Well... the edges have direction and magnitude, so they could be called
vectors, but it's not really a representation of how well the task
leads to the goal, as every task is required to complete the goal, and
no task is more or less important on its own. It's literally the amount
of time that task will take. Perhaps an example will make it clear.

Simple Project X

Tasks:
Requirements Gathering (RG) 1 hour, requires nothing
Design Phase (DP) 6 hours, requires requirements
Coding Phase (CP) 2 hours, requires design
Debugging Phase (DB) 2 hours, requires coding
Write User Documentation (UD) 8 hours, requires design
Package Design (PD) 3 hours, requires design

Now every task must get done for this project to ship... and this
project, assuming maximum possible concurrency of tasks (according to
dependencies) is exploited, will require 15 hours. The longest chain of
events is RG(1) + DP(6) + UD(8) = 15. This is the critical path. If any
of these tasks takes extra time, the project completion will be pushed
back. The path which leads to the debugging phase, for example, is only
RG(1) + DP(6) + CP(2) + DB(2) = 11, so CP & DB have a slack time of 4
hours that they can overrun and still not delay the project (RG & DP
don't have access to this slack time because they are on the critical
path, of course).

I can't figure out a way of actually drawing the DAG for this project for
the newsgroup... but I'll give it a shot.

v1--RG(1)->v2--DP(6)->v3 -- see below (3 branches.)

v3--CP(2)->v4--DB(2)->v5

v3--UD(8)->v6

v3--PD(3)->v7

So we assign this DAG a source vertex v1, and then there are three
terminal ones (v5, v6, and v7); we know this because v5, v6, & v7 have
no outbound edges.

as a database this is represented as:
vertices (v_id)
v1
v2
v3
v4
v5
v6
v7

edges (e_id, v_source, v_dest, weight)
RG (v1, v2), 1
DP (v2,v3), 6
CP (v3,v4), 2
DB (v4,v5), 2
UD (v3,v6), 8
PD (v3,v7), 3

I then augmented the vertices table with: crit_path_distance,
predecessor_edge_id, verts_in, verts_out, and sort_order

verts_in and verts_out are used to determine the sort_order. We want to
sort it so that vertices with no inbound edges are first, and those
with no outbound edges are last.

crit_path_distance is the critical path distance to that vertex, and
predecessor_edge_id is the inbound edge that is part of that critical
path. These are filled in as the algorithm proceeds.

My example was pretty simple, but it can get much more complicated and
interconnected... even adding just a few edges:
E1 (v1,v8), 8
E2 (v8,v4), 6
E3 (v8,v3), 4
E4 (v6, v7), 6
E5 (v2, v7), 12

Makes the problem much more tedious to solve by hand.
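
For what it's worth, here is the extended example fed through the
critical_path() sketch from my earlier post (Python again, purely for
illustration; I hand-checked the answer):

vertices = ["v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8"]
edges = [
    ("RG", "v1", "v2", 1), ("DP", "v2", "v3", 6),
    ("CP", "v3", "v4", 2), ("DB", "v4", "v5", 2),
    ("UD", "v3", "v6", 8), ("PD", "v3", "v7", 3),
    ("E1", "v1", "v8", 8), ("E2", "v8", "v4", 6),
    ("E3", "v8", "v3", 4), ("E4", "v6", "v7", 6),
    ("E5", "v2", "v7", 12),
]
length, end, pred = critical_path(vertices, edges)
print(length, end)   # prints: 26 v7
# i.e. the critical path is now v1 -E1-> v8 -E3-> v3 -UD-> v6 -E4-> v7,
# for 8 + 4 + 8 + 6 = 26 hours.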

I hope that clears it up some.
-Dave



Fri, 11 Feb 2005 05:46:42 GMT
 Inefficient Database
Beautiful-- helps a lot, along with Tom's post. Thank you.

--
John Weinshel
Datagrace
Vashon Island, WA
(206) 463-1634
Associate Member, Filemaker Solutions Alliance


Quote:



> > Glad I asked.

> > Among the many things I don't understand is:

> > vertices (v_id, crit_path_distance, predecessor_edge_id, verts_in,
> > verts_out, sort_order)

> > That looks a lot like a programming expression, wherein 'vertices'
> > is being defined as a function with the arguments enclosed by
> > parens, but I get lost in translating that to the functions
> > available in FMP, which I understand to be Tom's goal.

> It probably wouldn't have needed explanation in any other thread :)
> It's just a proposed table schema for the vertices table.

> > I don't want to start a long thread, asking you to explain what is
> > obviously a whole world in itself, but I would like to understand
> > the idea just a bit more. If I follow, 'weighted edges' implies the
> > edges are actually vectors, and the weight is a measure of their
> > force? If so, are those forces essentially representations of how
> > well/poorly the edge (which must be a task) leads to the goal?

> Well... the edges have direction and magnitude, so they could be
> called vectors, but it's not really a representation of how well the
> task leads to the goal, as every task is required to complete the
> goal, and no task is more or less important on its own. It's
> literally the amount of time that task will take. Perhaps an example
> will make it clear.

> Simple Project X

> Tasks:
> Requirements Gathering (RG) 1 hour, requires nothing
> Design Phase (DP) 6 hours, requires requirements
> Coding Phase (CP) 2 hours, requires design
> Debugging Phase (DB) 2 hours, requires coding
> Write User Documentation (UD) 8 hours, requires design
> Package Design (PD) 3 hours, requires design

> Now every task must get done for this project to ship... and this
> project, assuming maximum possible concurrency of tasks (according to
> dependencies) is exploited, will require 15 hours. The longest chain
> of events is RG(1) + DP(6) + UD(8) = 15. This is the critical path.
> If any of these tasks takes extra time, the project completion will
> be pushed back. The path which leads to the debugging phase, for
> example, is only RG(1) + DP(6) + CP(2) + DB(2) = 11, so CP & DB have
> a slack time of 4 hours that they can overrun and still not delay the
> project (RG & DP don't have access to this slack time because they
> are on the critical path, of course).

> I can't figure out a way of actually drawing the DAG for this project
> for the newsgroup... but I'll give it a shot.

> v1--RG(1)->v2--DP(6)->v3 -- see below (3 branches.)

> v3--CP(2)->v4--DB(2)->v5

> v3--UD(8)->v6

> v3--PD(3)->v7

> So we assign this DAG a source vertex v1, and then there are three
> terminal ones (v5, v6, and v7); we know this because v5, v6, & v7
> have no outbound edges.

> as a database this is represented as:
> vertices (v_id)
> v1
> v2
> v3
> v4
> v5
> v6
> v7

> edges (e_id, v_source, v_dest, weight)
> RG (v1, v2), 1
> DP (v2,v3), 6
> CP (v3,v4), 2
> DB (v4,v5), 2
> UD (v3,v6), 8
> PD (v3,v7), 3

> I then augmented the vertices table with: crit_path_distance,
> predecessor_edge_id, verts_in, verts_out, and sort_order

> verts_in and verts_out are used to determine the sort_order. We want
> to sort it so that vertices with no inbound edges are first, and
> those with no outbound edges are last.

> crit_path_distance is the critical path distance to that vertex, and
> predecessor_edge_id is the inbound edge that is part of that critical
> path. These are filled in as the algorithm proceeds.

> My example was pretty simple, but it can get much more complicated
> and interconnected... even adding just a few edges:
> E1 (v1,v8), 8
> E2 (v8,v4), 6
> E3 (v8,v3), 4
> E4 (v6, v7), 6
> E5 (v2, v7), 12

> Makes the problem much more tedious to solve by hand.

> I hope that clears it up some.
> -Dave



Fri, 11 Feb 2005 06:22:20 GMT
 