Hash Join vs Nested Loops in 7.2.1 ... 
Author Message
 Hash Join vs Nested Loops in 7.2.1 ...

Quote:

> I have a 7.2.1 query with two peculiar characteristics and wondered if
> anyone could offer some insight.
> First, my query takes 90 seconds with a hash join, but only 1 second with
> nested loops.

Probably because the EXISTS subplans only need to fetch one row from
each table they access; that's more or less an ideal case for nestloop
indexscans.  Nestloops do not scale very well to large retrieval sets,
however...

Quote:
> Second, the same query sometimes takes 10-50 seconds shortly after
> possibly a dump or other high-data-volume queries are executed, after
> which it then returns to 1 second execution time.  Getting crowded out
> of shared memory?

Sounds like it.  What shared-buffers setting are you using?  How much
RAM in the box?

Quote:
> Finally, I am inclined to turn off hash joins altogether.

That would be a remarkably foolish thing to do.  Certainly this query
is not a reason to do so; AFAICS the planner will do this one just fine
without any thumb on the scales.

                        regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command



Sat, 25 Sep 2004 13:08:58 GMT
 Hash Join vs Nested Loops in 7.2.1 ...

Quote:

>>Second, the same query sometimes takes 10-50 seconds shortly after
>>possibly a dump or other high-data-volume queries are executed, after
>>which it then returns to 1 second execution time.  Getting crowded out
>>of shared memory?

> Sounds like it.  What shared-buffers setting are you using?  How much
> RAM in the box?

shared_buffers = 256
max_fsm_relations = 500
(defaults for the rest)

RAM:  2.4GB, maybe?  Not that familiar with HPUX mem setup...

(OS: HP-UX B.11.00 U 9000/800)
$ swapinfo -mt
              Mb      Mb      Mb   PCT  START/      Mb
TYPE      AVAIL    USED    FREE  USED   LIMIT RESERVE  PRI  NAME
dev        2048     147    1901    7%       0       -    1  /dev/vg00/lvol2
reserve       -     312    -312
memory      369     351      18   95%
total      2417     810    1607   34%       -       0    -

Quote:
> AFAICS the planner will do this one just fine
> without any thumb on the scales.

How to I find the thumb?

Ed

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Sat, 25 Sep 2004 13:40:45 GMT
 Hash Join vs Nested Loops in 7.2.1 ...

Quote:


> Second, the same query sometimes takes 10-50 seconds shortly after
> possibly a dump or other high-data-volume queries are executed, after
> which it then returns to 1 second execution time.  Getting crowded out
> of shared memory?

>> Sounds like it.  What shared-buffers setting are you using?  How much
>> RAM in the box?
> shared_buffers = 256

That's not a lot --- 256*8K = 2MB.  You might try something in the low
thousands.

Quote:
> RAM:  2.4GB, maybe?  Not that familiar with HPUX mem setup...

swapinfo won't tell you anything about physical RAM.  If you poke around
in SAM, I think there's some displays in there about hardware ...

                        regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster



Sat, 25 Sep 2004 21:22:56 GMT
 Hash Join vs Nested Loops in 7.2.1 ...

Quote:

>>> Second, the same query sometimes takes 10-50 seconds shortly after
>>> possibly a dump or other high-data-volume queries are executed, after
>>> which it then returns to 1 second execution time.  Getting crowded out
>>> of shared memory?

>>>> Sounds like it.  What shared-buffers setting are you using?  How much
>>>> RAM in the box?

>>> shared_buffers = 256

>> That's not a lot --- 256*8K = 2MB.  You might try something in the low
>> thousands.

> SAM indicates 512MB of RAM.  I upped the shared buffers from 256 to
> 4096, and the hashjoin query came down from ~90 seconds to 10, still 10x
> slower than the 1-sec nested loops.  Is that a performance difference
> you'd expect between hash and nested loops on this query because of EXISTS?

What I neglected to mention was that the planner was *choosing* the slower
hashjoin plan over the much faster nested loop plan without any PGOPTIONS set
or any postgresql.conf changes to enable_*, thus the motivation for a "thumb
on the scales."  After upping the number of shared buffers, it has begun
choosing the smart plan 1-second plan, apparently after a restart, not sure.
  Thanks, Tom.

Ed

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate

message can get through to the mailing list cleanly



Sun, 26 Sep 2004 03:02:30 GMT
 
 [ 4 post ] 

 Relevant Pages 

1. NESTED LOOPS VS. HASH JOINS

2. Hash Join vs. Nested Loops

3. Hash Match / Nested Loop inner join

4. 7.1.2 query now uses hash-join instead of nested loop

5. Hash vs Nested Loop

6. Right Join / Hash Joins / Nested Joins

7. cannot switch from LOOP JOIN to MERGE/HASH JOIN

8. NEEDED: BETTER JOIN PERFORMANCE ON BIG TABLE WITH BIG INDEXES (NESTED LOOPS AND MERGE JOINS)

9. HELP: pb with a NESTED LOOPS join vs HASH JOIN...

10. Hash join -vs- merge sort join

11. Joins: Loop, Merge, Hash


 
Powered by phpBB® Forum Software