Yet again on indices... 

Ok,

I'm working on query analysis for a program in ecpg for business purposes. Look
at what I found with PG 7.2. Please be cool with my french2english processor:
I've got few bogomips in my brain dedicated to English (should have listened more in
class...):
----

line 962 (in the ecpg source..)

EXPLAIN SELECT t12_bskid, t12_pnb, t12_lne, t12_tck
FROM T12_20011231
WHERE t12_bskid >= 1  
ORDER BY t12_bskid, t12_pnb, t12_tck, t12_lne;

NOTICE:  QUERY PLAN:

Sort  (cost=3006.13..3006.13 rows=25693 width=46)
  ->  Seq Scan on t12_20011231  (cost=0.00..1124.20 rows=25693 width=46)

=> not good, table t12_20011231 has 26K tuples :-(

=> create index t12_idx_bskid_20011231 on t12_20011231 (t12_bskid);

Sort  (cost=3006.13..3006.13 rows=25693 width=46)
  ->  Seq Scan on t12_20011231  (cost=0.00..1124.20 rows=25693 width=46)

=> the statistics probably need refreshing:
$ /usr/local/pgsql/bin/vacuumdb --analyze dbks

Sort  (cost=3006.13..3006.13 rows=25693 width=46)
  ->  Seq Scan on t12_20011231  (cost=0.00..1124.20 rows=25693 width=46)

=> Uh? Seq scan cheaper than index???  

=> let's disable seqscan to read cost of index:
postgresql.conf : enable_seqscan = false

Sort  (cost=3126.79..3126.79 rows=25693 width=46)
  ->  Index Scan using t12_idx_bskid_20011231 on t12_20011231  (cost=0.00..1244.86 rows=25693 width=46)

=> Uh? The seq scan's cost is lower than the index scan's??  => mailto hackers

----
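
A quick way to read the two plans above side by side is to subtract the scan cost from the Sort total. The sketch below only redoes that arithmetic on the numbers printed by EXPLAIN. The dictionary keys and the helper name are illustrative, not anything PostgreSQL exposes. It suggests that the Sort step costs the same in both plans (the index covers only t12_bskid, so the four-column ORDER BY still needs a sort) and that the whole difference comes from the scan node.

```python
from decimal import Decimal

# Total costs copied from the two EXPLAIN outputs in the post.
seq_plan = {"sort_total": Decimal("3006.13"), "scan_total": Decimal("1124.20")}
idx_plan = {"sort_total": Decimal("3126.79"), "scan_total": Decimal("1244.86")}


def sort_overhead(plan):
    """Estimated cost the Sort node adds on top of its input scan."""
    return plan["sort_total"] - plan["scan_total"]


seq_sort = sort_overhead(seq_plan)
idx_sort = sort_overhead(idx_plan)

# Gap between the two scan nodes, and between the two complete plans.
scan_gap = idx_plan["scan_total"] - seq_plan["scan_total"]
total_gap = idx_plan["sort_total"] - seq_plan["sort_total"]

print("sort overhead (seq scan plan):  ", seq_sort)
print("sort overhead (index scan plan):", idx_sort)
print("scan cost gap:", scan_gap, " total cost gap:", total_gap)
```

These are planner estimates in arbitrary cost units, not measured times, so they say which plan the planner prefers, not which one actually runs faster.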

What's your opinion?

I should mention that this SELECT runs inside a for-loop statement.
I can hardly believe that reading 26K tuples is cheaper than reading the index, but maybe
you'll ask me about the buffers that should hold the 26K tuples?...

But just after this query, there is another one that may put data in the
buffers, kicking out the t12_20011231 data blocks...

Well, I feel a little stuck here. I'll continue with enable_seqscan = false, but
I feel bad being forced to do so... and I'm still asking myself whether this is a good
idea.

Thanks for support, best regards.

--
Jean-Paul ARGUDO

---------------------------(end of broadcast)---------------------------



Sun, 15 Aug 2004 17:52:52 GMT

Quote:
> > postgresql.conf : enable_seqscan = false
> You could just do
> set enable_seqscan to 'off'
> in sql

thanks for the tip :-)

Quote:
> > => Uh? The seq scan's cost is lower than the index scan's??  => mailto hackers
> It often is. Really.
> > What's your opinion?
> What are the real performance numbers ?

Finally, repeated testing shows that the table scan really is faster than
the index scan on this 26K-tuple table. Really impressive.

I posted another mail about Oracle vs PG results from a comparative survey I've
been working on for a month. Please read it; I feel a bit disappointed with
Oracle's 1200 tps..

Thanks for your support Hannu!

--
Jean-Paul ARGUDO




Sun, 15 Aug 2004 23:55:43 GMT

Quote:

> EXPLAIN SELECT t12_bskid, t12_pnb, t12_lne, t12_tck
> FROM T12_20011231
> WHERE t12_bskid >= 1  
> ORDER BY t12_bskid, t12_pnb, t12_tck, t12_lne;
> Sort  (cost=3006.13..3006.13 rows=25693 width=46)
>   ->  Seq Scan on t12_20011231  (cost=0.00..1124.20 rows=25693 width=46)
> => Uh? Seq scan cheaper than index???  

For that kind of query, very probably.  How much of the table is
actually selected by "WHERE t12_bskid >= 1"?

                        regards, tom lane




Mon, 16 Aug 2004 01:18:38 GMT

Quote:

> EXPLAIN SELECT t12_bskid, t12_pnb, t12_lne, t12_tck
> FROM T12_20011231
> WHERE t12_bskid >= 1
> ORDER BY t12_bskid, t12_pnb, t12_tck, t12_lne;

> NOTICE:  QUERY PLAN:

> Sort  (cost=3006.13..3006.13 rows=25693 width=46)
>   ->  Seq Scan on t12_20011231  (cost=0.00..1124.20 rows=25693 width=46)

> => not good, table t12_20011231 has 26K tuples :-(

> => create index t12_idx_bskid_20011231 on t12_20011231 (t12_bskid);

> Sort  (cost=3006.13..3006.13 rows=25693 width=46)
>   ->  Seq Scan on t12_20011231  (cost=0.00..1124.20 rows=25693 width=46)

> => the statistics probably need refreshing:
> $ /usr/local/pgsql/bin/vacuumdb --analyze dbks

> Sort  (cost=3006.13..3006.13 rows=25693 width=46)
>   ->  Seq Scan on t12_20011231  (cost=0.00..1124.20 rows=25693 width=46)

> => Uh? Seq scan cheaper than index???

> => let's disable seqscan to read cost of index:
> postgresql.conf : enable_seqscan = false

> Sort  (cost=3126.79..3126.79 rows=25693 width=46)
>   ->  Index Scan using t12_idx_bskid_20011231 on t12_20011231  (cost=0.00..1244.86 rows=25693 width=46)

> => Uh? The seq scan's cost is lower than the index scan's??  => mailto hackers

> ----

> What's your opinion?

Well, you didn't send the schema, or EXPLAIN ANALYZE results to show
which is actually faster, but...

A sequential scan *can be* faster than an index scan when a large portion of the
table is going to be read.  If the data is randomly distributed,
you eventually end up reading most or all of the table blocks anyway to get
the validity information for the rows, and you're doing it in random order,
plus you're reading parts of the index as well. How many rows are in
the table, and how many match t12_bskid >= 1?
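
To make that argument concrete, here is a rough Python sketch that counts how many distinct table blocks hold at least one matching row when matches are scattered uniformly at random. It is not PostgreSQL's cost model. The function name is made up, the figure of 100 rows per block is a pure assumption, and the 26000 rows come from the "26K tuples" mentioned in the original post. For comparison, the quoted plan estimates rows=25693, which would be close to 99% of such a table.

```python
import random


def pages_touched(total_rows, rows_per_page, selectivity, seed=0):
    """Return (distinct heap pages holding a matching row, total heap pages).

    Assumes matching rows are spread uniformly at random over the table.
    """
    rng = random.Random(seed)
    total_pages = -(-total_rows // rows_per_page)  # ceiling division
    n_match = round(total_rows * selectivity)
    matching_rows = rng.sample(range(total_rows), n_match)
    # Row i is assumed to live on page i // rows_per_page.
    distinct_pages = {row // rows_per_page for row in matching_rows}
    return len(distinct_pages), total_pages


# Assumed figures: 26000 rows ("26K tuples"), 100 rows per page (hypothetical).
for sel in (0.001, 0.01, 0.1, 0.5, 0.99):
    touched, total = pages_touched(26000, 100, sel)
    print(f"selectivity {sel:>5.1%}: {touched}/{total} heap pages visited")
```

Under these assumptions an index scan stops saving block reads well before the predicate matches the whole table. It also visits those blocks in index order rather than physical order and reads index pages on top, while a sequential scan reads each block exactly once, in order.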




Mon, 16 Aug 2004 02:24:38 GMT
 