On using "date 'XXX' + interval 'XXX'" vs "date 'XXX'" 
Author Message
 On using "date 'XXX' + interval 'XXX'" vs "date 'XXX'"

Hello-

In rewriting some queries I noticed a huge performance penalty when
using a "date + interval" summation in the SELECT statement, versus a
single simple "date".  It is almost as though the "date + interval" is
being calculated for each row...

Much easier to demonstrate than it is to explain (note the runtimes of
the queries, 72ms versus 482ms):

===============================================
main_v0_8=# explain analyze select count(*) from sample_tracker where
initdate>=date '2002-02-01';
NOTICE:  QUERY PLAN:

Aggregate  (cost=607.24..607.24 rows=1 width=0) (actual
time=72.08..72.08 rows=1 loops=1)
   ->  Seq Scan on sample_tracker  (cost=0.00..595.74 rows=4600 width=0)
(actual time=0.04..63.62 rows=4266 loops=1)
Total runtime: 72.20 msec

EXPLAIN

===============================================
main_v0_8=# explain analyze select count(*) from sample_tracker where
initdate>=date '2002-01-01' + interval '1 month';
NOTICE:  QUERY PLAN:

Aggregate  (cost=738.23..738.23 rows=1 width=0) (actual
time=482.49..482.49 rows=1 loops=1)
   ->  Seq Scan on sample_tracker  (cost=0.00..723.98 rows=5700 width=0)
(actual time=0.13..470.94 rows=4266 loops=1)
Total runtime: 482.62 msec

EXPLAIN

===============================================
main_v0_8=# explain ANALYZE select date '2002-01-01' + interval '1 month';
NOTICE:  QUERY PLAN:

Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.04..0.05 rows=1
loops=1)
Total runtime: 0.09 msec

===============================================
main_v0_8=# select version();
                            version
-------------------------------------------------------------
  PostgreSQL 7.2.1 on i686-pc-linux-gnu, compiled by GCC 2.96

Seems like this could be something ripe for optimization...

--

-**-*-*---*-*---*-*---*-----*-*-----*---*-*---*-----*-----*-*-----*---
  Jon Lapham
  Extracta Molculas Naturais, Rio de Janeiro, Brasil

***-*--*----*-------*------------*--------------------*---------------

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.***.com/



Fri, 05 Nov 2004 23:03:05 GMT
 On using "date 'XXX' + interval 'XXX'" vs "date 'XXX'"

Quote:

> In rewriting some queries I noticed a huge performance penalty when
> using a "date + interval" summation in the SELECT statement, versus a
> single simple "date".  It is almost as though the "date + interval" is
> being calculated for each row...

Try coercing the sum result back to a date.

It's a little easier to see what's happening in current sources:

regression=# create table sample_tracker (initdate date primary key);

regression=# set enable_seqscan TO 0;
SET
regression=# explain  select count(*) from sample_tracker where
regression-# initdate>=date '2002-02-01';
                                            QUERY PLAN
---------------------------------------------------------------------------------------------------
 Aggregate  (cost=46.33..46.33 rows=1 width=0)
   ->  Index Scan using sample_tracker_pkey on sample_tracker  (cost=0.00..45.50 rows=333 width=0)
         Index Cond: (initdate >= '2002-02-01'::date)
(3 rows)

regression=# explain  select count(*) from sample_tracker where
regression-# initdate>=date '2002-01-01' + interval '1 month';
                                          QUERY PLAN
-----------------------------------------------------------------------------------------------
 Aggregate  (cost=100000025.83..100000025.83 rows=1 width=0)
   ->  Seq Scan on sample_tracker  (cost=100000000.00..100000025.00 rows=333 width=0)
         Filter: ("timestamp"(initdate) >= '2002-02-01 00:00:00'::timestamp without time zone)
(3 rows)

regression=#

Writing date(date '2002-01-01' + interval '1 month') gets me back to
the first plan.

                        regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster



Fri, 05 Nov 2004 23:32:55 GMT
 On using "date 'XXX' + interval 'XXX'" vs "date 'XXX'"

Quote:

> Try coercing the sum result back to a date.

Yup, that does the trick.

As usual, thanks Tom.

-Jon

PS: Is this optimizable?  (ie: have the coersion be implicit in
situations like this).  It sure seems like it is a good candidate...

--

-**-*-*---*-*---*-*---*-----*-*-----*---*-*---*-----*-----*-*-----*---
  Jon Lapham
  Extracta Molculas Naturais, Rio de Janeiro, Brasil

***-*--*----*-------*------------*--------------------*---------------

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command



Sat, 06 Nov 2004 00:08:31 GMT
 On using "date 'XXX' + interval 'XXX'" vs "date 'XXX'"

Quote:
> > Try coercing the sum result back to a date.
...
> PS: Is this optimizable?  (ie: have the coersion be implicit in
> situations like this).  It sure seems like it is a good candidate...

Not at the parser or optimizer level. You *could* have had hours,
minutes, or seconds in that interval value you specified, in which case
you would be truncating to get back to date. We'd need more
infrastructure to somehow know how to optimize something like that.

Or, we could split the INTERVAL type into the (ugh) bunch-o-types
envisioned by the SQL standard. YEAR, MONTH, YEAR TO MONTH, and DAY
intervals could be converted directly to dates rather than timestamps.
This would allow the optimizer to know what the output range would be,
whereas now the range info is just used for input and output (and is
usually a don't-care internally since the other fields are zeros).

                  - Thomas

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Sat, 06 Nov 2004 01:16:35 GMT
 
 [ 4 post ] 

 Relevant Pages 

1. application error 'at "0x001ec000"'

2. BUG DTS 'use transactions' causes DTSLookups("XXX") errors (X-Post)

3. DTS 'use transactions' causes DTSLookups("XXX") errors

4. [SQLServer]Login failed for user 'xxx'.

5. Help {Encrypt N 'xxx'} Function

6. Locked on user 'XXX'

7. Formating 'Created Date"

8. The process could not bulk copy into table '"dbo"."MSmerge_contents"'. (20037)

9. The process could not bulk copy into table '"dbo"."MSmerge_contents"'.

10. The process could not bulk copy into table '"dbo"."MSmerge_genhistory"'.

11. Montreal: "ORACLE", programmeurs en 'C' *** Programmers in 'C'

12. Newline character and ado Save("xxx", adPersistXML)


 
Powered by phpBB® Forum Software