Logical recovery during startup takes long 
Author Message
 Logical recovery during startup takes long

We stopped our Informix server (7.31.FC6) and started it again a bit
later. The engine needed more than an hour to finish the logical
recovery. It really rolled back one transaction and the only reason I
can accept that this might have been a long one.
Questions:
1. Is it a normal behaviour? In case it really had a relative long
uncomplete transaction should it not have rolled it back at shutdown?
2. How can this be prevented? Let us suppose you have a half an hour
time-window to do a quick maintenance at your productive engine, you
shut Online down, and later you start it again. But your startup is
delayed because it needs a long, long time to make the logical recovery.
I mean it may be very difficult to explain your users...

Many thanks for any reply.
Regards
Peter Dzvonyar
SAP-Consultant R/3 BC

------

Here is the relevant part of the message log:

....
18:17:35  Checkpoint Completed:  duration was 1 seconds.
18:17:36  Informix Dynamic Server Stopped.

Tue Nov  7 20:06:51 2000

20:06:51  Event alarms enabled.  ALARMPROG =
/informix/Q12/etc/log_full_muell.sh'
.....
20:07:05  Informix Dynamic Server Initialized -- Shared Memory
Initialized.
20:07:05  Physical Recovery Started.
20:07:06  Physical Recovery Complete: 0 Pages Restored.
20:07:06  Logical Recovery Started.
20:17:21  Checkpoint Completed:  duration was 5 seconds.
20:59:59  Checkpoint Completed:  duration was 6 seconds.
21:23:11  Logical Recovery Complete.
        0 Committed, 1 Rolled Back, 0 Open, 0 Bad Locks



Wed, 18 Jun 1902 08:00:00 GMT
 Logical recovery during startup takes long

Can you give us some more to go on.  the $ONCONFIG, hardware info and the
size of the instance.

--
Tony Flaherty
Snr. A/P, Informix DBA, HpUx Admin, Gimmi a broom!
MFS Ltd.

Quote:

>We stopped our Informix server (7.31.FC6) and started it again a bit
>later. The engine needed more than an hour to finish the logical
>recovery. It really rolled back one transaction and the only reason I
>can accept that this might have been a long one.
>Questions:
>1. Is it a normal behaviour? In case it really had a relative long
>uncomplete transaction should it not have rolled it back at shutdown?
>2. How can this be prevented? Let us suppose you have a half an hour
>time-window to do a quick maintenance at your productive engine, you
>shut Online down, and later you start it again. But your startup is
>delayed because it needs a long, long time to make the logical recovery.
>I mean it may be very difficult to explain your users...

>Many thanks for any reply.
>Regards
>Peter Dzvonyar
>SAP-Consultant R/3 BC

>------

>Here is the relevant part of the message log:

>....
>18:17:35  Checkpoint Completed:  duration was 1 seconds.
>18:17:36  Informix Dynamic Server Stopped.

>Tue Nov  7 20:06:51 2000

>20:06:51  Event alarms enabled.  ALARMPROG =
>/informix/Q12/etc/log_full_muell.sh'
>.....
>20:07:05  Informix Dynamic Server Initialized -- Shared Memory
>Initialized.
>20:07:05  Physical Recovery Started.
>20:07:06  Physical Recovery Complete: 0 Pages Restored.
>20:07:06  Logical Recovery Started.
>20:17:21  Checkpoint Completed:  duration was 5 seconds.
>20:59:59  Checkpoint Completed:  duration was 6 seconds.
>21:23:11  Logical Recovery Complete.
>        0 Committed, 1 Rolled Back, 0 Open, 0 Bad Locks



Wed, 18 Jun 1902 08:00:00 GMT
 Logical recovery during startup takes long

Quote:

>Can you give us some more to go on.  the $ONCONFIG, hardware info and the
>size of the instance.

>--
>Tony Flaherty
>Snr. A/P, Informix DBA, HpUx Admin, Gimmi a broom!
>MFS Ltd.


>>We stopped our Informix server (7.31.FC6) and started it again a bit
>>later. The engine needed more than an hour to finish the logical
>>recovery. It really rolled back one transaction and the only reason I
>>can accept that this might have been a long one.
>>Questions:
>>1. Is it a normal behaviour? In case it really had a relative long
>>uncomplete transaction should it not have rolled it back at shutdown?

Depends how the engine was shutdown

onmode -yuk

and waiting for the engine to completely die it the correct way.
Check the FAQ at www.smooth1.demon.co.uk

Quote:
>>2. How can this be prevented? Let us suppose you have a half an hour
>>time-window to do a quick maintenance at your productive engine, you
>>shut Online down, and later you start it again. But your startup is
>>delayed because it needs a long, long time to make the logical recovery.
>>I mean it may be very difficult to explain your users...

Make sure no long transactions are active before shutting down.
I think there is something in the FAQ about checking how much of
the logical logs are currently in use.

- Show quoted text -

Quote:
>>Many thanks for any reply.
>>Regards
>>Peter Dzvonyar
>>SAP-Consultant R/3 BC

>>------

>>Here is the relevant part of the message log:

>>....
>>18:17:35  Checkpoint Completed:  duration was 1 seconds.
>>18:17:36  Informix Dynamic Server Stopped.

>>Tue Nov  7 20:06:51 2000

>>20:06:51  Event alarms enabled.  ALARMPROG =
>>/informix/Q12/etc/log_full_muell.sh'
>>.....
>>20:07:05  Informix Dynamic Server Initialized -- Shared Memory
>>Initialized.
>>20:07:05  Physical Recovery Started.
>>20:07:06  Physical Recovery Complete: 0 Pages Restored.
>>20:07:06  Logical Recovery Started.
>>20:17:21  Checkpoint Completed:  duration was 5 seconds.
>>20:59:59  Checkpoint Completed:  duration was 6 seconds.
>>21:23:11  Logical Recovery Complete.
>>        0 Committed, 1 Rolled Back, 0 Open, 0 Bad Locks



Wed, 18 Jun 1902 08:00:00 GMT
 Logical recovery during startup takes long

Quote:

> We stopped our Informix server (7.31.FC6) and started it again a bit
> later. The engine needed more than an hour to finish the logical
> recovery. It really rolled back one transaction and the only reason I
> can accept that this might have been a long one.
> Questions:
> 1. Is it a normal behaviour? In case it really had a relative long
> uncomplete transaction should it not have rolled it back at shutdown?
> 2. How can this be prevented? Let us suppose you have a half an hour
> time-window to do a quick maintenance at your productive engine, you
> shut Online down, and later you start it again. But your startup is
> delayed because it needs a long, long time to make the logical recovery.
> I mean it may be very difficult to explain your users...

One could always get into a situation like this because irrespective of
whether Informix rolls back the transaction during shutdown or during "fast
recovery" at startup, the time taken is (largely) dependent on the size (in
terms of logical log rows) of the "Open" transaction.

You could get some clue to the time that the shutdown/startup is going to
take by determining (before shutting down), the earliest log used by the
current open transactions and comparing it to the current log. Try

select logbeg, nlocks from sysmaster:systxptab
where logbeg > 0
order by logbeg;

or retrieving it using the "onstat -x" command.

If you find one which appears to use a large amount of log space, you could
kill it before-hand (if you have the guts), or caution your users about the
possibility of the delay.

Rudy



Wed, 18 Jun 1902 08:00:00 GMT
 Logical recovery during startup takes long

Peter

1.      Not normal for most cases.  However if you stop the engine with a very
long transaction in progress, without first completing that transaction,
then at startup the engine rolls the transaction back.  This may take some
time.  I have seen a case of over 3 hours - but I was aware of a huge
transaction  involving temp tables of hundreds Mb.

2.      Dont shut down with a big transaction in progress.

MW

Quote:
-----Original Message-----


Sent: Thursday, November 09, 2000 1:58 AM

Subject: Logical recovery during startup takes long

We stopped our Informix server (7.31.FC6) and started it again a bit
later. The engine needed more than an hour to finish the logical
recovery. It really rolled back one transaction and the only reason I
can accept that this might have been a long one.
Questions:
1. Is it a normal behaviour? In case it really had a relative long
uncomplete transaction should it not have rolled it back at shutdown?
2. How can this be prevented? Let us suppose you have a half an hour
time-window to do a quick maintenance at your productive engine, you
shut Online down, and later you start it again. But your startup is
delayed because it needs a long, long time to make the logical recovery.
I mean it may be very difficult to explain your users...

Many thanks for any reply.
Regards
Peter Dzvonyar
SAP-Consultant R/3 BC

------

Here is the relevant part of the message log:

....
18:17:35  Checkpoint Completed:  duration was 1 seconds.
18:17:36  Informix Dynamic Server Stopped.

Tue Nov  7 20:06:51 2000

20:06:51  Event alarms enabled.  ALARMPROG =
/informix/Q12/etc/log_full_muell.sh'
.....
20:07:05  Informix Dynamic Server Initialized -- Shared Memory
Initialized.
20:07:05  Physical Recovery Started.
20:07:06  Physical Recovery Complete: 0 Pages Restored.
20:07:06  Logical Recovery Started.
20:17:21  Checkpoint Completed:  duration was 5 seconds.
20:59:59  Checkpoint Completed:  duration was 6 seconds.
21:23:11  Logical Recovery Complete.
        0 Committed, 1 Rolled Back, 0 Open, 0 Bad Locks



Mon, 28 Apr 2003 04:30:34 GMT
 Logical recovery during startup takes long

Quote:
> You could get some clue to the time that the shutdown/startup is going to
> take by determining (before shutting down), the earliest log used by the
> current open transactions and comparing it to the current log. Try

> select logbeg, nlocks from sysmaster:systxptab
> where logbeg > 0
> order by logbeg;

Thanks Rudy, that is what I looked for. With this query you have at least the
chance to prepare yourself for a longer startup. But this case should occur
very seldom: we have had Informix for 4 years and we have never had something
like this.

Quote:
> If you find one which appears to use a large amount of log space, you could
> kill it before-hand (if you have the guts), or caution your users about the
> possibility of the delay.

I suppose you have to use the "onmode -z" command. How can you identify the
session to kill?
What happens if you manually cancel the transaction? Is it cancelled
immediately without any roll back?

Thanks
Peter
---
Peter Dzvonyar
SAP-Consultant R/3 BC



Wed, 18 Jun 1902 08:00:00 GMT
 Logical recovery during startup takes long

Quote:

> I suppose you have to use the "onmode -z" command. How can you identify the
> session to kill?
> What happens if you manually cancel the transaction? Is it cancelled
> immediately without any roll back?

onmode -z <session> will terminate the session "gracefully", rolling back all
the changes it made. Note that this could take longer than if it were rolled
back during a shutdown or fast recovery (because resources aren't dedicated to
it).

Here's a script that determines the session  "occupying" the maximum log space.
Its  driven by the "onstat -x" command

#!/bin/ksh
# Extracts the following
#  1. The earliest log with an open trx.
#  2. The session information of the user who is running it

EARLY_WARNING=100
OPENTRX=`onstat -x | \
   grep "^.............-" | \
   grep -v "^................................0" | \
   awk '{print $5, $3}' | sort -n | head -1`

if [ "$OPENTRX" = "" ]; then
   echo No open transactions found.
   exit 0
fi

set $OPENTRX
EARLIEST_USED_LOG=$1
set `onstat -u | grep $2 | awk '{print $3, $4}'`
CULPRIT_SESSION=$1
CULPRIT_USER=$2
CURRENT_LOG=`onstat -l | grep "^.....................-C" | awk '{print $4}'`
LTXHWM=`onstat -c | grep "^LTXHWM" | awk '{print $2}'`
NO_OF_LOGS=`onstat -l | grep "\-.\-.\-" | wc -l`
LOGS_USED_BY_CULPRIT=`expr $CURRENT_LOG - $EARLIEST_USED_LOG`
PROBLEM_THRESHOLD=`expr $NO_OF_LOGS \* $LTXHWM \* $EARLY_WARNING  / 10000`

clear
echo "
Earliest Log with Open Transaction :  $EARLIEST_USED_LOG
Current Log                        :  $CURRENT_LOG
Logs used by Culprit               :  $LOGS_USED_BY_CULPRIT
User with Culprit Transaction      :  $CULPRIT_USER
Session with Culprit Transaction   :  $CULPRIT_SESSION

High Water Mark                    :  $LTXHWM Percent
Number of Logs                     :  $NO_OF_LOGS
Problem Threshold # of logs        :  $PROBLEM_THRESHOLD
"
if [ $LOGS_USED_BY_CULPRIT -gt $PROBLEM_THRESHOLD ]; then
   echo
   echo
   echo PROBLEMS AHOY!!!
fi

onstat -g sql $1

###### End #####

Cheers
Rudy



Wed, 18 Jun 1902 08:00:00 GMT
 
 [ 7 post ] 

 Relevant Pages 

1. Error during starting a crashed Server cause logical recovery

2. Logical recovery during database start using oninit...

3. EVEN MORE URGENT: assertion failure during logical recovery

4. HELP: Server stalling at startup during Database Recovery...

5. RECOVERY TAKING A LONG TIME

6. checkpoint during long insert blocks on sp calling long insert

7. SQL Server startup takes "forever"

8. Slow startup takes "forever"

9. Please help - Recovery taking forever

10. Counting logical io during performance benchmarking

11. HELP: "Database file...is subject to logical recovery

12. Database file is subject to logical recovery


 
Powered by phpBB® Forum Software