ORA-01555 query duration 0 seconds with Dataguard

How many times have we had calls from users complaining that their process failed with an ORA-01555 error?

We know that when queries are not well tuned and a lot of data is being modified, the undo needed to build a read-consistent image may no longer be available in the UNDO tablespace. But have you ever seen this error immediately after executing a SQL statement against a table?
I did, a couple of days ago. Here’s the story:
An ORA-01555 error appeared in the database’s alert log with a query duration of 0 seconds.
ORA-01555 caused by SQL statement below (SQL ID: d3rt4tyudufeu, Query Duration=0 sec, SCN: 0x034f.34f660b4)
Every query against the table, plus an analyze table, failed right away with ORA-01555:
ERROR at line 1: ORA-01555: snapshot too old: rollback segment number 10 with name "SYSSMU11_1072300523734$" too small

So weird.
After researching a bit on MOS, I found a note regarding a bug.
Some minutes later we also started to receive ORA-600 errors related to SCN numbers.
ORA-error stack (00600[ktbdchk1: bad dscn])
The MOS note mentions the ORA-01555 and the ORA-600 errors as part of bug 22241601 in a Dataguard configuration. It is worth mentioning that yes, we had recently been doing switchover testing in this 12.1.0.2 environment.
ALERT Bug 22241601 ORA-600 [kdsgrp1] / ORA-1555 / ORA-600 [ktbdchk1: bad dscn] / ORA-600 [2663] due to Invalid Commit SCN in INDEX (Doc ID 1608167.1)
The solution is to apply the patch, but there’s also a tested workaround: rebuild all the indexes of that table online.
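The workaround can be scripted by generating the rebuild statements from the data dictionary. This is only a sketch: APP and ORDERS are placeholder names for the affected owner and table, and it assumes the indexes are not partitioned (partitioned indexes have to be rebuilt partition by partition) and that the edition in use supports online rebuilds. Other special index types may need to be excluded as well.

```sql
-- Generate one ALTER INDEX ... REBUILD ONLINE statement per index on the table.
-- APP and ORDERS are placeholders; replace them with the affected owner/table.
SELECT 'ALTER INDEX "' || owner || '"."' || index_name || '" REBUILD ONLINE;' AS rebuild_stmt
FROM   dba_indexes
WHERE  table_owner = 'APP'
AND    table_name  = 'ORDERS'
AND    partitioned = 'NO'      -- partitioned indexes must be rebuilt per partition
AND    index_type <> 'LOB'     -- LOB indexes cannot be rebuilt with ALTER INDEX
ORDER  BY index_name;
```

Review the generated statements before running them, then repeat the query that failed to confirm the ORA-01555 is gone.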
Hope this helps.

Alfredo

WAIT_FOR_GAP. How to restore missing archivelogs from backup?

In a Dataguard configuration, Oracle’s RFS (Remote File Server) process writes redo data to the standby. When for any reason it can’t write this data, MRP (Managed Recovery Process) waits for the archivelog it needs and shows the status “WAIT_FOR_LOG”, leaving the standby out of sync with the primary database.
Sometimes some archivelogs can’t be transferred from the primary database to the standby, leaving a gap in the archivelog sequence. The MRP process will then have the status “WAIT_FOR_GAP”.
In order to fix the archivelog gap we have to manually transfer the missing archivelogs.
To find the gap you can query v$archive_gap (gv$archive_gap for RAC).
SELECT INST_ID, THREAD#, HIGH_SEQUENCE#, LOW_SEQUENCE# FROM GV$ARCHIVE_GAP;
   INST_ID    THREAD# HIGH_SEQUENCE# LOW_SEQUENCE#
---------- ---------- -------------- -------------
         2          2            823           811
You can see that we are missing archivelogs from sequence 811 to 823 for thread 2. If these archivelogs are no longer available on the primary, we have to restore them from backup.
RMAN> RESTORE ARCHIVELOG FROM SEQUENCE 811 UNTIL SEQUENCE 823 THREAD=2;
Keep in mind that parameter THREAD defaults to 1, so you must specify the thread number when you are trying to restore from a different thread.
After restoring these archivelogs, they should be shipped to the standby automatically, where the RFS process receives them. Verify that the gap is fixed.
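One way to verify it, as a sketch that reuses the thread and sequence numbers from the example above (adjust them to your own gap), is to run these checks on the standby:

```sql
-- On the standby: no rows returned means no gap is currently detected.
SELECT inst_id, thread#, low_sequence#, high_sequence#
FROM   gv$archive_gap;

-- On the standby: MRP0 should leave WAIT_FOR_GAP once the logs arrive.
SELECT inst_id, process, status, thread#, sequence#
FROM   gv$managed_standby
WHERE  process LIKE 'MRP%';

-- On the standby: confirm the missing sequences were received and applied.
SELECT thread#, sequence#, applied
FROM   v$archived_log
WHERE  thread# = 2
AND    sequence# BETWEEN 811 AND 823
ORDER  BY sequence#;
```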
Thanks,

Alfredo

MRP0: Background Media Recovery terminated with error 1237

Some days back, while checking an Oracle physical standby database, I found that the DB was some hours behind the primary database.

alter session set nls_date_format='DD-MM-yyyy HH24:MI:SS';
show parameter dest
select thread#,max(sequence#) from gv$log_history group by thread#;
select (a.amct-b.bmct)*24 "Hours Standby is Behind: "
from (select max(completion_time) amct from v$archived_log) a,
(select max(completion_time) bmct from v$archived_log where applied='YES') b;
Hours Standby is Behind:
-------------------------
45.000054
The very next thing to check is what is going on with the MRP process.
select inst_id, process, status, sequence#, thread# from gv$managed_standby where process='MRP0';
no rows selected
So, the MRP process wasn’t running in the standby database. Let’s check the alert.log file.
MRP0: Background Media Recovery terminated with error 1237
ORA-01237: cannot extend datafile 13
The mount point where datafile 13 resides is 100% full. That’s why MRP couldn’t resize the datafile and was terminated by the instance.
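If it is not obvious which file and mount point are involved, the file number from the ORA-01237 message can be looked up on the standby. A minimal sketch, using file number 13 from the error above:

```sql
-- Run on the mounted standby; v$datafile can be queried without opening the database.
SELECT file#, name, ROUND(bytes / 1024 / 1024) AS size_mb
FROM   v$datafile
WHERE  file# = 13;
```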
To fix this, increase the size of the mount point or, if you have another mount point with enough free space, move the datafile there as follows:

· Shut down the standby database

SQL> shutdown immediate
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
· Copy the datafile to the new location
#> cp -p users03.dbf /u02/oradata/test/users03.dbf
· Start up and mount the standby database
SQL> startup nomount
ORACLE instance started.
SQL> alter database mount standby database;
Database altered.
· Set the “standby_file_management” parameter to MANUAL
As per Oracle documentation:
STANDBY_FILE_MANAGEMENT enables or disables automatic standby file management. When automatic standby file management is enabled, operating system file additions and deletions on the primary database are replicated on the standby database.
SQL> alter system set standby_file_management='MANUAL' scope=both;
System altered.
· Rename the datafile in order to reflect the changes in the standby control file.
SQL> alter database rename file '/u01/oradata/test/users03.dbf' to '/u02/oradata/test/users03.dbf';
Database altered.
· Now let’s reset “standby_file_management” to AUTO.
SQL> alter system set standby_file_management='AUTO' scope=both;
System altered.
· And start the MRP process again.
SQL> alter database recover managed standby database disconnect from session;
Database altered.
After this, MRP was able to successfully apply archive logs from the primary database.
Every time we increase the size of a datafile in the primary database, we have to be sure there is enough free space on the standby server to fit the new size of the datafile.
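Autoextending datafiles grow without anyone resizing them, so they deserve the same attention. As a rough sketch, this query run on the primary lists how large each autoextensible datafile is and how large it is allowed to become. The result can then be compared with the free space on the corresponding standby mount points.

```sql
-- Run on the primary: current size and maximum size of autoextensible datafiles.
SELECT file_id,
       file_name,
       ROUND(bytes / 1024 / 1024)    AS size_mb,
       ROUND(maxbytes / 1024 / 1024) AS max_mb
FROM   dba_data_files
WHERE  autoextensible = 'YES'
ORDER  BY maxbytes - bytes DESC;   -- files with the most room left to grow first
```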
Thanks,

Alfredo