Data Guard Switchover Recipe (MRC)

Currently there is a status of UNRESOLVABLE GAP
on databases DBFSXD01 and DBFSXD02.

mrcche1de [DBFSXD021]> sql

SQL*Plus: Release 12.1.0.2.0 Production on Mon Jul 23 14:54:50 2018
Copyright (c) 1982, 2014, Oracle.  All rights reserved.
Connected to an idle instance.
SQL> startup
ORACLE instance started.

Total System Global Area 8589934592 bytes
Fixed Size                  7663544 bytes
Variable Size            1862271048 bytes
Database Buffers         6710886400 bytes
Redo Buffers                9113600 bytes
ORA-01105: mount is incompatible with mounts by other instances
ORA-01677: standby file name conversion parameters differ from other instance

Listener status on this node:

Service "DBFSXD02.mrcconsulting.com" has 3 instance(s).
Instance "DBFSXD021", status BLOCKED, has 1 handler(s) for this service...
Instance "DBFSXD022", status READY, has 1 handler(s) for this service...
Instance "DBFSXD023", status READY, has 1 handler(s) for this service...
Instance DBFSXD021 shows BLOCKED because it is not mounted/open; the listener will not hand connections to it until the instance comes up cleanly.
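The ORA-01677 above means the standby file-name conversion parameters differ between the RAC instances. A quick way to compare them, assuming a SQL*Plus session on any running instance (gv$parameter and these parameter names are standard; the output depends on the environment):

   select inst_id, name, value
     from gv$parameter
    where name in ('db_file_name_convert', 'log_file_name_convert')
    order by name, inst_id;

Any instance whose values differ from the others needs its spfile/pfile corrected before it will mount.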

During rolling patching of the cluster: first stop the service, then disable the service, then shut down the instance.

For non-rolling patching we stop the whole database.

To stop service and instance on cluster

   srvctl stop service -d <DB_NAME> -s <SERVICE_NAME> -i <INSTANCE_NAME>
   srvctl stop service -d DBFSXB01 -s <SERVICE_NAME>  -i DBFSXB011
   srvctl disable service -d <DB_NAME> -s <SERVICE_NAME> -i <INSTANCE_NAME>
   srvctl stop instance -d <DB_NAME> -n <HOSTNAME>

To start service and instance on cluster

   srvctl start instance -d <DB_NAME> -n <HOSTNAME>
   srvctl enable service -d <DB_NAME> -s <SERVICE_NAME> -i <INSTANCE_NAME>
   srvctl start service -d <DB_NAME> -s <SERVICE_NAME> -i <INSTANCE_NAME>

Then we stop the cluster stack (CRS) on that node as well.
This applies to rolling patching only.

For non-rolling patching, the commands are:

Stop DB and service on the cluster

   srvctl stop service -d <DB_NAME>
   srvctl disable service -d <DB_NAME> -s <SERVICE_NAME>
   srvctl stop database -d <DB_NAME>

To start DB and instance on cluster

   srvctl start database -d <DB_NAME>
   srvctl enable service -d <DB_NAME> -s <SERVICE_NAME>
   srvctl start service -d <DB_NAME>
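After stopping or starting, the state can be verified with srvctl; a sketch using the same placeholder names as the commands above:

   srvctl status database -d <DB_NAME>
   srvctl status service -d <DB_NAME> -s <SERVICE_NAME>
   srvctl config database -d <DB_NAME>

srvctl config database also shows the configured database role and start option, which matters after the role changes later in this recipe.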

After stopping the database/service we can stop CRS (the cluster stack).
On the node running MRP, cancel managed recovery first:

   alter database recover managed standby database cancel;

To disconnect sessions, nothing extra needs to be run; just stop the service and the database.

Data Guard is looking for thread 1 archive log 926, which was applied earlier and then deleted,
so we need to re-image the standby.
As we don't have a backup, we need to rebuild the standby from the primary.
This database is only used for DBFS.

Also increase the archive log retention from 7 days to 15 days so archives are kept longer.
The rebuild may take an estimated 2 hours if there are no issues.

   sqlplus "/as sysdba"
   startup nomount pfile='/oracle/admin/DBFSXB01/pfile/initstdby.ora'

   and then

   rman target sys/password@DBFSXD01 auxiliary sys/password@DBFSXB01
   duplicate target database for standby from active database;

RMAN completed

now both Primary and standby are in sync

Primary : DBFSXD01
Standby : DBFSXB01
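One way to confirm the sync claim above is to compare the highest archived sequence on the primary with the highest applied sequence on the standby (v$archived_log is standard; the sequence numbers are environment-specific):

   -- on the primary: last archived sequence per thread
   select thread#, max(sequence#) from v$archived_log group by thread#;

   -- on the standby: last applied sequence per thread
   select thread#, max(sequence#) from v$archived_log
    where applied = 'YES' group by thread#;

The per-thread maximums should match (or trail by at most the log currently in flight).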

We already recreated the standby for D01 to correct the issue, rebuilding it from the primary using RMAN duplicate.

If we have a failure, how long will a duplicate from active database take?
The actual copy took less than 30 minutes, as the database is very small;
the total time was around 2 hours.

If we get the message we saw at first, we will have to rebuild that database again.
To prevent it from happening in the future,
we need to make sure we shut down and start up properly.

Increase archive retention to 15 days, so we don't need to rebuild as long as archives are not deleted within those 15 days.
Alternatively, make sure archives are not deleted before being applied on the standby side.
The current cleanup script already checks for "applied and more than 7 days old";
in this case the archive had been applied, was deleted after 7 days, and was then needed again.

Due to an instance crash, recovery was looking for instance 1's archive (thread 1) while recovering instance 2's archives for crash recovery,
but the thread 1 archive had been deleted since it was applied previously.

Increasing archive retention to 15 days will avoid this issue.
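In addition to the longer time-based retention, RMAN can be configured so it never treats an archive log as deletable until it has been applied on all standbys. A sketch, run in RMAN on the primary (standard RMAN syntax; whether it helps depends on how the current cleanup script deletes archives):

   CONFIGURE ARCHIVELOG DELETION POLICY TO APPLIED ON ALL STANDBY;

With this policy, RMAN DELETE ARCHIVELOG commands skip logs the standby has not yet applied; a cleanup script that removes files outside RMAN would not be covered by it.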

On node 1 of the VA cluster:

First, take down the MRP process on the standby side,

and make sure there is no lag between primary and standby before the switchover.

Then stop all instances except the first node on each side, both primary and standby; normally MRP runs on the standby's first node.
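One way to check the lag before proceeding, assuming a SQL*Plus session on the standby (v$dataguard_stats is a standard view; the values depend on the environment):

   select name, value, unit
     from v$dataguard_stats
    where name in ('transport lag', 'apply lag');

Both values should be at or near +00 00:00:00 before starting the switchover.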

After stopping other instances on both primary and standby

Run these commands on the primary (SWITCHOVER_STATUS should report TO STANDBY, or SESSIONS ACTIVE, before the commit):

   SELECT DATABASE_ROLE from v$DATABASE;
   SELECT SWITCHOVER_STATUS FROM V$DATABASE;
   ALTER DATABASE COMMIT TO SWITCHOVER TO STANDBY WITH SESSION SHUTDOWN;

After this completes, run the commands below on the standby side (SWITCHOVER_STATUS should report TO PRIMARY, or SESSIONS ACTIVE):

   SELECT SWITCHOVER_STATUS FROM V$DATABASE;
   ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WITH SESSION SHUTDOWN;
   alter database open;

Then modify the CRS configuration with srvctl to reflect the new primary and standby roles.

This command we need to run on the new primary:

   srvctl modify database -d <new_primary> -r primary -s open 

Now start the instances on all nodes of the new primary with srvctl.
On the new standby, stop the existing instances and update the CRS configuration to the standby role with the command below:

   srvctl modify database -d <new_standby_db_name> -r physical_standby -s open

Then start the database with the srvctl start database command.

Recap again:

Unmount all DBFS file systems first on each node:

fusermount -uz /dbfs_test01
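To see which DBFS file systems are still mounted (and to confirm none remain after unmounting), a generic Linux check:

   mount | grep dbfs

Re-run it after the fusermount commands; it should return nothing once all DBFS mounts are gone.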

First on the standby:

srvctl stop instance -d DBFSXB01 -i DBFSXB012,DBFSXB013

Second on the primary

srvctl stop instance -d DBFSXD01 -i DBFSXD012,DBFSXD013

Once all instances have been started again with srvctl,
we need to start MRP on the standby.


srvctl modify database -d DBFSXB01 -r physical_standby -s open

Here we are modifying the CRS resource, which still carries the database's previous role.


srvctl modify database -d DBFSXD01 -r physical_standby -s open

Pairings:

D01 is primary for B01

D02 is primary for B02

I had failed over D01 (D01 was primary for B01).

We don't need to do anything with B01 on the TX side; we just need to make D01 the standby with this command.
Previously D01 was the primary database:


srvctl modify database -d DBFSXD01 -r physical_standby -s open

then


srvctl start database -d DBFSXD01

Then we need to start MRP on just one node.

And we need to open the new primary with the command below (open the database once, and then stop it):


srvctl modify database -d DBFSXB01 -r primary -s open

Then we need to start the database on all nodes with srvctl,
and then start MRP on the standby side.

mrche1de [DBFSXB011]-> srvctl status database -d DBFSXB01
Instance DBFSXB011 is running on node mrche1de
Instance DBFSXB012 is running on node mrche1df
Instance DBFSXB013 is running on node mrche1dg

This is on the VA cluster.

Then we can start MRP on node 1 of D01 in TX:


alter database recover managed standby database using current logfile disconnect from session;
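To confirm MRP is running and applying redo, assuming a SQL*Plus session on the standby (v$managed_standby is the standard view in 12.1):

   select process, status, thread#, sequence#
     from v$managed_standby
    where process like 'MRP%';

STATUS should show APPLYING_LOG, or WAIT_FOR_LOG once the standby has caught up.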