GoldenGate | IT Remote

This post deals with the failover strategy I decided to use for a GoldenGate Hub architecture
It works like this:

The technologies involved

Overview Of GG Component Roles
Since Oracle has all of the components to effectively implement this I decided to stay within the Oracle stack completely

Oracle RAC
Oracle DataGuard
Oracle Clusterware (CRS)
Oracle Grid Infrastructure Agents / XAG (9.1)
Oracle DBFS
Oracle GoldenGate

Oracle Real Application Clusters (RAC)
GoldenGate uses database and GRID infrastructure for several key processes.
Clusterware resources are defined for both DBFS and GoldenGate.
Definition for failover is also located here for the resiliency of the databases that are hosted here for DataGuard.
File systems are defined in the database with ability to be switched and failed over via Dataguard.

Oracle DataGuard
Oracle’s recommended strategy for failover of GoldenGate processes and DBFS.
ACFS is explicitly not recommended due to lack of geographic failover in automated fashion.
Seamless integration w/Oracle database
Can be mostly automated
Well understood process for most Oracle DBA’s
Used on virtually all resiliency patterns for Oracle databases

Oracle Clusterware (CRS) Cluster Ready Services
Used for defining service architecture
GoldenGate requires service-based architecture to achieve local and geographic resiliency
XAG is a pass-thru for CRS definitions for GoldenGate
Allows clusterware to detect state transitions between primary and standby
Role based startup and shutdown during failover/switchover
Establishes dependencies on DBFS host database

Oracle Grid Infrastructure Agents / XAG (ver 9.1)

The GRID Infrastructure components that provide HA to application resources like GoldenGate
Provides a “ready-to-use” template for application HA
Could be defined as a wrapper for CRS definition of resources for GoldenGate
Pre-defined clusterware resources for GoldenGate, simplifies dependency creation, however does not eliminate knowing the mix of parameters to make it all work properly
Provides a replacement interface (AGCTL) to start, stop, relocate, and manage the location of the environment.
Operates based on the status and role of the database (database up,down and primary or standby)
Forces the use of defining GoldenGate “instances”

Oracle Database File System (DBFS)
Oracle Database File System
Provides file system entirely hosted within database
Allows for DataGuard to failover easily without additional work
Simply provides resiliency to file systems hosting GoldenGate architecture components and trail file and recovery data

Oracle Overall Product Stack Integration

100% Oracle technology stack
Robust resiliency
Downside is complexity for configuration
Once setup it works !!!!!
Takes a lot of testing (weeks and months) of many scenarios to get it right
Very little to no documentation on the whole recipe from Oracle
Takes understanding of all of the components of the stack to get it right
Once it’s right, it rocks!!!!

DBFS and GoldenGate Hub

There are (2) separate classes of DBFS disk, discovered as a result of the Oracle consulting engagement
GoldenGate “common” disk per instance of GoldenGate
Project specific disk per project and failover clusters for GoldenGate
DBFS common disk for certain GoldenGate directories in an admin capacity
DBFS specific disk allocated to trail files for specific projects
Mountpoint will be identified by /dbfs_XXXXXXXX
Mountpoint = /dbfs_xxxxxxxx filesystem =
Naming is “not” nodally dependent and has no relationship to nodal affinity, any instance of GoldenGate can and should be able to run on any node in the local or remote cluster.
File system naming will be consistent to identify GoldenGate instances
ggxxyzzz format
gg = GoldenGate instance
xx = geographic location of the instance like “tx” = Texas
y = environment type like “d” for development
zzz = sequential number “001” 1st environment

DBFS Disk in GG Hub Architecture

Each node running GoldenGate can have one or more “instances” of GoldenGate running, at a minimum each node in a cluster will run one instance of GoldenGate
Having multiple instances per node can be a by-product or a demand of isolation or scalability.
Each instance of GoldenGate needs it own DBFS filesystem from an admin and overhead standpoint. This is strictly for overhead components of the instance and not for client consumption.
Each project will need to estimate the amount of volume for DBFS project specific data, for trail files and recovery. Since this is project specific the GoldenGate can add whatever is determined to be appropriate for trail file definition.
For example a 6 node clustered pair will require minimum of 7 DBFS mounts, one for each of the 6 instances, and at least one for project specific trail files. If another version of GoldenGate were to be installed, this example doubles, as the admin DBFS cannot be utilized by a different set of binaries.

Code trees

There will need to be a code tree for each node and replicated on every other node.
For example: The directory /dbfs_ggtxd001/ggtxd001 will need to exist on all 6 nodes in a cluster pair. This is due to the failover scenarios could allow for the instance to run on any of the local cluster nodes, in addition to any of the nodes on the geographic failover cluster (only when in a cluster failover state)

Version Support

On each node there must be a unique DBFS common file system and code tree for each version of GoldenGate.
There must be a “new” install of GoldenGate for each version of source database supported.
Example:
Database Source version = 12.1
GoldenGate Version = 12.2
This equals one directory code tree, furthermore must be copied to “all” other nodes in the cluster pair, not just local cluster nodes. If we now want to support a 11g database on the source with the same version of GoldenGate we will need a complete new install of the code tree and copied to all clustered pairs. This includes setup of all resiliency components.

Failover Limitations

Currently the failover scenarios provide for:
Node-to-node failover
This would include all DBFS filesystem mounts on this node failing over to another node. If there are multiple applications / replications running in the instance, all of the replications are failed over and affected. Individual failover of a single set of replications within a GoldenGate instance is currently “not supported”, and is not being planned for at this time. The detail and effort to support this currently would be overwhelming, in addition, all of the process would be manual and largely susceptible to human error.
Geographic Cluster Failover
This would entail the “entire” cluster DBFS database which contains “all” common data for instances running on all of the nodes and project specific DBFS disk failing over. There is no individual application replication group failover. This is a complete failover of the entire all customers would failover running on that cluster. The GoldenGate hub was designed to achieve resiliency and not provide graceful “individual” application level switchover. The GoldenGate code tree itself is not failed over only the DBFS filesystems which has “state” data. It is unnecessary to failover code trees since they are duplicated on all available failover nodes. To date, there has been no design requirement to achieve individual application level switchover as far as GoldenGate processes are concerned.

Planning GoldenGate Hub Onboarding

Placement
Which clusters? Nodes? Geographic location?
Capacity Planning
Which nodes are underutilized, make sure to leave allocation for nodes to fail over to “this” node
How much CPU is enough, based on the replications requested?
How much disk? Transaction volume? IOPS is critical, having enough “arms” on the data is critical
Do we need to add nodes to environment to accommodate project? The cluster can add nodes and therefore capacity while remaining in the resiliency group already established.
Do we need to add clusters? Should the application be on an isolated set of nodes? Resiliency can be extended to add additional clusters (DBFS databases).
Monitoring of existing processes to determine how to load-level the environment, the is overall oversight to analyze the environment over time to see how it behaves over time.

Disaster Recovery
What failover options are needed? Any at all? Local, geographic? Both?

Understanding the Clusters / Nodes
Each node stands alone, however each node can host a failed node from any other node in the cluster pair
A cluster pair is defined as two clusters reciprocating in a failover capacity ie: TX 3 node cluster and VA 3 node cluster are a clustered pair.
Either is designed and limited so that a cluster can take on the traffic from the other cluster in the pair
Definitions for each node must house the infrastructure from ALL other nodes in the cluster pair (local cluster and remote cluster)
Each instance of GoldenGate has an overhead infrastructure for itself independent of the client data or replication groups (common DBFS)

Understanding the Clusters / Nodes (part 2)
Each version of GoldenGate must support one and only one source version per code tree (Oracle verified)
A system will need to be devised to easily track code trees and a naming scheme that will eliminate or at least minimize error.
XAG (CRS) tracks GoldenGate instances at the cluster level not at the node level

Scripts
When configuring the environment, building the components and building the resiliency. Several hundred scripts were written in order to make the process of setup easier.
Directory structure
Script examples
Concept of function libraries

oow14-con7773-goldengateperftune-2332015

maa-gg-performance-1969630

I believe this paper was written by Stephan Haisley and has most of the configuration options we need

maa-wp-gg-oracledbm-128760

Heartbeat Table

Oracle Heartbeat Doc

GoldenGate 12.3 deployment

This is a presentation from Oct 2017 about setup of the XAG failover process

Failover Setup XAG

When setting up a GoldenGate instance its important to remember that you also have to setup a CRS resource for failover.

Here is a list of the scripts created for this purpose


total 112
-rw-r--r-- 1 oracle dba     0 Aug 10 08:51 crsdbfs_lst.txt
-rwxr-xr-x 1 oracle dba   612 Aug  9 12:51 crs_relo_ggtxd001_node_mp.ksh
-rwxr-xr-x 1 oracle dba   520 Aug  9 12:49 crs_relo_ggtxd001_node_np.ksh
-rwxr-xr-x 1 oracle dba   520 Aug  9 11:06 crs_relo.ksh
-rwxr-xr-x 1 oracle dba   522 Aug  9 10:55 crs_stop.ksh
-rwxr-xr-x 1 oracle dba   522 Aug  9 10:52 crs_stop_ggtxd001.ksh
-rwxr-xr-x 1 oracle dba 14713 Aug  9 10:49 mount_dbfs_ggtxd001.bsh
-rw-r--r-- 1 oracle dba    10 Aug  9 10:26 nohup.out
-rw-r--r-- 1 oracle dba  4406 Aug  9 10:18 mount-dbfs_ggtxd001.conf
-rwxr-xr-x 1 oracle dba   516 Aug  9 09:48 crs_start_ggtxd001.ksh
-rwxr-xr-x 1 oracle dba   513 Aug  9 09:45 crs_stat.ksh
-rwxr-xr-x 1 oracle dba   524 Aug  9 09:35 crs_start.ksh
-rwxr-xr-x 1 oracle dba   543 Aug  9 09:27 crs_add_res_dbfs_ggtxd001.ksh
-rwxr-xr-x 1 oracle dba   429 Aug  9 09:00 crs_relo_2_1.ksh
-rwxr-xr-x 1 oracle dba 13695 Aug  7 11:03 mount_dbfs.bsh
-rw-r--r-- 1 oracle dba  3971 Aug  3 16:23 mount-dbfs_test01.conf
-rwxr-xr-x 1 oracle dba 13576 Aug  3 15:44 mount-dbfs_01.bsh
-rw-r--r-- 1 oracle dba  3962 Aug  3 15:34 mount-dbfs.conf
-rw-r--r-- 1 oracle dba  3761 Aug  3 15:22 mount-dbfs.conf.orig


# crsctl delete resource dbfs_ggtxd001

# crsctl add resource dbfs_ggtxd001 -type cluster_resource -attr "ACTION_SCRIPT=/oracle/admin/scripts/clle_dbfscommon/mount_dbfs.bsh, CHECK_INTERVAL=30, RESTART_ATTEMPTS=10"

# crsctl stat resource dbfs_ggtxd001

# /oracle_crs/product/12.1.0.2/crs_1/bin/crsctl relocate resource dbfs_ggtxd001 -node lrdne67np

echo
echo "Relocate ggtxd001 to node mp..."
echo

/oracle_crs/product/12.1.0.2/crs_1/bin/crsctl relocate resource dbfs_ggtxd001 -n lrdne67mp

echo
echo "Status from CRS...."
echo

/oracle_crs/product/12.1.0.2/crs_1/bin/crsctl status resource dbfs_ggtxd001

Let us take a look at the process of configuring Goldengate 12c to work in an Oracle 12c Grid Infrastructure RAC or Exadata environment using DBFS on Linux x86-64.

Simply put the Oracle Database File System (DBFS) is a standard file system interface on top of files and directories that are stored in database tables as LOBs.

In one of my earlier posts we had seen how we can configure Goldengate in an Oracle 11gR2 RAC environment using ACFS as the shared location.

Until recently Exadata did not support using ACFS but ACFS is now supported on version 12.1.0.2 of the RAC Grid Infrastructure.

In this post we will see how the Oracle DBFS (Database File System) will be setup and configured and used as the shared location for some of the key Goldengate files like the trail files and checkpoint files.

In summary the broad steps involved are:

1) Install and configure FUSE (Filesystem in Userspace)
2) Create the DBFS user and DBFS tablespaces
3) Mount the DBFS filesystem
5) Create symbolic links for the Goldengate software directories dirchk,dirpcs, dirdat,
BR to point to directories on DBFS
6) Create the Application VIP
7) Download the mount-dbfs.sh script from MOS and edit according to our environment
8) Create the DBFS Cluster Resource
9) Download and install the Oracle Grid Infrastructure Bundled Agent
10) Register Goldengate with the bundled agents using agctl utility

Install and Configure FUSE

Using the following command check if FUSE has been installed:
lsmod | grep fuse

FUSE can be installed in a couple of ways – either via the Yum repository or using the RPMs available on the OEL software media.

Using Yum:

     yum install kernel-devel
     yum install fuse fuse-libs

Via RPM’s:

If installing from the media, then these are the RPMs which are required:

kernel-devel-2.6.32-358.el6.x86_64.rpm
fuse-2.8.3-4.el6.x86_64.rpm
fuse-devel-2.8.3-4.el6.x86_64.rpm
fuse-libs-2.8.3-4.el6.x86_64.rpm

A group named fuse must be created and the OS user who will be mounting the DBFS filesystem needs to be added to the fuse group.

For example if the OS user is ‘oracle’, then we use the usermod command to modify the secondary group membership for the oracle user. Important is to ensure we do not exclude any current groups the user already is a member of.

# /usr/sbin/groupadd fuse
# usermod -G dba,fuse oracle

One of the mount options which we will use is called “allow_other” which will enable users other than the user who mounted the DBFS file system to access the file system.

The /etc/fuse.conf needs to have the “user_allow_other” option as shown below.

$ # cat /etc/fuse.conf
user_allow_other

chmod 644 /etc/fuse.conf

Important: Ensure that the variable LD_LIBRARY_PATH is set and includes the path to $ORACLE_HOME/lib. Otherwise we will get an error when we try to mount the DBFS using the dbfs_client executable.

Create the DBFS tablespaces and mount the DBFS

If the source database used by Goldengate Extract is running on RAC or hosted on Exadata then we will create ONE tablespace for DBF.

If the target database where Replicat will be applying changes in on RAC or Exadata, then we will create TWO tableapaces for DBFS with each tablespace having different logging and caching settings – typically one tablespace will be used for the Goldengate trail files and the other for the Goldengate checkpoint files.

If using Exadata then typically an ASM disk group called DBFS_DG will already be available for us to use, otherwise on an non-Exadata platform we will create a separate ASM disk group for holding DBFS files.

Note than since we will be storing Goldengate trail files on DBFS, a best practice would be to allocate enough disk space/tablespace space to be able to retain at least a minimum of 12 hours of trail files. So we need to keep that in mind when we create the ASM disk group or create the DBFS tablespace.

CREATE bigfile TABLESPACE dbfs_ogg_big datafile '+DBFS_DG' SIZE
1000M autoextend ON NEXT 100M LOGGING EXTENT MANAGEMENT LOCAL
AUTOALLOCATE SEGMENT SPACE MANAGEMENT AUTO;

Create the DBFS user

CREATE USER dbfs_user IDENTIFIED BY dbfs_pswd DEFAULT TABLESPACE dbfs_ogg_big QUOTA UNLIMITED ON dbfs_ogg_big; GRANT create session, create table, create view, create procedure, dbfs_role TO dbfs_user; Create the DBFS Filesystem To create the DBFS filesystem we connect as the DBFS_USER Oracle user account and either run the dbfs_create_filesystem.sql or dbfs_create_filesystem_advanced.sql script located under $ORACLE_HOME/rdbms/admin directory. For example: cd $ORACLE_HOME/rdbms/admin sqlplus dbfs_user/dbfs_pswd SQL> @dbfs_create_filesystem dbfs_ogg_big gg_source OR

SQL> @dbfs_create_filesystem_advanced.sql dbfs_ogg_big  gg_source
      nocompress nodeduplicate noencrypt non-partition

Where …
o dbfs_ogg_big: tablespace for the DBFS database objects
o gg_source: filesystem name, this can be any string and will appear as a directory under the mount point

If we were configuring DBFS on the Goldengate target or Replicat side of things,it is recommended to use the NOCACHE LOGGING attributes for the tablespace which holds the trail files because of the sequential reading and writing nature of the trail files.

For the checkpoint files on the other hand it is recommended to use CACHING and LOGGING attributes instead.

The example shown below illustrates how we can modify the LOB attributes.(assuming we have created two DBFS tablespaces)

SQL> SELECT table_name, segment_name, cache, logging FROM dba_lobs 
     WHERE tablespace_name like 'DBFS%'; 

TABLE_NAME              SEGMENT_NAME                CACHE     LOGGING
----------------------- --------------------------- --------- -------
T_DBFS_BIG              LOB_SFS$_FST_1              NO        YES
T_DBFS_SM               LOB_SFS$_FST_11             NO        YES



SQL> ALTER TABLE dbfs_user.T_DBFS_SM 
     MODIFY LOB (FILEDATA) (CACHE LOGGING); 


SQL> SELECT table_name, segment_name, cache, logging FROM dba_lobs 
     WHERE tablespace_name like 'DBFS%';  

TABLE_NAME              SEGMENT_NAME                CACHE     LOGGING
----------------------- --------------------------- --------- -------
T_DBFS_BIG              LOB_SFS$_FST_1              NO        YES
T_DBFS_SM               LOB_SFS$_FST_11             YES       YES

As the user root, now create the DBFS mount point on ALL nodes of the RAC cluster (or Exadata compute servers).

# cd /mnt 
# mkdir DBFS 
# chown oracle:oinstall DBFS/

Create a custom tnsnames.ora file in a separate location (on each node of the RAC cluster).

In our 2 node RAC cluster for example these are entries we will make for the ORCL RAC database.

Node A

orcl =
  (DESCRIPTION =
      (ADDRESS =
        (PROTOCOL=BEQ)
        (PROGRAM=/u02/app/oracle/product/12.1.0/dbhome_1/bin/oracle)
        (ARGV0=oracleorcl1)
        (ARGS='(DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=BEQ)))')
        (ENVS='ORACLE_HOME=/u02/app/oracle/product/12.1.0/dbhome_1,ORACLE_SID=orcl1')
      )
  (CONNECT_DATA=(SID=orcl1))
)

Node B

orcl =
  (DESCRIPTION =
      (ADDRESS =
        (PROTOCOL=BEQ)
        (PROGRAM=/u02/app/oracle/product/12.1.0/dbhome_1/bin/oracle)
        (ARGV0=oracleorcl2)
        (ARGS='(DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=BEQ)))')
        (ENVS='ORACLE_HOME=/u02/app/oracle/product/12.1.0/dbhome_1,ORACLE_SID=orcl2')
      )
  (CONNECT_DATA=(SID=orcl2))
)

We will need to provide the password for the DBFS_USER database user account when we mount the DBFS filesystem via the dbfs_mount command. We can either store the password in a text file or we can use Oracle Wallet to encrypt and store the password.

In this example we are not using the Oracle Wallet, so we need to create a file (on all nodes of the RAC cluster) which will contain the DBFS_USER password.

For example:

echo dbfs_pswd > passwd.txt 

nohup $ORACLE_HOME/bin/dbfs_client dbfs_user@orcl -o allow_other,direct_io /mnt/DBFS < ~/passwd.txt &

After the DBFS filesystem is mounted successfully we can now see it via the ‘df’ command like shown below. Note in this case we had created a tablespace of 5 GB for DBFS and the space allocated and used displays that.

$  df -h |grep dbfs

dbfs-dbfs_user@:/     4.9G   11M  4.9G   1% /mnt/dbfs

The command used to unmount the DBFS filesystem would be:

fusermount -u

Create links from Oracle Goldengate software directories to DBFS

Create the following directories on DBFS

$ mkdir /mnt/gg_source/goldengate 
$ cd /mnt/gg_source/goldengate 
$ mkdir dirchk
$ mkdir dirpcs 
$ mkdir dirprm
$ mkdir dirdat
$ mkdir BR

Make the symbolic links from Goldengate software directories to DBFS


cd /u03/app/oracle/goldengate

Move the old directories to a .old version
mv dirchk dirchk.old
mv dirdat dirdat.old
mv dirpcs dirpcs.old
mv dirprm dirprm.old
mv BR BR.old

ln -s /mnt/dbfs/gg_source/goldengate/dirchk dirchk
ln -s /mnt/dbfs/gg_source/goldengate/dirdat dirdat
ln -s /mnt/dbfs/gg_source/goldengate/dirprm dirprm
ln -s /mnt/dbfs/gg_source/goldengate/dirpcs dirpcs
ln -s /mnt/dbfs/gg_source/goldengate/BR BR

For example :

[oracle@rac2 goldengate]$ ls -l dirdat
lrwxrwxrwx 1 oracle oinstall 26 May 16 15:53 dirdat -> /mnt/dbfs/gg_source/goldengate/dirdat

Also copy the jagent.prm file which comes out of the box located in the dirprm directory

[oracle@rac2 dirprm.old]$ pwd
/u03/app/oracle/goldengate/dirprm.old
[oracle@rac2 dirprm.old]$ cp jagent.prm /mnt/dbfs/gg_source/dirprm

Note – in the Extract parameter file(s) we need to include the BR parameter pointing to the DBFS stored directory

BR BRDIR /mnt/dbfs/gg_source/goldengate/BR
Create the Application VIP

Typically the Goldengate source and target databases will be located outside the same Exadata machine and even in a non-Exadata RAC environment the source and target databases are on usually on different RAC clusters. In that case we have to use an Application VIP which is a cluster resource managed by Oracle Clusterware and the VIP assigned to one node will be seamlessly transferred to another surviving node in the event of a RAC (or Exadata compute) node failure.

Run the appvipcfg command to create the Application VIP as shown in the example below.

$GRID_HOME/bin/appvipcfg create -network=1 -ip= 192.168.56.90 -vipname=gg_vip_source -user=root

We have to assign an unused IP address to the Application VIP. We run the following command to identify the value we use for the network parameter as well as the subnet for the VIP.

$ crsctl stat res -p |grep -ie .network -ie subnet |grep -ie name -ie subnet

NAME=ora.net1.network
USR_ORA_SUBNET=192.168.56.0

As root give the Oracle Database software owner permissions to start the VIP.

$GRID_HOME/bin/crsctl setperm resource gg_vip_source -u user:oracle:r-x

As the Oracle database software owner start the VIP

$GRID_HOME/bin/crsctl start resource gg_vip_source

Verify the status of the Application VIP

$GRID_HOME/bin/crsctl status resource gg_vip_source

Download the mount-dbfs.sh script from MOS

Download the mount-dbfs.sh script from MOS note 1054431.1.

Copy it to a temporary location on one of the Linux RAC nodes and run the command as root:

# dos2unix /tmp/mount-dbfs.sh

Change the ownership of the file to the Oracle Grid Infrastructure owner and also copy the file to the $GRID_HOME/crs/script directory location.

Next make changes to the environment variable settings section of the mouny-dbfs.sh script as required. These are the changes I made to the script.

### Database name for the DBFS repository as used in "srvctl status database -d $DBNAME"
DBNAME=orcl

### Mount point where DBFS should be mounted
MOUNT_POINT=/mnt/dbfs

### Username of the DBFS repository owner in database $DBNAME
DBFS_USER=dbfs_user

### RDBMS ORACLE_HOME directory path
ORACLE_HOME=/u02/app/oracle/product/12.1.0/dbhome_1

### This is the plain text password for the DBFS_USER user
DBFS_PASSWD=dbfs_user

### TNS_ADMIN is the directory containing tnsnames.ora and sqlnet.ora used by DBFS
TNS_ADMIN=/u02/app/oracle/admin

### TNS alias used for mounting with wallets
DBFS_LOCAL_TNSALIAS=orcl

Create the DBFS Cluster Resource

Before creating the Cluster Resource for DBFS,test the mount-dbfs.sh script

$ ./mount-dbfs.sh start
$ ./mount-dbfs.sh status
Checking status now
Check – ONLINE

$ ./mount-dbfs.sh stop

As the Grid Infrastructure owner create a script called add-dbfs-resource.sh and store it in the $ORACLE_HOME/crs/script directory.

This script will create a Cluster Managed Resource called dbfs_mount by calling the Action Script mount-dbfs.sh which we had created earlier.

Edit the following variables in the script as shown below:

ACTION_SCRIPT
RESNAME
DEPNAME ( this can be the Oracle database or a database service)
ORACLE_HOME

#!/bin/bash
ACTION_SCRIPT=/u02/app/12.1.0/grid/crs/script/mount-dbfs.sh
RESNAME=dbfs_mount
DEPNAME=ora.orcl.db
ORACLE_HOME=/u01/app/12.1.0.2/grid
PATH=$ORACLE_HOME/bin:$PATH
export PATH ORACLE_HOME
crsctl add resource $RESNAME \
-type cluster_resource \
-attr "ACTION_SCRIPT=$ACTION_SCRIPT, \
CHECK_INTERVAL=30,RESTART_ATTEMPTS=10, \
START_DEPENDENCIES='hard($DEPNAME)pullup($DEPNAME)',\
STOP_DEPENDENCIES='hard($DEPNAME)',\
SCRIPT_TIMEOUT=300"

Execute the script – it should produce no output.

./ add-dbfs-resource.sh

Download and Install the Oracle Grid Infrastructure Bundled Agent

Starting with Oracle 11.2.0.3 on 64-bit Linux,out-of-the-box Oracle Grid Infrastructure bundled agents were introduced which had predefined clusterware resources for applications like Siebel and Goldengate.

The bundled agent for Goldengate provided integration between Oracle Goldengate and dependent resources like the database, filesystem and the network.

The AGCTL agent command line utility can be used to start and stop Goldengate as well as relocate Goldengate resources between nodes in the cluster.

Download the latest version of the agent (6.1) from the URL below:

http://www.oracle.com/technetwork/database/database-technologies/clusterware/downloads/index.html

The downloaded file will be xagpack_6.zip.

There is an xag/bin directory with the agctl executable already existing in the $GRID_HOME root directory. We need to install the new bundled agent in a separate directory and ensure the $PATH includes [{–nodes <node1,node2[,...]> | –all_nodes}]</node1,node2[,...]>

Register Goldengate with the bundled agents using agctl utility

Using agctl utility create the GoldenGate configuration.

Ensure that we are running agctl from the downloaded bundled agent directory and not from the $GRID_HOME/xag/bin directory or ensure that the $PATH variable has been amended as described earlier.

/home/oracle/xagent/bin/agctl add goldengate gg_source --gg_home /u03/app/oracle/goldengate \
--instance_type source \
--nodes rac1,rac2 \
--vip_name gg_vip_source \
--filesystems dbfs_mount --databases ora.orcl.db \
--oracle_home /u02/app/oracle/product/12.1.0/dbhome_1 \
--monitor_extracts ext1,extdp1

Once GoldenGate is registered with the bundled agent, we should only use agctl to start and stop Goldengate processes. The agctl command will start the Manager process which in turn will start the other processes like Extract, Data Pump and Replicat if we have configured them for automatic restart.

Let us look at some examples of using agctl.

Check the Status – note the DBFS filesystem is also mounted currently on node rac2

$ pwd
/home/oracle/xagent/bin
$ ./agctl status goldengate gg_source
Goldengate  instance 'gg_source' is running on rac2


$ cd /mnt/dbfs/
$ ls -lrt
total 0
drwxrwxrwx 9 root root 0 May 16 15:37 gg_source

Stop the Goldengate environment

$ ./agctl stop goldengate gg_source 
$ ./agctl status goldengate gg_source
Goldengate  instance ' gg_source ' is not running

GGSCI (rac2.localdomain) 1> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     STOPPED
EXTRACT     STOPPED     EXT1        00:00:03      00:01:19
EXTRACT     STOPPED     EXTDP1      00:00:00      00:01:18

Start the Goldengate environment – note the resource has relocated to node rac1 from rac2 and the Goldengate processes on rac2 have been stopped and started on node rac1.

$ ./agctl start goldengate gg_source
$ ./agctl status goldengate gg_source
Goldengate  instance 'gg_source' is running on rac1

GGSCI (rac2.localdomain) 2> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     STOPPED


GGSCI (rac1.localdomain) 1> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
EXTRACT     RUNNING     EXT1        00:00:09      00:00:06
EXTRACT     RUNNING     EXTDP1      00:00:00      00:05:22

We can also see that the agctl has unmounted DBFS on rac2 and mounted it on rac1 automatically.
[oracle@rac1 goldengate]$ ls -l /mnt/dbfs
total 0
drwxrwxrwx 9 root root 0 May 16 15:37 gg_source

[oracle@rac2 goldengate]$ ls -l /mnt/dbfs
total 0

Lets test the whole thing!!

Now that we see that the Goldengate resources are running on node rac1,let us see what happens when we reboot that node to simulate a node failure when Goldengate is up and running and the Extract and Data Pump processes are running on the source.

AGCTL and Cluster Services will relocate all the Goldengate resources, VIP, DBFS to the other node seamlessly and we see that the Extract and Data Pump processes have been automatically started up on node rac2.

[oracle@rac1 goldengate]$ su -
Password:
[root@rac1 ~]# shutdown -h now

Broadcast message from oracle@rac1.localdomain
[root@rac1 ~]#  (/dev/pts/0) at 19:45 ...

The system is going down for halt NOW!

Connect to the surviving node rac2 and check ……

[oracle@rac2 bin]$ ./agctl status goldengate gg_source
Goldengate  instance 'gg_source' is running on rac2

GGSCI (rac2.localdomain) 1> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
EXTRACT     RUNNING     EXT1        00:00:07      00:00:02
EXTRACT     RUNNING     EXTDP1      00:00:00      00:00:08

Check the Cluster Resource ….

oracle@rac2 bin]$ crsctl stat res dbfs_mount -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
dbfs_mount
      1        ONLINE  ONLINE       rac2                     STABLE
--------------------------------------------------------------------------------

ITRemote

Ready for Action?