Oracle Clusterware Architecture

CLUSTERWARE ARCHITECTURE AND PROCESSES

Oracle Clusterware provides a mechanism to bind multiple servers together to make databases more highly available. Besides
being the cluster manager for Oracle RAC environments in Oracle Database 10g Release 1 and higher, it is also capable of
providing database failover capabilities as a standalone product. In order to troubleshoot Oracle Clusterware, you must first
understand how it works. While some understanding of Oracle Clusterware is assumed, we'll go over a few of the components
and their roles in this section.
Oracle Clusterware has certain hardware and software requirements: shared storage, redundant public network interfaces, redundant private network interfaces, and the same operating system platform and version on all
nodes in the cluster.

Architecture and Infrastructure
In order to carry out its responsibilities, Oracle Clusterware employs a small repository of metadata called the Oracle Cluster
Registry (OCR) and structures called voting disks. In Oracle Clusterware 10g Release 2 and higher, the installer
offers redundant configuration options for each of these components (up to 2 OCRs and 3 voting disks). The OCR
holds information about cluster members, databases and services they offer, as well as other cluster resources like VIPs,
listeners, and ONS processes. The OCR is also where Clusterware tracks the current status of each of these components. The
voting disks hold information about current cluster membership. In the event of a partial cluster failure (for example, if the
interconnect fails), the voting disks help the cluster determine which of the nodes will survive and remain in the cluster.
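
The role of the voting disks in deciding survival can be illustrated with a small sketch. It assumes the commonly documented rule that a node must be able to access a strict majority (more than half) of the configured voting disks to remain in the cluster. The function name and parameters are hypothetical and exist only for illustration; this is not Oracle's implementation.

```python
def has_voting_majority(accessible: int, configured: int) -> bool:
    """Return True if a node can reach more than half of the voting disks.

    Illustrative only: assumes the strict-majority rule described in the
    lead-in; the names here are hypothetical, not an Oracle API.
    """
    if configured <= 0:
        raise ValueError("configured must be a positive number of voting disks")
    if not 0 <= accessible <= configured:
        raise ValueError("accessible must be between 0 and configured")
    # Strict majority: more than half, so a 1-of-2 split does not qualify.
    return accessible * 2 > configured
```

Under this assumption, a node that loses one of three voting disks still holds a majority (2 of 3), whereas with only two voting disks the loss of either one leaves no majority. That is one way to see why three is a more useful count than two.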

CSS, CRS, EVM
To carry out its mission, Oracle Clusterware has a number of background processes. You will observe these processes alive
and active on any Oracle Cluster. Each process has specific responsibilities, as noted briefly here.
Process                        Description
crsd                           Performs OCR maintenance and manages application resources; runs as the root user.
evmd                           Event Manager detects cluster disruptions and performs Oracle Clusterware callouts.
ocssd                          Manages cluster membership; runs as the Oracle software owner (usually “oracle”);
                               if this process fails, the node is restarted.
oprocd                         Process Monitor ensures that other processes are running and performs appropriate
                               actions (per the cluster configuration) if they fail; this process does not exist
                               when Oracle Clusterware is integrated with another (“3rd party”) clusterware; this
                               process is called OraFenceService on Windows.
Oracle Process Manager Daemon  On Windows, this process is a dependency for all Clusterware processes and provides
                               the necessary delay for other Windows services that may be necessary for Oracle
                               Clusterware (like OCFS) to start and become active. This process does not exist on
                               Linux or UNIX systems.
racg                           The racg process is used to start, stop, and monitor some of the Oracle Clusterware
                               resources like gsd, ons, vip, and other “built-in” resources.

On Linux and UNIX systems, these processes are individual processes with separate process IDs. On Windows systems, the
processes are slightly different, but generally follow the pattern Ora<ProcessName>Service except for oprocd. As noted
above, oprocd on Windows is known instead as OraFenceService.
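
Because each daemon is a separate process on Linux and UNIX, a quick health check is to look for them in the process list. The following is a minimal sketch, not an Oracle-supplied tool. It assumes the daemons appear under their base names, optionally with a suffix such as ".bin" (for example crsd.bin), which is an assumption about how they show up in ps output. oprocd is left out of the default list because it does not exist when third-party clusterware is in use. The function names are hypothetical.

```python
import subprocess

# Daemons expected on a node; names taken from the process list above.
EXPECTED_DAEMONS = ("crsd", "evmd", "ocssd")


def find_running_daemons(ps_output: str, expected=EXPECTED_DAEMONS) -> dict:
    """Map each expected daemon name to the PIDs whose command matches it.

    ps_output is text with one "<pid> <command>" pair per line.
    A command matches if its basename equals the daemon name or starts
    with the name followed by a dot (assumed form, e.g. "crsd.bin").
    """
    found = {name: [] for name in expected}
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # skip blank, header, or malformed lines
        pid = int(parts[0])
        base = parts[1].strip().split("/")[-1]  # drop any directory prefix
        for name in expected:
            if base == name or base.startswith(name + "."):
                found[name].append(pid)
    return found


def list_processes() -> str:
    """Return "<pid> <command>" lines for all processes using POSIX ps."""
    result = subprocess.run(
        ["ps", "-e", "-o", "pid=", "-o", "comm="],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling find_running_daemons(list_processes()) on a cluster node returns a dictionary; any daemon mapped to an empty list was not found and is worth investigating.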

Typically, the most interesting logging information comes from the crsd and ocssd processes. On Linux/UNIX systems, the
logfiles are placed in $ORA_CRS_HOME/log/<nodename>/<processname>. For example, the crsd logs for node1
would be found in $ORA_CRS_HOME/log/node1/crsd.
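
The path pattern above is easy to build in a script. This is a minimal sketch; the function name and the example home directory /u01/crs are hypothetical, and only the log/<nodename>/<processname> layout comes from the text.

```python
from pathlib import PurePosixPath


def clusterware_log_dir(crs_home: str, nodename: str, processname: str) -> PurePosixPath:
    """Build $ORA_CRS_HOME/log/<nodename>/<processname> for Linux/UNIX.

    crs_home is the value of ORA_CRS_HOME; it is passed in explicitly so
    the function does not depend on the caller's environment.
    """
    return PurePosixPath(crs_home) / "log" / nodename / processname


# Example with a hypothetical ORA_CRS_HOME of /u01/crs:
print(clusterware_log_dir("/u01/crs", "node1", "crsd"))
```

In practice the first argument would usually come from os.environ["ORA_CRS_HOME"] on the node being examined.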
