In this blog post I will talk about Oracle Clusterware basics. It is essential to know the basics before we can start talking about Clusterware installation, configuration, and administration. The information below is an extract from the Oracle Clusterware Administration and Deployment Guide 11g Release 2 (11.2) (http://docs.oracle.com/cd/E14072_01/rac.112/e10717.pdf).
What is Oracle Clusterware?
Oracle Clusterware enables servers to communicate with each other, so that they
appear to function as a collective unit. This combination of servers is commonly
known as a cluster. Although the servers are standalone servers, each server has
additional processes that communicate with other servers. In this way, the separate
servers appear to applications and end users as one system.
Oracle Clusterware provides the infrastructure necessary to run Oracle Real
Application Clusters (Oracle RAC). Oracle Clusterware also manages resources, such
as virtual IP (VIP) addresses, databases, listeners, services, and so on. These resources
are generally named ora.resource_name.host_name. Oracle does not support
editing these resources except under the explicit direction of Oracle support.
Additionally, Oracle Clusterware can help you manage your applications.
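As a quick illustration of the ora.resource_name.host_name naming convention, the small sketch below (plain Python, not an Oracle tool; the sample name is hypothetical) splits a resource name into its parts:

```python
# Sketch: split a Clusterware resource name of the form
# ora.<resource>.<suffix> into its components.
# The sample name below is hypothetical, not from a real cluster.

def parse_resource_name(name):
    """Split an 'ora.<resource>.<suffix>' name into its components."""
    parts = name.split(".")
    if len(parts) < 3 or parts[0] != "ora":
        raise ValueError("not an ora.* resource name: " + name)
    return {"prefix": parts[0],
            "resource": parts[1],
            "suffix": ".".join(parts[2:])}

print(parse_resource_name("ora.node1.vip"))
# {'prefix': 'ora', 'resource': 'node1', 'suffix': 'vip'}
```

Remember that these resources are managed by Clusterware itself; as the guide notes, you should not edit them except under the explicit direction of Oracle Support.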
Oracle Clusterware has two stored components, besides the binaries: The voting disk
files, which record node membership information, and the Oracle Cluster Registry
(OCR), which records cluster configuration information. Voting disks and OCRs must
reside on shared storage available to all cluster member nodes.
Oracle Clusterware uses voting disk files to provide fencing and cluster node
membership determination. The OCR provides cluster configuration information. You can place the Oracle Clusterware files on either Oracle ASM or on shared common
disk storage. If you configure Oracle Clusterware on storage that does not provide file
redundancy, then Oracle recommends that you configure multiple locations for OCR
and voting disks. The voting disks and OCR are described as follows:
■ Voting Disks
Oracle Clusterware uses voting disk files to determine which nodes are members
of a cluster. You can configure voting disks on Oracle ASM, or you can configure
voting disks on shared storage.
If you configure voting disks on Oracle ASM, then you do not need to manually
configure the voting disks. Depending on the redundancy of your disk group, an
appropriate number of voting disks are created.
If you do not configure voting disks on Oracle ASM, then for high availability,
Oracle recommends that you have a minimum of three voting disks on physically
separate storage. This avoids having a single point of failure. If you configure a
single voting disk, then you must use external mirroring to provide redundancy.
You should have at least three voting disks, unless you have a storage device, such
as a disk array that provides external redundancy. Oracle recommends that you do
not use more than five voting disks. The maximum number of voting disks that is
supported is 15.
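To see why three or five voting disks are recommended, note that a node must be able to access a strict majority (more than half) of the configured voting disks to remain in the cluster. A back-of-the-envelope sketch (plain Python, not an Oracle utility) shows how many voting-disk failures each configuration survives:

```python
# Sketch: how many voting-disk failures a cluster survives, assuming a
# node must access a strict majority (more than half) of the voting disks.

def tolerated_failures(voting_disks):
    majority = voting_disks // 2 + 1   # disks a node must still reach
    return voting_disks - majority     # disks that may be lost

for n in (1, 3, 5):
    print(n, "voting disk(s) -> survives", tolerated_failures(n), "failure(s)")
# 1 voting disk(s) -> survives 0 failure(s)   (single point of failure)
# 3 voting disk(s) -> survives 1 failure(s)
# 5 voting disk(s) -> survives 2 failure(s)
```

This is also why even numbers buy nothing: four disks still tolerate only one failure, the same as three, while adding another disk to keep in sync.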
■ Oracle Cluster Registry
Oracle Clusterware uses the Oracle Cluster Registry (OCR) to store and manage
information about the components that Oracle Clusterware controls, such as
Oracle RAC databases, listeners, virtual IP addresses (VIPs), and services and any
applications. The OCR stores configuration information in a series of key-value
pairs in a tree structure. To ensure cluster high availability, Oracle recommends
that you define multiple OCR locations (multiplex). In addition:
– You can have up to five OCR locations
– Each OCR location must reside on shared storage that is accessible by all of the
nodes in the cluster
– You can replace a failed OCR location online if it is not the only OCR location
– You must update the OCR through supported utilities such as Oracle
Enterprise Manager, the Server Control Utility (SRVCTL), the OCR
configuration utility (OCRCONFIG), or the Database Configuration Assistant
(DBCA)
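The OCR's key-value tree can be pictured as nested keys, much like a registry. The sketch below (plain Python; the key names are invented for illustration and are not actual OCR keys) models how dotted key paths map into a tree of key-value pairs:

```python
# Illustrative model of a key-value tree like the one the OCR uses.
# The key names here are made up for the example, not real OCR keys.

def set_key(tree, path, value):
    """Store value under a dotted key path, e.g. 'SYSTEM.example.timeout'."""
    keys = path.split(".")
    for k in keys[:-1]:
        tree = tree.setdefault(k, {})   # descend, creating branches as needed
    tree[keys[-1]] = value

def get_key(tree, path):
    """Walk the tree along a dotted key path and return the stored value."""
    for k in path.split("."):
        tree = tree[k]
    return tree

ocr = {}
set_key(ocr, "SYSTEM.example.timeout", 30)      # hypothetical key
set_key(ocr, "DATABASE.example.enabled", True)  # hypothetical key
print(get_key(ocr, "SYSTEM.example.timeout"))   # 30
```

In the real OCR you never touch this structure directly; as listed above, all updates go through supported utilities such as SRVCTL, OCRCONFIG, Oracle Enterprise Manager, or DBCA.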
Oracle Clusterware Processes on Linux and UNIX Systems
Oracle Clusterware processes on Linux and UNIX systems include the following:
■ crsd: Performs high availability recovery and management operations such as
maintaining the OCR and managing application resources. This grid infrastructure
process runs as root and restarts automatically upon failure.
When you install Oracle Clusterware in a single-instance database environment
for Oracle ASM and Oracle Restart, ohasd manages application resources and
crsd is not used.
■ cssdagent: Starts, stops, and checks the status of the CSS daemon, ocssd. In
addition, the cssdagent and cssdmonitor provide the following services to
guarantee data integrity:
– Monitors the CSS daemon; if the CSS daemon stops, then it shuts down the
node
– Monitors the node scheduling to verify that the node is not hung, and shuts
down the node on recovery from a hang.
■ oclskd (Oracle Clusterware Kill daemon): CSS uses this daemon to stop
processes associated with CSS group members for which stop requests have come
in from other members on remote nodes.
■ ctssd (Cluster Time Synchronization Service daemon): Synchronizes the time on
all of the nodes in a cluster to match the time setting on the master node, but
not to an external clock.
■ diskmon (Disk Monitor daemon): Monitors and performs I/O fencing for HP
Oracle Exadata Storage Server storage. Because Exadata storage can be added to
any Oracle RAC node at any time, the diskmon daemon is always started when
ocssd starts.
■ evmd (Event manager daemon): Distributes and communicates some cluster
events to all of the cluster members so that they are aware of changes in the
cluster.
■ evmlogger (Event Manager logger): Started by evmd at startup, it reads a
configuration file to determine which events to subscribe to from evmd and
runs user-defined actions for those events. This facility is maintained for
backward compatibility only.
■ gpnpd (Grid Plug and Play daemon): Manages distribution and maintenance of
the Grid Plug and Play profile containing cluster definition data.
■ mdnsd (Multicast Domain Name Service daemon): Manages name resolution and
service discovery within attached subnets.
■ ocssd (Cluster Synchronization Service daemon): Manages cluster node
membership and runs as the oracle user; failure of this process results in a node
restart.
■ ohasd (Oracle High Availability Services daemon): Starts Oracle Clusterware
processes and also manages the OLR and acts as the OLR server.
In a cluster, ohasd runs as root. However, in an Oracle Restart environment,
where ohasd manages application resources, it runs as the oracle user.
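A practical way to see which of these daemons are present on a Linux node is to scan the process list for their names. The sketch below (plain Python, standard library only) does this; it matches on substrings because the installed binary names can differ slightly from the daemon names above (e.g. ocssd.bin), and on a machine without Oracle Clusterware it simply returns an empty list:

```python
import subprocess

# Daemon names taken from the list above.
CLUSTERWARE_DAEMONS = ["crsd", "ocssd", "cssdagent", "ctssd", "diskmon",
                       "evmd", "gpnpd", "mdnsd", "ohasd", "oclskd"]

def running_clusterware_daemons():
    """Return the subset of known Clusterware daemons found in `ps` output."""
    ps = subprocess.run(["ps", "-e", "-o", "comm="],
                        capture_output=True, text=True).stdout
    names = ps.split()
    # Substring match: the on-disk binaries are often named e.g. 'ocssd.bin'.
    return [d for d in CLUSTERWARE_DAEMONS
            if any(d in comm for comm in names)]

print(running_clusterware_daemons())  # [] on a machine without Clusterware
```

This is only a convenience for eyeballing a node; the supported way to check Clusterware status is through the cluster control utilities, which the next post will cover.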
Next blog -> Basic Clusterware Administration Commands
Thanks,
Alfred