In Oracle Real Application Clusters (RAC), maintaining cluster integrity and node synchronization is critical for high availability. One of the most important components that ensures this stability is the Voting Disk.
Voting disks determine which nodes are part of the cluster and help prevent split-brain situations. Oracle Clusterware uses them to establish node membership and to resolve cluster communication failures automatically, so understanding how they work is essential for every Oracle DBA managing RAC environments.
What is a Voting Disk?
A Voting Disk is a shared disk that stores information about cluster node membership. Each node in the RAC cluster must be able to access the voting disk(s).
- Maintains cluster node membership information
- Ensures cluster consistency
- Helps in node eviction decisions
How Voting Disk Works
Each node in the cluster continuously writes a heartbeat to the voting disk(s), in addition to the network heartbeat it exchanges with the other nodes over the private interconnect. This mechanism allows the cluster to determine whether a node is alive or not.
- If a node stops sending heartbeats, it is considered failed
- The cluster reconfigures automatically
- The failed node is evicted to maintain integrity
This process prevents split-brain scenarios, where two sets of nodes might try to operate independently.
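The heartbeat check described above can be sketched in a few lines of Python. This is only an illustrative model, not how CSSD is implemented: the function name `find_failed_nodes`, the node names, the timestamps, and the 30-second timeout are all assumptions chosen for the example.

```python
from typing import Dict, List


def find_failed_nodes(last_heartbeat: Dict[str, float], now: float, timeout: float) -> List[str]:
    """Return nodes whose most recent heartbeat is older than `timeout` seconds."""
    return sorted(
        node
        for node, seen_at in last_heartbeat.items()
        if now - seen_at > timeout
    )


# Hypothetical timestamps in seconds: node2 last wrote a heartbeat 45 s ago.
last_heartbeat = {"node1": 99.0, "node2": 55.0, "node3": 98.5}

# The 30 s timeout is an illustrative value, not a statement about any cluster's settings.
failed = find_failed_nodes(last_heartbeat, now=100.0, timeout=30.0)
print(failed)  # ['node2']
```

A node that stays silent longer than the timeout is flagged as failed, which is the point at which the cluster would start reconfiguration and eviction.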
Why Voting Disks Are Always Odd in Number
Oracle recommends configuring an odd number of voting disks to ensure proper quorum and avoid tie situations.
Cluster decisions are based on majority voting: a node must be able to access more than half of the configured voting disks. If the number of voting disks is even, a split can leave each side with exactly half, so neither side holds a majority.
Example:
2 Voting Disks:
- Node A sees 1 disk
- Node B sees 1 disk
- Neither node reaches the required majority of 2, so no clear survivor can be chosen
3 Voting Disks:
- Majority is 2 disks
- Only one side of a split can hold 2 disks, so the decision is clear
Using an odd number ensures that the cluster can always determine a clear majority.
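The majority arithmetic is easy to check directly. The sketch below assumes the rule stated above (a majority is strictly more than half of the configured disks); the helper names `majority` and `tolerated_failures` are made up for this illustration.

```python
def majority(total_disks: int) -> int:
    """Smallest number of disks that is strictly more than half."""
    return total_disks // 2 + 1


def tolerated_failures(total_disks: int) -> int:
    """How many disks can be lost while a majority is still reachable."""
    return total_disks - majority(total_disks)


for n in range(1, 6):
    print(f"{n} disk(s): majority={majority(n)}, tolerated failures={tolerated_failures(n)}")
```

Under this rule, 2 disks tolerate no failures (the same as 1) and 4 disks tolerate one failure (the same as 3), which is why an even count adds cost without adding fault tolerance.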
How Voting Disks Help in Node Eviction
Voting disks play a critical role in deciding which node should remain in the cluster during communication failures.
- Each node tries to access the voting disks
- A node that can access a majority of the voting disks survives
- A node that loses quorum is evicted
Example Scenario:
3 Voting Disks, majority is 2
Node A can access 2 disks
Node B can access 1 disk
Node A survives and Node B is evicted
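The scenario above can be modeled as a small function. This is a simplified sketch that applies only the disk-majority rule; real Clusterware also weighs network heartbeats and other factors. The name `resolve_membership` and the disk numbering are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def resolve_membership(total_disks: int, accessible: Dict[str, Set[int]]) -> Tuple[List[str], List[str]]:
    """Split nodes into (survivors, evicted) using only the disk-majority rule."""
    needed = total_disks // 2 + 1  # strictly more than half
    survivors = sorted(node for node, disks in accessible.items() if len(disks) >= needed)
    evicted = sorted(node for node in accessible if node not in survivors)
    return survivors, evicted


# Scenario from the text: 3 voting disks, Node A reaches 2, Node B reaches 1.
survivors, evicted = resolve_membership(3, {"Node A": {1, 2}, "Node B": {3}})
print("survivors:", survivors)  # survivors: ['Node A']
print("evicted:", evicted)      # evicted: ['Node B']
```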
Real Node Eviction Logs (CSSD Logs)
Below is a real example of node eviction captured from Cluster Synchronization Services Daemon (CSSD) logs. These logs help DBAs identify the exact reason for eviction.
2024-02-10 10:15:32.123: [ CSSD]clssnmvDHBValidateNCopy: node 2 is missing heartbeat
2024-02-10 10:15:32.124: [ CSSD]clssnmvDiskPingMonitorThread: node 2 not responding, network issue suspected
2024-02-10 10:15:33.130: [ CSSD]clssnmvKillNode: initiating node eviction for node 2
2024-02-10 10:15:33.131: [ CSSD]clssnmvKillNode: node 2 evicted from cluster
2024-02-10 10:15:34.200: [ CSSD]clssnmvClusterReconfig: cluster reconfiguration started
2024-02-10 10:15:36.450: [ CSSD]clssnmvClusterReconfig: cluster reconfiguration completed
In the above logs, node eviction occurred due to missing heartbeat signals. CSSD detected the failure and removed the node to maintain cluster stability.
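When a log file is long, it can help to filter it for eviction-related entries. The following is a minimal sketch that assumes every line follows the layout of the excerpt above (timestamp, bracketed component, function name, message); actual CSSD log formats vary between versions, and the `CssdEntry` type and `parse_line` helper are names invented for this example.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout: "<timestamp>: [ CSSD]<function>: <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}): "
    r"\[\s*(?P<component>\w+)\](?P<function>\w+): (?P<message>.*)$"
)


class CssdEntry(NamedTuple):
    timestamp: str
    component: str
    function: str
    message: str


def parse_line(line: str) -> Optional[CssdEntry]:
    """Parse one log line; return None if it does not match the assumed layout."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return CssdEntry(m.group("ts"), m.group("component"), m.group("function"), m.group("message"))


# Two lines copied from the excerpt above.
sample = [
    "2024-02-10 10:15:32.123: [ CSSD]clssnmvDHBValidateNCopy: node 2 is missing heartbeat",
    "2024-02-10 10:15:33.130: [ CSSD]clssnmvKillNode: initiating node eviction for node 2",
]

entries = [e for e in map(parse_line, sample) if e is not None]
evictions = [e for e in entries if "evict" in e.message.lower()]
for e in evictions:
    print(e.timestamp, e.function, e.message)
```

Lines that do not match the pattern are skipped rather than raising an error, so the same filter can be run over a mixed log file.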
Number of Voting Disks
- 1 voting disk provides no redundancy
- 3 voting disks are recommended for production and tolerate the loss of 1 disk
- 5 voting disks provide higher fault tolerance and tolerate the loss of 2 disks
The cluster remains operational as long as a majority of the voting disks is accessible.
Voting Disk Failure Scenarios
Single Voting Disk Failure
If only one voting disk exists and it fails, the cluster will stop functioning.
Multiple Voting Disk Setup
- The cluster continues if a majority of the voting disks is available
- The cluster fails if quorum is lost
In a 3 voting disk setup, at least 2 disks must be available for cluster survival.
Checking Voting Disk Status
crsctl query css votedisk
This command displays all configured voting disks and their status.
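For routine monitoring, the command output can be summarized by a script. The sketch below is an assumption-heavy illustration: the sample text imitates the general layout of the command's output, but the file IDs, paths, and disk group name are made up, and the real layout may differ by version. The helpers `read_votedisk_output` and `summarize` are hypothetical names.

```python
import re
import subprocess
from collections import Counter

# Assumed row layout: " 1. ONLINE   <file id> (<path>) [<disk group>]"
ROW_RE = re.compile(r"^\s*\d+\.\s+(?P<state>[A-Z_]+)\s")


def read_votedisk_output() -> str:
    """Run the command from the text; only works on a node with Grid Infrastructure."""
    result = subprocess.run(
        ["crsctl", "query", "css", "votedisk"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def summarize(output: str) -> dict:
    """Count voting disks by state and check whether a majority is ONLINE."""
    states = Counter()
    for line in output.splitlines():
        m = ROW_RE.match(line)
        if m:
            states[m.group("state")] += 1
    total = sum(states.values())
    online = states["ONLINE"]
    return {"total": total, "online": online, "has_majority": online >= total // 2 + 1}


# Made-up sample output for illustration only.
SAMPLE = """\
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   11111111111111111111111111111111 (/dev/example/disk1) [DATA]
 2. ONLINE   22222222222222222222222222222222 (/dev/example/disk2) [DATA]
 3. OFFLINE  33333333333333333333333333333333 (/dev/example/disk3) [DATA]
Located 3 voting disk(s).
"""

print(summarize(SAMPLE))
```

On a cluster node, `summarize(read_votedisk_output())` would apply the same check to live output, provided the row layout matches the assumed pattern.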
Best Practices for Voting Disks
- Use at least 3 voting disks in production
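- Place voting disks on independent storage; when they are stored in ASM, all voting files live in a single disk group, so choose a redundancy level with enough failure groups (normal redundancy keeps 3 voting files, high redundancy keeps 5)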
- Avoid single point of failure
- Monitor voting disk health regularly
Performance and Architecture Considerations
- Voting disks do not store user data
- They are critical for cluster coordination
- Should be placed on highly available storage
Key Takeaways
- The voting disk is essential for cluster membership management.
- An odd number of voting disks ensures proper quorum.
- Voting disks help drive automatic node eviction during failures.
- They maintain cluster stability and consistency.
Conclusion
Voting disks are a core component of Oracle RAC architecture. Proper configuration and a solid understanding of voting disk behavior ensure cluster stability, fault tolerance, and high availability.
Toufique Khan
