As a database professional, you might encounter communication issues with Cluster Ready Services (CRS) in your Oracle RAC environment. In this post, I'll walk you through a recent case where we resolved the "CRS-4535: Cannot communicate with Cluster Ready Services" error.
The Initial Problem
The output of crsctl check crs showed every component online except Cluster Ready Services:
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
Troubleshooting Steps
- Verify CRS Status
  - Run crsctl stat res -t
  - Run crsctl check crs
- Review CRS Logs
  - Check /Oracle/app/grid/diag/crs/srv1/crs/trace/crsd.log
- Confirm Clusterware Processes
  - Run ps -ef | grep -E 'crsd.bin|ocssd.bin|evmd.bin'
- Validate Cluster Components
  - Check voting disk: crsctl query css votedisk
  - Verify OCR: ocrcheck
- Network Configuration Check
  - Verify network interfaces: ifconfig -a
  - Test node connectivity with ping tests
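The first check in the list can be scripted so that a missing component stands out at a glance. The following is a minimal sketch, not an Oracle-provided tool: the function name crs_missing_components is hypothetical, and the four "online" message codes are simply the ones that appear in the healthy output shown in this post.

```shell
# Sketch: flag Clusterware components that did not report "online".
# ASSUMPTION: the "online" codes below are taken from this post's healthy
# output; crs_missing_components is a hypothetical helper name.

crs_missing_components() {
    # Reads `crsctl check crs` output on stdin and prints one line for
    # each component whose "is online" message code is absent.
    output=$(cat)
    for entry in "CRS-4638:Oracle High Availability Services" \
                 "CRS-4537:Cluster Ready Services" \
                 "CRS-4529:Cluster Synchronization Services" \
                 "CRS-4533:Event Manager"; do
        code=${entry%%:*}   # text before the first colon
        name=${entry#*:}    # text after the first colon
        if ! printf '%s\n' "$output" | grep -q "^${code}:"; then
            echo "NOT ONLINE: ${name} (expected ${code})"
        fi
    done
}

# Usage on a cluster node, with crsctl in PATH:
#   crsctl check crs | crs_missing_components
```

An empty result means all four components reported online. The sketch only checks for the presence of the expected codes, so it does not interpret any other message that crsctl may print.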
Root Cause and Solution
In our case, the culprit was firewalld: it was enabled and running on the node, and it was blocking communication between the Oracle Clusterware components. We confirmed its status, then stopped and disabled it:
[root@srv1 bin]# systemctl status firewalld
firewalld.service - firewalld - dynamic firewall daemon
Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled)
Active: active (running) since Tue 2025-03-11 10:15:20 IST ; 1h 12min ago
[root@srv1 bin]# systemctl stop firewalld
[root@srv1 bin]# systemctl disable firewalld
Successful Resolution
With the firewall stopped, crsctl check crs reported all four components online:
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
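To verify the services after a change like this, the status check can be polled instead of rerun by hand. This is a rough sketch under assumptions: wait_for_crs_online is a hypothetical helper, and the attempt count and delay in the usage line are arbitrary illustration values rather than Oracle recommendations. It looks only for the CRS-4537 code shown above.

```shell
# Sketch: poll a status command until Cluster Ready Services reports online.
# ASSUMPTION: wait_for_crs_online is a hypothetical helper; CRS-4537 is the
# "online" code taken from the output shown in this post.

wait_for_crs_online() {
    # $1 = max attempts, $2 = seconds between attempts, rest = status command
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@" 2>/dev/null | grep -q '^CRS-4537:'; then
            echo "Cluster Ready Services online after ${i} attempt(s)"
            return 0
        fi
        # Pause only when another attempt will follow.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    echo "Cluster Ready Services still not online after ${attempts} attempt(s)" >&2
    return 1
}

# Usage on a cluster node (10 attempts, 30 seconds apart):
#   wait_for_crs_online 10 30 crsctl check crs
```

The status command is passed in as arguments so the same loop can be pointed at a different check if needed.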
Key Takeaways
- Always check firewall settings when experiencing cluster communication issues
- Follow a systematic troubleshooting approach
- Verify all services after making changes
- Document the resolution for future reference
Note: Disabling the firewall resolved our issue, but in production environments you should consider configuring firewall rules that allow Clusterware traffic instead of disabling the firewall completely. Always follow your organization's security policies when making firewall changes.
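One possible alternative to disabling firewalld is to place the private interconnect interface in firewalld's trusted zone, so traffic on that interface is accepted while the firewall stays active elsewhere. The sketch below only prints the commands for review and does not run them. The helper name interconnect_firewall_cmds and the interface name eth1 are assumptions for illustration; this was not the fix applied in our case, and whether it is sufficient depends on your network layout and security policy.

```shell
# Sketch: print (not run) firewalld commands that trust the private
# interconnect interface instead of disabling the firewall.
# ASSUMPTION: interconnect_firewall_cmds is a hypothetical helper, and
# "eth1" in the usage line is a placeholder interface name.

interconnect_firewall_cmds() {
    iface=$1
    if [ -z "$iface" ]; then
        echo "usage: interconnect_firewall_cmds <interface>" >&2
        return 1
    fi
    # Move the interface into the trusted zone in the permanent config.
    echo "firewall-cmd --permanent --zone=trusted --change-interface=${iface}"
    # Load the permanent configuration into the running firewall.
    echo "firewall-cmd --reload"
    # Confirm which interfaces are now in the trusted zone.
    echo "firewall-cmd --zone=trusted --list-interfaces"
}

# Usage (review the printed commands, then run them as root on each node):
#   interconnect_firewall_cmds eth1
```

Printing the commands first keeps the change reviewable, which fits with following a change-control process before touching a production firewall.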
Have you encountered similar CRS communication issues? Share your experiences in the comments below!
Please feel free to ask. Thank you 🙂
Toufique Khan