Template:CephTroubleshooting
Troubleshooting
This guide outlines the steps to take when troubleshooting Scale-out SAN and Object Storage using Ceph in a QuantaStor environment.
Hardware Disk Failure
- Replace the failed drive and return the hardware RAID5 unit to normal operational status
Node Connectivity Issues
- Verify that the networking hardware infrastructure is correct and that all cabling is sound
- Verify that each node's network configuration is correct under System Management
- Note that all nodes must share networks on the same network port (for example, all nodes should have 10.0.0.0/16 on ethX and 192.168.0.0/16 on ethY)
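The port-to-network rule above can be checked mechanically once each node's interface addresses have been collected. The following is a minimal sketch, not a QuantaStor tool: the function name find_port_mismatches, the node names, and the sample addresses are all hypothetical, and the input is assumed to be a hand-built mapping of node name to port name to address in CIDR form.

```python
import ipaddress


def find_port_mismatches(node_networks):
    """Return {port: {subnet: [nodes]}} for each port that is not consistent.

    A port is reported when it carries different subnets on different nodes,
    or when it is missing from at least one node.
    """
    by_port = {}
    for node, ports in node_networks.items():
        for port, cidr in ports.items():
            # Reduce an interface address such as 10.0.1.5/16 to its network, 10.0.0.0/16.
            subnet = str(ipaddress.ip_interface(cidr).network)
            by_port.setdefault(port, {}).setdefault(subnet, []).append(node)

    all_nodes = set(node_networks)
    mismatches = {}
    for port, subnets in by_port.items():
        nodes_with_port = {n for members in subnets.values() for n in members}
        if len(subnets) > 1 or nodes_with_port != all_nodes:
            mismatches[port] = subnets
    return mismatches


# Hypothetical inventory: node-c has its two networks on swapped ports.
nodes = {
    "node-a": {"eth0": "10.0.1.5/16", "eth1": "192.168.1.5/16"},
    "node-b": {"eth0": "10.0.1.6/16", "eth1": "192.168.1.6/16"},
    "node-c": {"eth0": "192.168.1.7/16", "eth1": "10.0.1.7/16"},
}

mismatches = find_port_mismatches(nodes)
for port, subnets in sorted(mismatches.items()):
    print(port, subnets)
```

An empty result means every port carries the same subnet on every node. In the sample inventory both ports are reported, because node-c disagrees with the other two nodes on each of them.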
Ceph will automatically restore OSD status and rebalance data once network status has been successfully restored.
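To confirm that the OSDs have come back after a network repair, the output of the standard Ceph command "ceph osd tree" can be inspected. The sketch below assumes the ceph command-line client and an admin keyring are available on the node where it runs, and that the JSON output holds a "nodes" list whose OSD entries carry "type", "name", and "status" fields. The helper names and the sample data are hypothetical, and the field layout should be verified against the Ceph release in use.

```python
import json
import subprocess


def fetch_osd_tree():
    """Run `ceph osd tree` with JSON output and return the parsed result.

    Assumes the ceph CLI is installed and authorised on this host.
    """
    result = subprocess.run(
        ["ceph", "osd", "tree", "--format", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


def osds_not_up(tree):
    """Return the names of all OSD entries whose status is anything other than 'up'."""
    return sorted(
        node["name"]
        for node in tree.get("nodes", [])
        if node.get("type") == "osd" and node.get("status") != "up"
    )


# Hypothetical sample shaped like the assumed JSON layout, so the parsing
# logic can be tried without a live cluster.
sample_tree = {
    "nodes": [
        {"id": -2, "name": "node-a", "type": "host"},
        {"id": 0, "name": "osd.0", "type": "osd", "status": "up"},
        {"id": 1, "name": "osd.1", "type": "osd", "status": "down"},
    ]
}

print(osds_not_up(sample_tree))
```

On a live cluster, osds_not_up(fetch_osd_tree()) would return an empty list once every OSD has rejoined. Names are sorted as strings, so osd.10 sorts before osd.2.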
Node Failure and Replacement
If a node has failed completely (due to hardware failure, decommissioning, or another cause), it should be removed from the Ceph cluster.
A new node can then be added to the cluster (if desired or necessary).
See the Management and Operations section for details on Removing and Adding a node to a Ceph cluster.