
Troubleshooting

The following guide outlines the steps to take when encountering issues with Scale-out SAN and Object Storage using Ceph in a QuantaStor environment.

Hardware Disk Failure

  • Replace the failed drive and return the hardware RAID5 unit to normal operational status, then confirm cluster health (see the check below)
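
Assuming command-line access to a cluster node, the standard Ceph CLI offers a quick post-rebuild sanity check (a minimal sketch; exact output varies by Ceph release):

    ceph status     # overall cluster health should return to HEALTH_OK
    ceph osd tree   # all OSDs on the repaired node should show as up/in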

Node Connectivity Issues

  • Verify that the network hardware infrastructure is correct and all cabling is valid
  • Verify that each node's network configuration is correct under System Management
    • Note that all nodes must share the same networks on the same network port (e.g., all nodes should have 10.0.0.0/16 on ethX and 192.168.0.0/16 on ethY), as shown in the sketch after this list
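
A few standard commands can verify this from each node's console (a sketch; 10.0.0.2 is a placeholder peer address matching the example subnet above):

    ip addr show            # confirm the expected address is on the expected port
    ping -c 3 10.0.0.2      # confirm the peer is reachable on the shared network
    ceph health detail      # list any monitors or OSDs the cluster sees as down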

Ceph will automatically restore OSD status and rebalance data once network connectivity has been restored.
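
Recovery progress can be observed with the standard Ceph status commands; if network work is planned in advance, the noout flag keeps OSDs from being marked out during the outage (remember to unset it afterwards):

    ceph -s                 # shows recovery/rebalance progress in the pg and io sections
    ceph -w                 # streams cluster events live (Ctrl-C to exit)

    ceph osd set noout      # optional: set before planned network maintenance
    ceph osd unset noout    # clear the flag once the work is complete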

Node Failure and Replacement

In the event a node has completely failed (due to hardware failure, decommissioning, or other action), the node should be removed from the Ceph cluster.
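
QuantaStor performs this removal through its web UI (see the Management and Operations section referenced below); at the Ceph layer, the equivalent steps for each OSD hosted on the failed node look roughly like the following sketch, where the OSD id 7 and the hostname ceph-node3 are placeholders:

    ceph osd out 7                    # mark the OSD out so data rebalances away
    ceph osd crush remove osd.7       # remove the OSD from the CRUSH map
    ceph auth del osd.7               # delete its authentication key
    ceph osd rm 7                     # remove the OSD from the cluster
    ceph osd crush remove ceph-node3  # drop the now-empty host bucket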

A new node can then be added to the cluster (if desired or necessary).

See the Management and Operations section for details on Removing and Adding a node to a Ceph cluster.