Admin Guide Overview


The QuantaStor Administrator's Guide is intended for all administrators and cloud users who plan to manage their storage using QuantaStor Manager, as well as for those looking to gain a deeper understanding of how the QuantaStor Storage System Platform (SSP) works.

Definitions

The following definitions lay the groundwork and context for the rest of the document. Here we define the various objects and elements that can be managed via the QuantaStor Manager web interface or via the QuantaStor Remote Management CLI (available for Windows & Linux).

Storage System

The storage system is the object that represents the entire iSCSI server from both a physical and a logical standpoint. This includes all the physical disks, fans, enclosures, power supplies, and other physical elements of the system, as well as all the logical elements including the storage pools, volumes, users, and storage clouds.

Storage Pool

The storage pool is an aggregation of one or more physical disks into a larger entity. Each storage pool has a single RAID type associated with it, and all storage volumes created within that storage pool inherit that RAID type. For example, if a given storage pool of type RAID1 (mirroring) is made up of two 1TB disks, then there is 1TB of usable storage available for creating storage volumes (LUNs).
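
To make the usable-capacity arithmetic concrete, here is a minimal Python sketch (purely illustrative, not part of QuantaStor) that estimates usable space for the RAID types covered later in this guide, assuming equally sized disks:

# Illustrative only: estimate usable capacity of a pool of equally sized disks.
def usable_capacity_tb(raid_type, disk_count, disk_size_tb):
    if raid_type == "RAID0":            # striping, no redundancy
        return disk_count * disk_size_tb
    if raid_type == "RAID1":            # mirrored pair, one copy usable
        return disk_size_tb
    if raid_type == "RAID5":            # one disk's worth of parity
        return (disk_count - 1) * disk_size_tb
    if raid_type == "RAID6":            # two disks' worth of parity
        return (disk_count - 2) * disk_size_tb
    if raid_type == "RAID10":           # half the disks hold mirror copies
        return (disk_count // 2) * disk_size_tb
    raise ValueError("unknown RAID type: " + raid_type)

print(usable_capacity_tb("RAID1", 2, 1.0))   # 1.0, matching the example above
print(usable_capacity_tb("RAID6", 5, 1.0))   # 3.0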

Storage Volume

The storage volume is the most important object in the system, as it represents the virtual disk device that is presented to the host as a LUN. Each storage volume has a unique name, a unique target number, and a unique IQN associated with it. Storage volumes can be created "thin", which means they do not use any disk space until the device has been written to, or "thick", which means that all the space for the storage volume is reserved up front.

Storage Volume Group

Oftentimes hosts and virtual machines are comprised of more than one storage volume. Sometimes one storage volume is dedicated as a boot disk and another as a swap disk. In other cases multiple disks are used to separate the elements of a database application (index, data, log) into separate storage volumes for improved performance. Whatever the reason, it can become difficult to manage your storage system without a way to group these storage volumes together so that they can be operated on as a single unit. That's what Storage Volume Groups provide. They're simple containers for collecting together an arbitrary set of storage volumes so that they can be cloned, snapshotted, or even deleted as a group.

Snapshot Schedules

Snapshot schedules are a powerful tool for automatically generating recovery points (snapshots) on a schedule so that you don't have to think about it. A snapshot schedule consists of a list of storage volumes to be snapshotted, and a list of days of the week and hours of the day at which the snapshots are to be taken. A 'max snapshots' parameter sets the point at which the oldest snapshot created by the schedule is cleaned up (default: 10).
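
To illustrate the retention behavior (a sketch of the idea only, not QuantaStor's actual implementation), pruning a schedule's snapshots down to the 'max snapshots' limit might look like this in Python:

# Illustrative retention logic: keep only the newest `max_snapshots` entries.
def prune_snapshots(snapshots, max_snapshots=10):
    """snapshots: list of (name, created_timestamp) tuples."""
    by_age = sorted(snapshots, key=lambda s: s[1])    # oldest first
    excess = len(by_age) - max_snapshots
    to_delete = by_age[:excess] if excess > 0 else []
    to_keep = by_age[len(to_delete):]
    return to_keep, to_delete

kept, deleted = prune_snapshots([("vol1_snap%03d" % i, i) for i in range(12)])
print([name for name, _ in deleted])   # the two oldest snapshots are removed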

Host

A host represents a server, workstation, laptop, or virtual machine that has a software or hardware iSCSI initiator by which it can access storage volumes (iSCSI targets) exposed by the storage system. Hosts are identified by one or more initiator IQNs and IP addresses. We recommend that you identify your hosts by IQN, as that offers the most flexibility; IP addresses can frequently change, especially if a host is using DHCP to acquire its IP address.
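
For reference, initiator IQNs follow the standard iqn.YYYY-MM.reverse-domain:identifier form defined for iSCSI (RFC 3720). The following sketch, purely illustrative, checks whether a string has that general shape:

# Loose check for the standard IQN shape: iqn.YYYY-MM.reverse.domain[:identifier]
import re

IQN_PATTERN = re.compile(r"^iqn\.\d{4}-\d{2}\.[a-z0-9.-]+(:.+)?$", re.IGNORECASE)

def looks_like_iqn(name):
    return bool(IQN_PATTERN.match(name))

print(looks_like_iqn("iqn.1991-05.com.microsoft:webserver01"))  # True
print(looks_like_iqn("192.168.0.15"))                           # False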

Host Group

A host group is an arbitrary collection of hosts that have been grouped together for some purpose. Sometimes they're grouped together by location, but more often Host Groups are used to group together hosts that have been formed into a cluster, such as a Microsoft Fail-over Cluster (MSCS). In other cases, as with VMware or XenServer, multiple hosts can be combined to form "resource pools" in which virtual machines can live-migrate from one host to another. In all these cases, each host typically needs access to the same storage volumes in order to facilitate fail-over. This can be a tedious process with many storage systems, as most require that an assignment operation be executed for each host and each volume. If you have 10 hosts and 100 volumes, that amounts to 1000 storage assignment tasks and potentially days of work. With QuantaStor we've tried our best to make that a snap, and Host Groups are key to making that possible. Using the same scenario but with 1 host group and 100 volumes, the storage assignment to the group of 10 hosts can be done in a single operation through QuantaStor Manager in less than a minute.

Storage Cloud

Storage Clouds are essentially virtual storage systems. One of QuantaStor's key unique offerings for storage management, storage clouds let you give groups of users their own private storage clouds so that the storage system effectively supports multi-tenancy.

Storage Quota

Storage quotas go hand-in-hand with storage clouds. A quota defines a set amount of storage that can be provisioned from a Storage Pool by a specific Storage Cloud. More specifically, storage quotas allow you to define the amount of storage that can be thin-provisioned as well as the amount that can be utilized/reserved. They also allow the administrator to set the maximum number of volumes that can be created by a given cloud.
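
The checks a quota implies can be sketched as follows; note that the class and field names here are hypothetical illustrations, not QuantaStor's API:

# Illustrative quota check; names and fields are hypothetical.
class StorageQuota:
    def __init__(self, max_thin_tb, max_reserved_tb, max_volumes):
        self.max_thin_tb = max_thin_tb          # thin-provisioned limit
        self.max_reserved_tb = max_reserved_tb  # utilized/reserved limit
        self.max_volumes = max_volumes          # cap on volume count

    def allows(self, thin_tb, reserved_tb, volume_count):
        return (thin_tb <= self.max_thin_tb
                and reserved_tb <= self.max_reserved_tb
                and volume_count <= self.max_volumes)

quota = StorageQuota(max_thin_tb=10, max_reserved_tb=4, max_volumes=50)
print(quota.allows(thin_tb=8, reserved_tb=3, volume_count=20))   # True
print(quota.allows(thin_tb=8, reserved_tb=5, volume_count=20))   # False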

Roles

There are four (4) predefined roles that come with the initial storage system configuration:

  • Administrator
    • Administrators have full access to manage all aspects of the storage system. They can create new roles, users, storage pools, reconfigure target ports, everything.
  • Cloud Administrator
    • Cloud administrators are limited to managing just the resources contained within the storage cloud of which they are a member. This includes the storage volumes, snapshot schedules, and hosts within their cloud. Cloud administrators can only view the resources that are within their cloud; all other resources in the system are private and invisible to the cloud admin.
  • Cloud User
    • Cloud users can only view the resources within their cloud, just like the Cloud Administrator, but they have limited ability to manage storage volumes. More specifically, they can only snapshot, clone, and delete storage volumes to which they have access rights. (By default, when a user creates a storage volume or other resource they have access rights to modify that resource, but the Administrator can add/remove rights afterward.)
  • System Monitor
    • System monitors can only view the objects within the system. This role is useful for creating monitoring agents or for providing people in administrative roles a way of viewing the storage system without being able to change its configuration.

Besides the included roles outlined above, you can create as many custom roles as you like. Each role consists of a list of object/action permissions coupled with a scope at which each action can be exercised. For example, there's a permission for "Storage Volume : view" which allows users to view storage volumes. If you add this permission to a role and assign it at a scope of 'system', then a user associated with that role can view all storage volumes in the system. If, on the other hand, the scope is set to 'user', then the user will only be able to view storage volumes that he/she created. This RBAC system with scoping is unique to QuantaStor & QuantaGrid and is a core technology behind our Storage Clouds.

Permissions

Permissions are simply a combination of an object and an action. For example, here are some of the permissions associated with the Storage Volume object:

  • storage volume : view
  • storage volume : create
  • storage volume : delete
  • storage volume : snapshot
  • storage volume : clone
  • storage volume : restore
  • storage volume : assign
  • storage volume : unassign

When permissions are assigned to a role there is another element that's added: the permission scope. The permission scope defines the level at which the user is allowed to exercise the granted permission. Permission scopes include 'none', 'user', 'cloud', 'system', and 'grid'.
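
To illustrate how a scoped permission check might work, here is a sketch in Python using assumed data structures (QuantaStor's internal implementation is not shown in this guide):

# Illustrative scoped RBAC check. Scope ordering mirrors the list above.
SCOPE_RANK = {"none": 0, "user": 1, "cloud": 2, "system": 3, "grid": 4}

# A role maps "object : action" permissions to the scope granted for each.
role = {"storage volume : view": "cloud",
        "storage volume : snapshot": "user"}

def is_allowed(role, permission, required_scope):
    """True if the role grants `permission` at `required_scope` or wider."""
    granted = role.get(permission, "none")
    return SCOPE_RANK[granted] >= SCOPE_RANK[required_scope]

# A cloud user viewing a volume elsewhere in their cloud needs 'cloud' scope:
print(is_allowed(role, "storage volume : view", "cloud"))      # True
# Snapshotting a volume owned by another cloud member needs more than 'user':
print(is_allowed(role, "storage volume : snapshot", "cloud"))  # False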

Users

Each user is given a unique user name and password so that they may log in and share in managing the storage system, and each user is associated with a specific role. Some roles, like the Cloud User and Cloud Administrator, are only truly effective when the user is associated with a storage cloud. Once associated with a cloud, cloud users and admins can access, view, or modify resources within that cloud within the permission limits of their role. All other resources in the system are invisible.

  Note: Today QuantaStor does not support external authentication mechanisms 
  like Active Directory but that is planned for a future release.

User Groups

Oftentimes a given group of users will be associated with more than one storage cloud. The user group object represents an arbitrary collection of users and provides a simple way to keep track of groups of users, thereby making it easier to add large groups of users to (or remove them from) storage clouds.

Target Port

The target port represents a NIC (network interface card) port in your storage system. 1Ge ports are common in servers today, and most servers typically have 2 x 1Ge ports. The term 'target port' comes from SCSI terminology, where the device to be accessed is called a 'target' and the entity accessing the target is called the 'initiator'. Hence the port in the storage system through which a target can be accessed is called a 'target port'. You can add as many target ports to your system as your storage system's PCI bus has room for. Some vendors like Intel sell dual and quad 1Ge port NICs, but if you find yourself needing larger numbers of ports to improve network throughput, we suggest looking into adding 10Ge NICs to your QuantaStor system.

Sessions

The session object represents an active iSCSI session between one of your hosts (aka initiator) and a specific storage volume (iSCSI target) in the QuantaStor storage system. Oftentimes there will be more than one session connected to the same target, as this forms multiple IO paths, giving you improved performance and fail-over capability in the event a path is disconnected. You can drop or disconnect a session from within QuantaStor Manager, but keep in mind that many iSCSI initiators will automatically re-establish a new session with the array. To permanently remove access to a volume from a host you'll need to unassign the volume using the 'Assign/Unassign Host Access' option, which appears in the pop-up menu when you right-click on a volume in QuantaStor Manager. You can reassign access rights back to the host so that it can access the volume again at any time.

CHAP Authentication

CHAP stands for Challenge-Handshake Authentication Protocol (http://en.wikipedia.org/wiki/Challenge-handshake_authentication_protocol) and it provides a mechanism by which you can associate a username and password with a specific iSCSI target / storage volume. CHAP usernames & passwords are completely separate from the username & password that you use with the QuantaStor Manager web GUI. Simply put, the CHAP username and password are just arbitrary values that you make up, but they must be at least 12 characters in length. To use CHAP with a specific volume, press the 'Advanced Settings' button in the 'Modify Storage Volume' dialog in QuantaStor Manager. Besides being able to set a storage-volume-specific CHAP username and password, QuantaStor allows you to set a default CHAP password for all the volumes in a storage cloud, and additionally a default CHAP password for all storage volumes you own.
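
The 12-character length rule is simple to express; here is a trivial sketch of the check (illustrative only; QuantaStor Manager enforces this for you when you set CHAP credentials):

# Illustrative check for the 12-character minimum mentioned above.
def chap_credentials_valid(username, password, min_len=12):
    return len(username) >= min_len and len(password) >= min_len

print(chap_credentials_valid("quantastoruser", "longpassword123"))  # True
print(chap_credentials_valid("shortname", "short"))                 # False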

Alerts

Alerts are simply messages from the system. Some alerts are just informational; for example, when the system first starts it sends out a message that system startup completed successfully. Other alerts are warnings or errors indicating that something serious has happened, like a disk failure, that needs to be addressed. Alerts are shown in QuantaStor Manager at the bottom of the screen.

Events

The QuantaStor service generates an event for each change that is made to the system. It is via these events that the QuantaStor Manager web UI is able to keep its state consistent with the service. You don't see events anywhere as objects within QuantaStor Manager, but you can view them using the CLI, which can be useful if you're scripting and need to be notified when the system configuration has changed in some way.
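
For example, a script could periodically poll the CLI and react when the event output changes. The command name below ('qs-cli event-list') is a hypothetical placeholder, not the real invocation; consult the CLI's own help for the actual command. Only the polling pattern is the point here:

# Poll a CLI command and react when its output changes.
import subprocess, time

# Hypothetical placeholder; substitute the real QuantaStor CLI command
# and arguments for listing events on your system.
EVENT_CMD = ["qs-cli", "event-list"]

def poll_for_events(interval_sec=30):
    last_output = None
    while True:
        result = subprocess.run(EVENT_CMD, capture_output=True, text=True)
        if result.stdout != last_output:
            print("configuration changed; new event output:")
            print(result.stdout)
            last_output = result.stdout
        time.sleep(interval_sec)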

Tasks

Every configuration change to the system is handled via a task. When you create a storage volume or perform some other operation on the storage system, you'll notice a new task in the status bar at the bottom of the screen in QuantaStor Manager. Some long-running tasks (like batch creation of 300 volumes) can be canceled while they're running; just right-click the task in the task bar within QuantaStor Manager and choose cancel.


Managing Storage Pools

Storage pools combine or aggregate one or more physical disks (SATA, SAS, or SSD) into a single pool of storage from which storage volumes (iSCSI targets) can be created. Storage pools can be created using any of the following RAID types: RAID0, RAID1, RAID5, RAID6, or RAID10. Choosing the optimal RAID type depends on the I/O access patterns of your target application, the number of disks you have, and the amount of fault tolerance you require. (Note: fault tolerance is just a way of saying how many disks can fail within a storage pool, aka RAID group, before you lose data.)

RAID1 & RAID5 allow you to have one disk fail without interrupting disk IO. When a disk fails you can remove it, and you should add a spare disk to the 'degraded' storage pool as soon as possible in order to restore it to fault-tolerant status. You can also assign spare disks to storage pools ahead of time so that the recovery happens automatically. RAID6 allows for up to two disks to fail and will keep running, whereas RAID10 can allow for one disk failure per mirror pair. Finally, RAID0 is not fault tolerant at all, but it is your only choice if you have only one disk, and it can be useful in some scenarios where fault tolerance is not required. Here's a breakdown of the various RAID types and their pros & cons.

  • RAID0 layout is also called 'striping'; it writes data across all the disk drives in the storage pool in a round-robin fashion. This has the effect of greatly boosting performance. The drawback of RAID0 is that it is not fault tolerant, meaning that if a single disk in the storage pool fails then all of your data in the storage pool is lost. As such, RAID0 is not recommended except in special cases where the potential for data loss is a non-issue.
  • RAID1 is also called 'mirroring' because it achieves fault tolerance by writing the same data to two disk drives so that you always have two copies of the data. If one drive fails, the other has a complete copy and the storage pool continues to run. RAID1 and its variant RAID10 are ideal for databases and other applications which do a lot of small write I/O operations.
  • RAID5 achieves fault tolerance via what's called a parity calculation, where parity is an XOR calculation across the bits on the other drives. For example, if you have 4 disk drives and you create a RAID5 storage pool, the equivalent of 3 disks will store data and one disk's worth of capacity will hold parity information (in practice the parity is distributed across all the drives). The parity information can be used to reconstruct the contents of any single failed drive. RAID5 (and RAID6) are especially well suited for audio/video streaming, archival, and other applications which do heavy sequential write I/O (such as reading/writing large files), and are not as well suited for database applications which do heavy amounts of small random write I/O, or for large file systems containing lots of small files with a heavy write load.
  • RAID6 improves upon RAID5 in that it can handle two drive failures, but it requires the equivalent of two disk drives dedicated to parity information. For example, if you have a RAID6 storage pool comprised of 5 disks, then 3 disks' worth of capacity will contain data and 2 disks' worth will contain parity information. In this example, if the disks are all 1TB then you will have 3TB of usable disk space for the creation of volumes. So there's some sacrifice of usable storage space to gain the additional fault tolerance. If you have the disks, we always recommend using RAID6 over RAID5. This is because all hard drives eventually fail, and when one fails in a RAID5 storage pool your data is left vulnerable until a spare disk is utilized to recover the storage pool back to fault-tolerant status. With RAID6 your storage pool is still fault tolerant after the first drive failure. (Note: fault-tolerant storage pools (RAID1/5/6/10) that have suffered a single disk drive failure are called 'degraded' because they're still operational but require a spare disk to recover back to fully fault-tolerant status.)
  • RAID10 is similar to RAID1 in that it utilizes mirroring, but RAID10 also stripes over the mirrors. This gives you the fault tolerance of RAID1 combined with the performance of RAID0. The drawback is that half the disks are used for fault tolerance, so if you have 8 1TB disks utilized to make a RAID10 storage pool, you will have 4TB of usable space for creation of volumes. RAID10 performs very well with both small random IO operations as well as sequential operations, and it is highly fault tolerant, as multiple disks can fail as long as they're not from the same mirror pairing. If you have the disks and you have a mission-critical application, we highly recommend that you choose the RAID10 layout for your storage pool.

In many cases it is useful to create more than one storage pool so that you have both basic low cost fault-tolerant storage available from perhaps a RAID5 storage pool, as well as a highly fault-tolerant RAID10 or RAID6 storage pool available for mission critical applications.

Once you have created a storage pool it will take some time to 'rebuild'. Once the 'rebuild' process has reached 1% you will see the storage pool appear in QuantaStor Manager and you can begin to create new storage volumes.

WARNING: Although you can begin using the pool at 1% rebuild completion, your storage pool is not fault-tolerant until the rebuild process has completed.


Target Port Configuration

Target ports are simply the network ports (NICs) through which your client hosts (initiators) access your storage volumes (aka targets). The terms 'target' and 'initiator' are SCSI terms that are synonymous with 'server' and 'client' respectively. QuantaStor supports both statically assigned IP addresses as well as dynamically assigned (DHCP) addresses. If you selected automatic network configuration when you initially installed QuantaStor, then you'll have one port set up with DHCP and the others are likely offline. We recommend that you always use static IP addresses unless you have your DHCP server set up to assign a specific IP address to your NICs as identified by MAC address. If you don't set the target ports up with static IP addresses, you risk the IP address changing, and losing access to your storage, when the dynamically assigned address expires.

To modify the configuration of a target port, first select the tree section named "Storage System" under the "Storage Management" tab on the left-hand side of the screen. After that, select the "Target Ports" tab in the center of the screen to see the list of target ports that were discovered. To modify the configuration of one of the ports, simply right-click on it and choose "Modify Target Port" from the pop-up menu. Alternatively you can press the "Modify" button in the tool bar at the top of the screen in the "Target Ports" section. Once the "Modify Target Port" dialog appears you can select the target port type for the selected port (static), and enter the IP address, subnet mask, and gateway for the port. You can also set the MTU to 9000 for jumbo frame support, but we recommend that you get your network configuration up and running with standard 1500-byte frames first, as jumbo frame support requires that you configure your host-side NICs and network switch with 9K frames as well.

NIC Bonding / Trunking

QuantaStor supports NIC bonding, also called trunking, which allows you to combine multiple NICs together to improve performance and reliability. If you combine two or more ports together into a virtual port, you'll need to make sure that all the bonded ports are connected to the same network switch; there are very few exceptions to this rule. For example, if you have two networks and 4 ports (p1, p2, p3, p4), you'll want to create two separate virtual ports, each bonding two NIC ports together (p1, p2 / p3, p4), with each pair connected to a separate network (p1, p2 -> network A / p3, p4 -> network B). This type of configuration is highly recommended, as you have both improved bandwidth and no single point of failure in the network or in the storage system. Of course, your host will need at least 2 NIC ports and they'll each need to connect to the separate networks. For very simple configurations you can just connect everything to one switch, but again, the more redundancy you can work into your SAN the better.

Alert Settings

Managing Hosts

Managing Snapshot Schedules

Near Continuous Data Replication (N-CDP)

Managing Sessions

The list of active iSCSI sessions with the storage system can be found by selecting the 'Storage System' tree-tab in QuantaStor Manager then selecting the 'Sessions' tab in the center view. Here's a screenshot of a list of active sessions as shown in QuantaStor Manager.

(Screenshot: Session List)

Dropping Sessions

To drop an iSCSI session, just right-click on it and choose 'Drop Session' from the menu.

(Screenshot: Drop Session dialog)

Keep in mind that some initiators will automatically re-establish a new iSCSI session if one is dropped by the storage system. To prevent this, just unassign the storage volume from the host so that the host cannot log back in.

Managing Storage Volumes

Creating & Deleting Storage Volumes

Creating Snapshots

QuantaStor snapshots are probably not like any snapshots you've used with any other storage vendor on the market. Some key features of QuantaStor volume snapshots include:

  • massive scalability
    • create hundreds of snapshots in just seconds
  • supports snapshots of snapshots
    • you can create snapshots of snapshots of snapshots, ad infinitum.
  • snapshots are R/W by default, read-only snapshots are also supported
  • snapshots perform extremely well even when large numbers exist
  • snapshots can be converted into primary storage volumes instantly

All of these advanced snapshot capabilities make QuantaStor ideally suited for virtual desktop solutions, off-host backup, and near continuous data protection (NCDP). If you're looking to get NCDP functionality, just create a 'snapshot schedule' and snapshots can be created for your storage volumes as frequently as every hour.

To create a snapshot or a batch of snapshots, select the storage volume that you wish to snap, right-click on it, and choose 'Snapshot Storage Volume' from the menu.

If you do not supply a name, QuantaStor will automatically choose one for you by appending a "_snap" suffix plus a number to the original volume's name. So if you have a storage volume named 'vol1' and you create a snapshot of it, you'll have a snapshot named 'vol1_snap000'. If you create many snapshots, the system will increment the number at the end so that each snapshot has a unique name.
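
The naming pattern amounts to something like the following sketch (illustrative; the zero-padded three-digit counter matches the 'vol1_snap000' example above):

# Illustrative: generate the next unique snapshot name for a volume.
def next_snapshot_name(volume_name, existing_names):
    n = 0
    while "%s_snap%03d" % (volume_name, n) in existing_names:
        n += 1
    return "%s_snap%03d" % (volume_name, n)

existing = {"vol1_snap000", "vol1_snap001"}
print(next_snapshot_name("vol1", existing))   # vol1_snap002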

Creating Clones

Clones are complete copies of the data blocks in the original storage volume, and a clone can be created in any storage pool in your storage system, whereas a snapshot can only be created within the same storage pool as the original. You can create a clone at any time, even while the source volume is in use, because QuantaStor creates a temporary snapshot in the background to facilitate the clone process. The temporary snapshot is automatically deleted once the clone operation completes. Note also that you cannot use a cloned storage volume until the data copy completes. You can monitor the progress of the cloning by looking at the task bar at the bottom of the QuantaStor Manager screen. In contrast to clones, snapshots are created near-instantly and do not involve data movement, so you can use them immediately.

Restoring from Snapshots

If you've accidentally lost some data by inadvertently deleting files in one of your storage volumes, you can recover your data quickly and easily using the 'Restore Storage Volume' operation. To restore your original storage volume to a previous point in time, first select the original, then right-click on it and choose "Restore Storage Volume" from the pop-up menu. When the dialog appears you will be presented with all the snapshots of that original from which you can recover. Just select the snapshot that you want to restore to and press OK. Note that you cannot have any active sessions to the original or the snapshot storage volume when you restore; if you do, you'll get an error. This prevents the restore from taking place while the OS has the volume in use or mounted, which would lead to data corruption.

WARNING: When you restore, the data in the original is replaced with the data in 
the snapshot.  As such, there's a possibility of losing data, as everything that 
was written to the original since the time the snapshot was created will be lost.  
Remember, you can always create a snapshot of the original before you restore it 
to a previous point-in-time snapshot.

Converting a Snapshot into a Primary

IO Tuning

QuantaStor has a number of tunable parameters in the /etc/quantastor.conf file that can be adjusted to better match the needs of your application. That said, we've spent a considerable amount of time tuning the system to efficiently support a broad set of application types so we do not recommend adjusting these settings unless you are a highly skilled Linux administrator. The default contents of the /etc/quantastor.conf configuration file are as follows:

[device]
nr_requests=2048
scheduler=deadline
read_ahead_kb=512

[mdadm]
chunk_size_kb=256
parity_layout=left-symmetric

[btrfs]
nodatasum=false

There are tunable settings for device parameters, md array chunk-size and parity configuration, as well as some settings for btrfs. These settings are read from the configuration file dynamically each time one of them is needed, so there's no need to restart the quantastor service. Simply edit the file and the changes will be applied to the next operation that uses them. For example, if you adjust the chunk_size_kb setting for mdadm, then the next time a storage pool is created it will use the new chunk size. Other tunable settings, like the device settings, are automatically applied within a minute or so of your changes because the system periodically checks the disk configuration and updates it to match the tunable settings. You can also delete the quantastor.conf file entirely, in which case the defaults listed above are used.
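
Because the file uses standard INI syntax, a script can read the tunables with Python's configparser module. Here is a minimal sketch (illustrative, not a QuantaStor tool) that falls back to the documented defaults when the file or a key is absent:

# Illustrative: read tunables from /etc/quantastor.conf (standard INI syntax).
import configparser

config = configparser.ConfigParser()
config.read("/etc/quantastor.conf")

# Fall back to the documented defaults if the file or key is missing.
nr_requests = config.getint("device", "nr_requests", fallback=2048)
scheduler = config.get("device", "scheduler", fallback="deadline")
chunk_size_kb = config.getint("mdadm", "chunk_size_kb", fallback=256)

print(nr_requests, scheduler, chunk_size_kb)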