Difference between revisions of "+ Admin Guide Overview"

From OSNEXUS Online Documentation Site

Revision as of 15:11, 2 April 2012

The QuantaStor Administrators Guide is intended for all administrators and cloud users who plan to manage their storage using QuantaStor Manager as well as for those just looking to get a deeper understanding of how the QuantaStor Storage System Platform (SSP) works.

Terms & Definitions

The following definitions lay the groundwork and context for the rest of the document. They cover all the objects and elements that can be managed via the QuantaStor Manager web interface or via the QuantaStor Remote Management CLI (available for Windows & Linux).

Storage System

The storage system is the object that represents the entire iSCSI server both from a physical and logical standpoint. This includes all the physical disks, fans, enclosures, power supplies and other physical elements of the system as well as all the logical elements including the storage pools, volumes, users, and storage clouds.

Storage Pool

The storage pool is an aggregation of one or more physical disks into a larger entity. Each storage pool has a single RAID type associated with it, and all storage volumes created within that storage pool inherit that RAID type. For example, if a storage pool of type RAID1 (mirroring) is made up of two 1TB disks, then there is 1TB of usable storage available from which to create storage volumes (LUNs).
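
The capacity arithmetic in the RAID1 example above extends to the other RAID types covered later in this guide. The following is a minimal Python sketch, not part of QuantaStor: the function name `usable_capacity_tb` is hypothetical, and it assumes every disk in the pool is the same size.

```python
def usable_capacity_tb(raid_type: str, disk_count: int, disk_size_tb: float) -> float:
    """Estimate usable capacity for a pool of equal-sized disks (illustrative only)."""
    raid_type = raid_type.upper()
    if raid_type == "RAID0":
        data_disks = disk_count            # striping only, no redundancy
    elif raid_type == "RAID1":
        if disk_count < 2:
            raise ValueError("RAID1 needs at least 2 disks")
        data_disks = 1                     # every disk holds the same copy
    elif raid_type == "RAID5":
        if disk_count < 3:
            raise ValueError("RAID5 needs at least 3 disks")
        data_disks = disk_count - 1        # one disk's worth of parity
    elif raid_type == "RAID6":
        if disk_count < 4:
            raise ValueError("RAID6 needs at least 4 disks")
        data_disks = disk_count - 2        # two disks' worth of parity
    elif raid_type == "RAID10":
        if disk_count < 4 or disk_count % 2:
            raise ValueError("RAID10 needs an even number of disks, at least 4")
        data_disks = disk_count // 2       # half the disks hold mirror copies
    else:
        raise ValueError(f"unknown RAID type: {raid_type}")
    return data_disks * disk_size_tb


print(usable_capacity_tb("RAID1", 2, 1.0))   # two 1TB disks mirrored
```

The same function reproduces the RAID6 (5 disks) and RAID10 (8 disks) examples given in the RAID Levels section.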

Storage Volume

The storage volume is the most important object in the system as it represents the virtual disk device that is presented to the host as a LUN. Each storage volume has a unique name, a unique target number, and a unique IQN associated with it. Storage volumes can be created "thin", which means they do not use up any disk space until the device has been written to, or "thick", which means that all the space for the storage volume is reserved up front.

Storage Volume Group

Hosts and virtual machines often use more than one storage volume. Sometimes one storage volume is dedicated as a boot disk and another as a swap disk. In other cases multiple disks are used to separate the elements of a database application (index, data, log) into separate storage volumes for improved performance. Whatever the reason, it can become difficult to manage your storage system without a way to group these storage volumes together so that they can be operated on as a single unit. That's what Storage Volume Groups provide. They're simple containers for collecting together an arbitrary set of storage volumes so that they can be cloned, snapshot, or even deleted as a group.

Snapshot Schedules

Snapshot schedules are a powerful tool for automatically generating recovery points (snapshots) on a schedule so that you don't have to think about it. The snapshot schedule consists of a list of storage volumes and/or network shares to be snapshot, and a list of days of the week and hours of the day at which the snapshots are to be taken. A 'max snapshots' parameter sets the point at which the oldest snapshot created by the schedule should be cleaned up (default: 10).
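
The effect of the 'max snapshots' parameter can be pictured with a short Python sketch. This is an illustration only, not QuantaStor code: the function name `prune_oldest` is hypothetical, and it assumes the schedule removes the oldest snapshots as soon as the count exceeds the limit.

```python
from datetime import datetime, timedelta


def prune_oldest(snapshot_times, max_snapshots=10):
    """Split snapshot timestamps into (kept, removed), dropping the oldest first."""
    ordered = sorted(snapshot_times)
    excess = max(0, len(ordered) - max_snapshots)
    # The first `excess` entries are the oldest ones and get cleaned up.
    return ordered[excess:], ordered[:excess]


# Twelve hourly snapshots against the default limit of 10.
start = datetime(2012, 4, 2, 0, 0)
times = [start + timedelta(hours=h) for h in range(12)]
kept, removed = prune_oldest(times)
print(len(kept), len(removed))
```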

Host

A host represents a server, workstation, laptop, or virtual machine that has a software or hardware iSCSI initiator by which it can access storage volumes (iSCSI targets) exposed by the storage system. Hosts are identified by one or more initiator IQNs and IP addresses. We recommend that you identify your hosts by IQN as that offers the most flexibility: IP addresses can change frequently, especially if a host is using DHCP to acquire its IP address.

Host Group

A host group is an arbitrary collection of hosts that have been grouped together, typically because they form a cluster or pool of hosts designated for some purpose. Sometimes they're grouped together by location, but more often host groups are used to group together hosts that have been formed into a cluster such as a Microsoft Fail-over Cluster / MSCS. In other cases, as with VMware or XenServer, multiple hosts can be combined to form "resource pools" in which the virtual machines can live migrate from one host to another. In all such cases, each host in the group will typically need access to all the same storage volumes in order to facilitate fail-over and live migration.

Host groups make it easy to assign storage to multiple hosts all at once, and you can add and remove hosts from a host group to quickly assign storage to a cluster node or resource pool. This was traditionally a long, tedious process with many legacy storage systems, as most systems would require that the administrator assign each volume to each host individually. If you have 10 hosts and 100 volumes, that amounts to 1000 storage assignment operations and potentially days of work for a storage administrator. Host groups make that a snap. Using the same example but with 1 host group representing the 10 hosts, the assignment of the 100 volumes to the group can be done in a single operation through QuantaStor Manager in less than a minute.
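
The operation counts in the example above can be checked with a few lines of Python. This is only a model of the arithmetic: the host, volume, and group names are made up, and the two functions are hypothetical stand-ins rather than QuantaStor APIs.

```python
from itertools import product


def per_host_operations(hosts, volumes):
    """Legacy model: one assignment operation per (host, volume) pair."""
    return list(product(hosts, volumes))


def group_operations(group_name, volumes):
    """Host group model: one operation assigns the whole volume set to the group."""
    return [(group_name, tuple(volumes))]


hosts = [f"host{n:02d}" for n in range(1, 11)]      # 10 hypothetical hosts
volumes = [f"vol{n:03d}" for n in range(1, 101)]    # 100 hypothetical volumes

print(len(per_host_operations(hosts, volumes)))     # 1000
print(len(group_operations("cluster-a", volumes)))  # 1
```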

Network Share

A network share is an NFSv3 network share. Each network share is a separate root directory in a storage pool in which you can store files. If you are using an 'Advanced' storage pool, then you can create snapshots and snapshot schedules for your network share whereas with 'Standard' pools you cannot.

Storage Cloud

Storage clouds are essentially virtual storage systems. One of the key unique offerings that QuantaStor brings to storage management, storage clouds let you give groups of users their own private storage clouds so that the storage system effectively supports multi-tenancy.

Storage Quota

Storage quotas go hand-in-hand with storage clouds. Quotas define a set amount of storage that can be provisioned from a Storage Pool by a specific Storage Cloud. More specifically, storage quotas allow you to define the amount of storage that can be thin-provisioned as well as the amount that can be utilized/reserved. They also allow the administrator to set the maximum number of volumes that can be created by a given cloud.
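
The three limits described above can be modeled as a simple admission check. The sketch below is a guess at the semantics for illustration only: the `StorageQuota` class, its field names, and the rule that a thick volume counts against the utilized/reserved limit at creation time are all assumptions, not documented QuantaStor behavior.

```python
from dataclasses import dataclass


@dataclass
class StorageQuota:
    max_provisioned_gb: int   # total size that may be provisioned (thin or thick)
    max_utilized_gb: int      # space that may actually be used or reserved
    max_volumes: int          # maximum number of volumes for the cloud


def can_create_volume(quota, current_volumes, provisioned_gb, utilized_gb,
                      new_size_gb, thin=True):
    """Return True if a new volume would fit within the quota (assumed rules)."""
    if current_volumes + 1 > quota.max_volumes:
        return False
    if provisioned_gb + new_size_gb > quota.max_provisioned_gb:
        return False
    # Assumption: a thick volume reserves its full size up front,
    # while a thin volume uses no space until it is written to.
    if not thin and utilized_gb + new_size_gb > quota.max_utilized_gb:
        return False
    return True


quota = StorageQuota(max_provisioned_gb=1000, max_utilized_gb=200, max_volumes=5)
print(can_create_volume(quota, 2, 300, 150, 100, thin=True))
print(can_create_volume(quota, 2, 300, 150, 100, thin=False))
```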

Storage System Link

Storage System Links represent a link between two separate storage systems for use with remote replication. Once a link is established you can replicate a storage volume from one system to another by right-clicking any storage volume and choosing "Create Remote Replica". From there you just need to select the target system and the target storage pool. Any given storage system can have multiple storage system links, and the links are bi-directional so you can replicate in either direction.

Remote Replica Association

When you replicate a storage volume from one system to another, a remote replica association object is created automatically to track the status of the replication operation. If the replication fails for any reason you can use the 'Resync' operation to restart the replication.

Roles

There are four (4) predefined roles that come with the initial storage system configuration which include:

  • Administrator
    • Administrators have full access to manage all aspects of the storage system. They can create new roles, users, storage pools, reconfigure target ports, everything.
  • Cloud Administrator
    • Cloud administrators are limited to managing just the resources contained within the storage cloud of which they are a member. This includes the storage volumes, snapshot schedules, and hosts within their cloud. Cloud administrators can only view the resources that are within their own cloud; all other resources in the system are private and invisible to the cloud admin.
  • Cloud User
    • Cloud users can only view the resources within their cloud, just like the Cloud Administrator, but they have limited ability to manage storage volumes. More specifically, they can only snapshot, clone, and delete storage volumes they have access rights to. (By default, when a user creates a storage volume or other resource they have access rights to modify that resource, but the Administrator can add/remove rights afterward.)
  • System Monitor
    • System monitors can only view the objects within the system. This role is useful for creating monitoring agents or for providing people in administrative roles a way of viewing the storage system without being able to change its configuration.

Besides the included roles outlined above, you can create as many custom roles as you like. Each role consists of a list of object-action permissions, each coupled with a scope at which that action can be exercised. For example, there's a permission for "Storage Volume : view" which allows users to view storage volumes. If you add this permission to a role and assign it at a scope of 'system' then the user associated with that role can view all storage volumes in the system. If on the other hand the scope is set to 'user' then the user will only be able to view storage volumes that he/she created. This RBAC system with scoping is unique to QuantaStor & QuantaGrid and is a core technology behind our Storage Clouds.

Permissions

Permissions are simply a combination of an object and an action. For example here are some of the permissions associated with the Storage Volume object:

  • storage volume : view
  • storage volume : create
  • storage volume : delete
  • storage volume : snapshot
  • storage volume : clone
  • storage volume : restore
  • storage volume : assign
  • storage volume : unassign

When permissions are assigned to a role there is another element that's added, and that's the permission scope. The permission scope defines at what level the user is allowed to exercise the granted permission. Permission scopes include 'none', 'user', 'cloud', 'system', and 'grid'.
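
To make the interplay of permissions and scopes concrete, here is a small Python sketch of a scoped permission check. It is a simplified model, not QuantaStor's implementation: the `User` and `Resource` classes and the `is_allowed` function are hypothetical. The 'user' and 'system' rules follow the description in the Roles section; the 'cloud' and 'grid' rules are assumptions inferred from the scope names.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    owner: str    # user who created the object
    cloud: str    # storage cloud the object belongs to
    system: str   # storage system the object lives on


@dataclass
class User:
    name: str
    cloud: str
    system: str
    # Maps a permission such as "storage volume : view" to a scope.
    permissions: dict = field(default_factory=dict)


def is_allowed(user: User, permission: str, resource: Resource) -> bool:
    """Check a permission against the scope at which it was granted."""
    scope = user.permissions.get(permission, "none")
    if scope == "grid":      # assumption: every system in the grid
        return True
    if scope == "system":    # all objects in the user's storage system
        return resource.system == user.system
    if scope == "cloud":     # assumption: objects in the user's storage cloud
        return resource.cloud == user.cloud
    if scope == "user":      # only objects the user created
        return resource.owner == user.name
    return False             # 'none' or an unknown scope


alice = User("alice", cloud="dev", system="qs1",
             permissions={"storage volume : view": "cloud",
                          "storage volume : delete": "user"})
volume = Resource(owner="bob", cloud="dev", system="qs1")
print(is_allowed(alice, "storage volume : view", volume))
print(is_allowed(alice, "storage volume : delete", volume))
```

Here alice can view bob's volume because it is in her cloud, but cannot delete it because her delete permission is scoped to objects she created.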

Users

Each user is given a unique user name and password so that they may log in and share in managing the storage system, and each user is associated with a specific role. Some roles, like the Cloud User and Cloud Administrator, are only truly effective when the user is associated with a storage cloud. Once associated with a cloud, cloud users and admins can access, view, or modify resources within that cloud within the permission limits of their role. All other resources in the system are invisible.

  Note: Today QuantaStor does not support external authentication mechanisms 
  like Active Directory but that is planned for a future release.

User Groups

Often a given group of users will be associated with more than one storage cloud. The user group object represents an arbitrary collection of users and provides a simple way to keep track of groups of users, thereby making it easier to add large groups of users to storage clouds or remove them again.

Target Port

The target port represents a NIC (network interface card) port in your storage system. 1GbE ports are common in servers today and most servers typically have 2 x 1GbE ports. The term target port comes from SCSI terminology where the device to be accessed is called a 'target' and the entity accessing the target is called the 'initiator'. Hence the port in the storage system through which a target can be accessed is called a 'target port'. You can add as many target ports to your system as your storage system's PCI bus has room for. Some vendors like Intel sell dual and quad port 1GbE NICs, but if you find yourself needing larger numbers of ports to improve network throughput we suggest looking into adding 10GbE NICs to your QuantaStor system.

Sessions

The session object represents an active iSCSI session between one of your hosts (aka initiator) and a specific storage volume (iSCSI target) in the QuantaStor storage system. Often there will be more than one session connected to the same target, as this forms multiple IO paths giving you improved performance and fail-over capabilities in the event a path is disconnected. You can drop or disconnect a session from within QuantaStor Manager, but keep in mind that many iSCSI initiators will automatically re-establish a new session with the array. To permanently remove access to a volume from a host you'll need to unassign the volume using the 'Assign/Unassign Host Access' option. This appears in the pop-up menu when you right-click on a volume in QuantaStor Manager. You can reassign access rights back to the host at any time so that it can access the volume again.

CHAP Authentication

CHAP stands for Challenge-Handshake Authentication Protocol and it provides you with a mechanism by which you can associate a username and a password with a specific iSCSI target / storage volume. CHAP usernames & passwords are completely separate from the username & password that you use with the QuantaStor Manager web GUI. Simply put, the CHAP username and password are just arbitrary values that you make up, but they must be at least 12 characters in length. To use CHAP with a specific volume you will need to press the 'Advanced Settings' button in the 'Modify Storage Volume' dialog in QuantaStor Manager. Besides being able to set a storage volume specific CHAP username and password, QuantaStor allows you to set a default CHAP password for all the volumes in a storage cloud and additionally a default CHAP password for all storage volumes you own.
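
If you generate CHAP credentials from a script, a quick length check before entering them can save a round trip. The Python sketch below is illustrative only: the function name is hypothetical, and it assumes the 12-character minimum stated above applies to both the username and the password.

```python
MIN_CHAP_LENGTH = 12  # minimum length stated in this guide


def validate_chap_credentials(username: str, password: str) -> list:
    """Return a list of problems; an empty list means both values are long enough."""
    problems = []
    if len(username) < MIN_CHAP_LENGTH:
        problems.append(f"CHAP username must be at least {MIN_CHAP_LENGTH} characters")
    if len(password) < MIN_CHAP_LENGTH:
        problems.append(f"CHAP password must be at least {MIN_CHAP_LENGTH} characters")
    return problems


print(validate_chap_credentials("backupuser01", "s3cretpassw0rd"))
print(validate_chap_credentials("admin", "short"))
```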

Alerts

Alerts are simply messages from the system. Some alerts are just informational such as when the system first starts it sends out a message that the system startup completed successfully. Other alerts are warnings or errors indicating something serious has happened like a disk failure that needs to be addressed. Alerts are shown in QuantaStor Manager at the bottom of the screen.

Events

The QuantaStor service generates an event for each change that is made to the system. It is via these events that the QuantaStor Manager web UI is able to keep its state consistent with the service. You don't see events anywhere as objects within QuantaStor Manager but you can view them using the CLI, which can be useful if you're scripting and need to be notified when the system configuration has changed in some way.

Tasks

Every configuration change to the system is handled via a task. When you create a storage volume or do some other operation with the storage system, you'll notice a new task in the status bar at the bottom of the screen in QuantaStor Manager. Some long running tasks (like batch creation of 300 volumes) can be canceled while they're running. Just right-click the task in the task bar within QuantaStor Manager and choose cancel.

Storage System Management Operations

When you initially connect to QuantaStor Manager you'll see a toolbar (aka ribbon bar) at the top of the screen and a stack view / tree view on the left hand side of the screen. When you select different areas of the tree view (Storage Volumes, Hosts, etc.) the ribbon bar / toolbar changes accordingly to show the operations available for that section. The following diagram shows these two sections:

Main Tree View & Ribbon-bar / Toolbar

Note also that you can right-click on the title-bar for each stack item in the tree view to access a pop-up menu, and you can right-click on any object anywhere in the UI to access a context sensitive pop-up menu for that item.

License Management

Recovery Manager

Upgrade Manager

System Checklist

System Hostname & DNS management

Physical Disk Management

Identifying physical disks in an enclosure

Scanning for physical disks

Hardware Controller & Enclosure Integration

QuantaStor has custom integration modules (plug-ins) for a number of major RAID controller cards which monitor the health and status of your hardware RAID units, disks, enclosures, and controllers. When a disk failure occurs within a hardware RAID group, QuantaStor detects this and sends you an email through the QuantaStor alert management system. Note that QuantaStor also has software RAID support for RAID levels 1, 5, 6 & 10, so you do not need a hardware RAID card, but hardware RAID can boost performance and offer you additional RAID configuration options. Also, you can use any RAID controller that works with Ubuntu Server, but QuantaStor will only detect alerts and discover the configuration details of those controllers for which there is a QuantaStor hardware controller plug-in. Note that the plug-in discovery logic is triggered every couple of minutes, so in some cases there is a small delay before the information in the web interface is updated.

As of QuantaStor v2.0 there is hardware plug-in support for the following controllers:

  • Adaptec 5405, 5805
  • LSI 3ware 9750
  • LSI 3ware 9690SA
  • LSI MegaRAID / DELL PERC
    • LSI MegaRAID 9240, 9260, 9261, 9265, 9280, 9285
  • HP SmartArray P400 / others
  • Fusion IO

More RAID controller support is planned for 2012 including:

  • Adaptec 6xxx series
  • Areca

Fusion IO integration

The Fusion IO integration requires that the following packages are installed:

  • fio-util_2.2.0.82-1.0_amd64.deb
  • iomemory-vsl-2.6.35-22-server_2.2.0.82-1.0_amd64.deb

Once installed the Fusion IO control and logic devices will automatically show up in the Hardware Controller & Enclosure view within QuantaStor Manager.

LSI 3ware 9750 & 9690SA integration

LSI does an excellent job keeping their 3ware Linux drivers current and integrated with new revisions of the Linux kernel so generally there is no need to do any driver installation as part of configuring QuantaStor for use with your LSI 3ware card. The QuantaStor plug-in for LSI 3ware cards utilizes the tw_cli command line tool and there is a script located under /opt/osnexus/quantastor/raid-tools which will automate its installation. Here's how to do that:

cd /opt/osnexus/quantastor/raid-tools
sudo ./lsi3warecli-install.sh

After the tw_cli tool is installed you should verify that it is working properly by running this command:

sudo /opt/osnexus/quantastor/raid-tools/tw_cli show

If you see your controller listed then it is working properly.

If you're concerned about the driver version for your 3ware card, note that the driver is called 3w_sas so you can run this command at the console to see which driver version is running:

sudo modinfo 3w_sas

As of version 1.5.1 QuantaStor supports create/delete/identify hardware RAID units, rescan controller, disk remove/identify, and some other commands all within the QuantaStor Manager web interface. This makes it easier to set up and manage your storage system without having to utilize the 3ware BIOS and tw_cli CLI directly.

Note that if you arbitrarily remove a disk that was being utilized by a 3ware RAID unit, you cannot simply plug it back into the storage system and use it again. 3ware writes configuration data on the disk in what's called a Disk Control Block (DCB) and this needs to be scrubbed before you can use the disk again as a hot spare or within another unit. There is a great article written here on how to scrub the DCB on a disk so that you can use it again with your LSI 3ware controller. Formatting the disk in another system will also suffice. You can then add it back into the old system and designate it as a spare, and if you have a unit that is degraded it will automatically adopt the spare and begin rebuilding the unit back to a fully fault tolerant status. Of course, if you pulled the disk because it was faulty you'll want to dispose of it properly and/or return it to the manufacturer for a warranty replacement.

LSI MegaRAID / DELL PERC integration

The MegaRAID 92xx series and the DELL PERC H800 controller have been integrated with QuantaStor as of v1.5.2 and newer. To enable the hardware controller integration logic you'll first need to run a script at the console, and then you'll also want to make sure you have the latest firmware. Other MegaRAID controllers will work with QuantaStor but are not yet on our HCL and may not integrate properly with QuantaStor. To get started, first log in to your QuantaStor system at the console. You'll need to make sure that your system is network connected with internet access as it will be downloading some necessary files and packages. Next, run the following two commands to install:

cd /opt/osnexus/quantastor/raid-tools
sudo ./lsimegaraid-install.sh

That's all there is to it. It will take a couple of minutes for the QuantaStor service to detect that the MegaRAID CLI is now installed, but then you'll see the hardware configuration show up automatically in the web interface. Note that this script also upgrades the megaraid_sas driver included with QuantaStor. As such you must restart the system using the "Restart Storage System" option in the QuantaStor web management interface.

Also, new firmware is required to use the larger 3TB drives, so it is recommended that you visit the Dell or LSI web site and download the latest firmware for your RAID controller. Here's an example of how to upgrade your MegaRAID (or DELL PERC) firmware using the MegaCli64 CLI, which you'll find located under /opt/MegaRAID/MegaCli/MegaCli64 after running the install script.

root@quantastor:/opt/MegaRAID/MegaCli# ./MegaCli64 -AdpFwFlash -f FW1046E.rom -a0

Adapter 0: PERC H800 Adapter
Vendor ID: 0x1000, Device ID: 0x0079

FW version on the controller: 2.0.03-0772
FW version of the image file: 2.100.03-1046
Download Completed.
Flashing image to adapter...
Adapter 0: Flash Completed.

Exit Code: 0x00

HP SmartArray RAID integration

Like the LSI 3ware integration, the integration with the HP SmartArray series of HP RAID controllers is also handled through their CLI interface, the HP Array Configuration Utility (ACU) CLI, or hpacucli for short. An installation script is available under /opt/osnexus/quantastor/raid-tools called hpacucli-install.sh which will automate the download and installation of the HP ACU CLI. Once installed the HP hardware RAID configuration information will show up in QuantaStor automatically within a couple of minutes. Here's how to install the HP ACU CLI to activate the plug-in:

cd /opt/osnexus/quantastor/raid-tools
sudo ./hpacucli-install.sh

HP does not distribute the hpacucli in Ubuntu/Debian Linux package format, so the installation of the CLI requires a few downloads and a couple of additional Linux packages to be installed. All of this is automated by the script shown above (hpacucli-install.sh) but note that it will take a few minutes.

Managing Storage Pools

Storage pools combine or aggregate one or more physical disks (SATA, SAS, or SSD) into a single pool of storage from which storage volumes (iSCSI targets) can be created. Storage pools can be created using any of the following RAID types: RAID0, RAID1, RAID5, RAID6, or RAID10. Choosing the optimal RAID type depends on the I/O access patterns of your target application, the number of disks you have, and the amount of fault-tolerance you require. (Note: Fault tolerance is just a way of saying how many disks can fail within a storage pool (aka RAID group) before you lose data.) As a general guideline we recommend using RAID10 for all virtualization solutions and databases. RAID10 performs very well with sequential IO and random IO but is a bit more expensive since 1/2 the storage is used for fault tolerance. For archival storage or other similar workloads RAID6 and RAID5 are good choices. If you decide to use RAID6/5 with virtualization or other workloads that can produce a fair amount of random IO, we strongly recommend that you use a RAID controller with a battery backup unit/write back cache so that the RAID write penalty can be minimized.

RAID Levels

RAID1 & RAID5 allow one disk to fail without interrupting disk IO. When a disk fails you can remove it, and you should add a spare disk to the 'degraded' storage pool as soon as possible in order to restore it to a fault-tolerant status. You can also assign spare disks to storage pools ahead of time so that the recovery happens automatically. RAID6 allows up to two disks to fail and will keep running, whereas RAID10 can tolerate one disk failure per mirror pair. Finally, RAID0 is not fault tolerant at all, but it is your only choice if you have only one disk and it can be useful in some scenarios where fault-tolerance is not required. Here's a breakdown of the various RAID types and their pros & cons.

  • RAID0 layout is also called 'striping' and it writes data across all the disk drives in the storage pool in a round robin fashion. This has the effect of greatly boosting performance. The drawback of RAID0 is that it is not fault tolerant, meaning that if a single disk in the storage pool fails then all of your data in the storage pool is lost. As such RAID0 is not recommended except in special cases where the potential for data loss is a non-issue.
  • RAID1 is also called 'mirroring' because it achieves fault tolerance by writing the same data to two disk drives so that you always have two copies of the data. If one drive fails, the other has a complete copy and the storage pool continues to run. RAID1 and its variant RAID10 are ideal for databases and other applications which do a lot of small write I/O operations.
  • RAID5 achieves fault tolerance via what's called a parity calculation, where the parity is an XOR calculation of the bits on the other drives. For example, if you have 4 disk drives and you create a RAID5 storage pool, 3 disks' worth of capacity stores data and one disk's worth stores parity information (in practice the parity blocks are rotated across all the drives rather than kept on a single drive). The parity information can be used to recover from the failure of any one disk, and a replaced disk is reconstructed from the surviving disks. RAID5 (and RAID6) are especially well suited for audio/video streaming, archival, and other applications which do heavy sequential I/O (such as reading/writing large files). They are not as well suited for database applications which do heavy amounts of small random write I/O, or for large file-systems containing lots of small files with a heavy write load.
  • RAID6 improves upon RAID5 in that it can handle two drive failures but it requires that you have two disk drives dedicated to parity information. For example, if you have a RAID6 storage pool comprised of 5 disks then 3 disks will contain data, and 2 disks will contain parity information. In this example, if the disks are all 1TB disks then you will have 3TB of usable disk space for the creation of volumes. So there's some sacrifice of usable storage space to gain the additional fault tolerance. If you have the disks, we always recommend using RAID6 over RAID5. This is because all hard drives eventually fail and when one fails in a RAID5 storage pool your data is left vulnerable until a spare disk is utilized to recover your storage pool back to a fault tolerant status. With RAID6 your storage pool is still fault tolerant after the first drive failure. (Note: Fault-tolerant storage pools (RAID1,5,6,10) that have suffered a single disk drive failure are called degraded because they're still operational but they require a spare disk to recover back to a fully fault-tolerant status.)
  • RAID10 is similar to RAID1 in that it utilizes mirroring, but RAID10 also does striping over the mirrors. This gives you the fault tolerance of RAID1 combined with the performance of RAID0. The drawback is that half the disks are used for fault-tolerance, so if you have eight 1TB disks utilized to make a RAID10 storage pool, you will have 4TB of usable space for creation of volumes. RAID10 will perform very well with both small random IO operations as well as sequential operations and it is highly fault tolerant, as multiple disks can fail as long as they're not from the same mirror-pairing. If you have the disks and you have a mission critical application we highly recommend that you choose the RAID10 layout for your storage pool.
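
The XOR parity calculation described for RAID5 above can be demonstrated in a few lines of Python. This is a conceptual sketch with made-up two-byte "disks", not how QuantaStor or any RAID controller lays out data; the function name `xor_blocks` is hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks, as in the 4-disk RAID5 example (3 data + 1 parity).
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_blocks(data)

# Simulate losing the second block and rebuilding it from the
# surviving data blocks plus the parity block.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])   # True
```

Because XOR is its own inverse, XOR-ing the surviving blocks with the parity block yields exactly the missing block, which is why a single failed disk can be reconstructed.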

In many cases it is useful to create more than one storage pool so that you have both basic low cost fault-tolerant storage available from perhaps a RAID5 storage pool, as well as a highly fault-tolerant RAID10 or RAID6 storage pool available for mission critical applications.

Once you have created a storage pool it will take some time to 'rebuild'. Once the 'rebuild' process has reached 1% you will see the storage pool appear in QuantaStor Manager and you can begin to create new storage volumes.

WARNING: Although you can begin using the pool at 1% rebuild completion, your storage pool is not fault-tolerant until the rebuild process has completed.

Target Port Configuration

Target ports are simply the network ports (NICs) through which your client hosts (initiators) access your storage volumes (aka targets). The terms 'target' and 'initiator' are SCSI terms that are analogous to 'server' and 'client' respectively. QuantaStor supports both statically assigned IP addresses as well as dynamically assigned (DHCP) addresses. If you selected automatic network configuration when you initially installed QuantaStor then you'll have one port set up with DHCP and the others are likely offline. We recommend that you always use static IP addresses unless you have your DHCP server set up to specifically assign an IP address to your NICs as identified by MAC address. If you don't set the target ports up with static IP addresses you risk the IP address changing and losing access to your storage when the dynamically assigned address expires.

To modify the configuration of a target port, first select the tree section named "Storage System" under the "Storage Management" tab on the left hand side of the screen. After that, select the "Target Ports" tab in the center of the screen to see the list of target ports that were discovered. To modify the configuration of one of the ports, simply right-click on it and choose "Modify Target Port" from the pop-up menu. Alternatively you can press the "Modify" button in the tool bar at the top of the screen in the "Target Ports" section.

Once the "Modify Target Port" dialog appears you can select the target port type for the selected port (static) and enter the IP address, subnet mask, and gateway for the port. You can also set the MTU to 9000 for jumbo packet support, but we recommend that you first get your network configuration up and running with standard 1500 byte frames, as jumbo packet support requires that you custom configure your host side NICs and network switch with 9K frames as well.
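
Before typing static settings into the "Modify Target Port" dialog it can help to sanity-check them. The following Python sketch uses only the standard `ipaddress` module; the function name and the example addresses are hypothetical, and the checks are general networking hygiene rather than anything QuantaStor enforces.

```python
import ipaddress


def check_static_port_config(ip, netmask, gateway, mtu=1500):
    """Return a list of notes about a planned static target port configuration."""
    notes = []
    # strict=False lets us pass a host address and derive its network.
    network = ipaddress.ip_network(f"{ip}/{netmask}", strict=False)
    if ipaddress.ip_address(gateway) not in network:
        notes.append(f"gateway {gateway} is outside the port's subnet {network}")
    if mtu == 9000:
        notes.append("MTU 9000: host NICs and the switch must also use 9K frames")
    elif mtu != 1500:
        notes.append(f"unusual MTU {mtu}: this guide only discusses 1500 and 9000")
    return notes


print(check_static_port_config("192.168.0.10", "255.255.255.0", "192.168.0.1"))
print(check_static_port_config("192.168.0.10", "255.255.255.0", "10.0.0.1", mtu=9000))
```

An empty list means nothing suspicious was found in the planned settings.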

NIC Bonding / Trunking

QuantaStor supports NIC bonding, also called trunking, which allows you to combine multiple NICs together to improve performance and reliability. If you combine two or more ports together into a virtual port you'll need to make sure that all the bonded ports are connected to the same network switch. There are very few exceptions to this rule. For example, if you have two networks and 4 ports (p1, p2, p3, p4) you'll want to create two separate virtual ports, each bonding two NIC ports (p1, p2 / p3, p4) together, with each pair connected to a separate network (p1, p2 -> network A / p3, p4 -> network B). This type of configuration is highly recommended as you have both improved bandwidth and no single point of failure in the network or in the storage system. Of course you'll need your host to have at least 2 NIC ports and they'll each need to connect to the separate networks. For very simple configurations you can just connect everything to one switch, but again, the more redundancy you can work into your SAN the better.

By default, QuantaStor v2 uses Linux bonding mode-0, a round-robin policy. This mode provides load balancing and fault tolerance by transmitting packets in sequential order from the first available interface through the last. QuantaStor will also support LACP 802.3ad Dynamic Link aggregation. Currently this requires changing Linux configuration files.
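
To see which bonding mode a virtual port is actually running, you can read the status file that the Linux bonding driver exposes under /proc/net/bonding. The sketch below assumes the bonded interface is named bond0, which may differ on your system, and the helper name is hypothetical.

```shell
#!/bin/bash
# Print the bonding mode reported by the Linux bonding driver.
# Argument: path to a bonding status file (the default assumes an interface named bond0).
bond_mode() {
    local status_file="${1:-/proc/net/bonding/bond0}"
    awk -F': ' '/^Bonding Mode:/ { print $2 }' "$status_file"
}

# Example: bond_mode /proc/net/bonding/bond0
# Mode 0 is reported as "load balancing (round-robin)";
# LACP is reported as "IEEE 802.3ad Dynamic link aggregation".
```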

10GbE NIC support

QuantaStor works with all the major 10GbE cards from Chelsio, Intel and others. We recommend the Chelsio N320E cards and you can use NIC bonding in conjunction with 10GbE to further increase bandwidth. If you are using 10GbE we recommend that you designate your slower 1GbE ports as iSCSI disabled so that they are only used for management traffic.

Remote-Replication Configuration

Remote replication allows you to copy a volume or network share from one QuantaStor storage system to another and is a great tool for migrating volumes and network shares between systems and for using a remote system as a DR site. Remote replication is done asynchronously which means that changes to volumes and network shares on the original/source system are not kept in lock step with the copy on the remote storage system. Rather, the changes from the source system are replicated periodically to the target system, typically every hour or perhaps nightly using a replication schedule. Once a given set of the volumes and/or network shares have been replicated from one system to another the subsequent periodic replication operations send only the changes and all information sent over the network is compressed to minimize network bandwidth and encrypted for security.

Creating a Storage System Link

The first step in setting up remote replication between two systems is to create a Storage System Link between the two. This is accomplished through the QuantaStor Manager web interface by selecting the 'Remote Replication' tab, and then pressing the 'Create Storage System Link' button in the tool bar to bring up the dialog. To create a storage system link you must provide the IP address of the remote system and the admin username and password for that remote system. You must also indicate the local IP address that the remote system will utilize for communication between the remote and local system. If both systems are on the same network then you can simply select one of the IP addresses from one of the local ports, but if the remote system is in the cloud or at a remote location then most likely you will need to specify the external IP address for your QuantaStor system. Note that the two systems communicate over ports 22 and 5151, so you will need to open these ports in your firewall in order for the QuantaStor systems to link up properly.
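
If link creation fails, a quick reachability test of ports 22 and 5151 can show whether a firewall is in the way. This is a rough sketch that assumes bash and the coreutils timeout command are available on the machine you test from. The function names and the example address are hypothetical.

```shell
#!/bin/bash
# Return success if a TCP connection to host:port can be opened within 3 seconds.
# Uses bash's built-in /dev/tcp redirection, so no extra network tools are needed.
port_open() {
    local host="$1" port="$2"
    timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Report the state of both ports that storage system links use (22 and 5151).
check_link_ports() {
    local host="$1" port
    for port in 22 5151; do
        if port_open "$host" "$port"; then
            echo "port ${port}: reachable"
        else
            echo "port ${port}: blocked or closed"
        fi
    done
}

# Example (hypothetical address): check_link_ports 203.0.113.10
```

Run the check in both directions, since each system must be able to reach the other.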

Creating a Remote Replica

Once you have a Storage System Link created between two systems you can replicate volumes and network shares in either direction. Simply log in to the system that you want to replicate volumes from, right-click on the volume to be replicated, then choose 'Create Remote Replica'. Creating a remote replica is much like creating a local clone, only the data is being copied over to a storage pool in a remote storage system. As such, when you create a remote replica you must specify which storage system you want to replicate to (only systems which have established and online storage system links will be displayed) and which storage pool within that system should be utilized to hold the remote replica. If you have already replicated the specified volume to the remote storage system then you can re-sync the remote volume by choosing the remote-replica association in the web interface and choosing 'resync'. This can also be done via the 'Create Remote Replica' dialog by choosing the option to replicate to an existing target, if available.


Alert Settings

QuantaStor allows you to thin-provision and over-provision storage, but that feature comes with the associated risk of running out of disk space. As such, you will want to make sure that you configure and test your alert configuration settings in the Alert Manager. The Alert Manager allows you to specify at which thresholds you want to receive email regarding low disk space alerts for your storage pools. It also lets you specify the SMTP settings for routing email.

Managing Hosts

Hosts represent the client computers that you assign storage volumes to. In SCSI terminology the host computers initiate the communication with your storage volumes (target devices) and so they are called initiators. Each host entry can have one or more initiators associated with it because an iSCSI initiator (Host) can be identified by IP address, by IQN, or by both at the same time. We recommend using the IQN (iSCSI Qualified Name) at all times, as you can have login problems when you try to identify a host by IP address, especially when that host has multiple NICs and they're not all specified.
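
On a Linux host that uses the open-iscsi initiator, the IQN to enter for the host is normally stored in /etc/iscsi/initiatorname.iscsi. The sketch below reads it from there. The path is the usual open-iscsi default and is an assumption about your host, and the helper name is hypothetical. Other initiators, such as those on Windows or VMware, show the IQN in their own management tools.

```shell
#!/bin/bash
# Print the initiator IQN configured for open-iscsi on a Linux host.
# Argument: path to the initiator name file (defaults to the usual open-iscsi location).
initiator_iqn() {
    local name_file="${1:-/etc/iscsi/initiatorname.iscsi}"
    sed -n 's/^InitiatorName=//p' "$name_file"
}

# Example: initiator_iqn
```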

Managing Host Groups

Sometimes you'll have multiple hosts that need to be assigned the same storage volume(s) such as with a VMware or a XenServer resource pool. In such cases we recommend making a Host Group object which indicates all of the hosts in your cluster/resource pool. With a host group you can assign the volume to the group once and save a lot of time. Also, when you add another host to the host group, it automatically gets access to all the volumes assigned to the group so it makes it very easy to add nodes to your cluster and manage storage from a group perspective rather than individual hosts which can be cumbersome especially for larger clusters.

Managing Snapshot Schedules

Snapshot schedules enable you to have your storage volumes automatically protected on a regular schedule by creating snapshots of them. You can have more than one snapshot schedule, and each schedule can be associated with any storage volumes, even those utilized in other snapshot schedules. In fact, this is something we recommend. For storage volumes containing critical data you should create a snapshot schedule that makes a snapshot of your volumes at least once a day, and we recommend that you keep around 10-20 snapshots so that you have a week or two of snapshots that you can recover from. A second schedule that creates a single snapshot of your critical volumes on the weekend is also recommended. If you set that schedule to retain 10 snapshots, that will give you over two months of historical snapshots from which you can recover data.
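
The retention figures above follow from simple arithmetic: the window of history is the snapshot interval multiplied by the number of snapshots retained. A small sketch, with a hypothetical helper name:

```shell
#!/bin/bash
# Days of history covered by a schedule that takes one snapshot every
# interval_days and retains the most recent `retained` snapshots.
retention_window_days() {
    local interval_days="$1" retained="$2"
    echo $(( interval_days * retained ))
}

echo "daily, keep 14:  $(retention_window_days 1 14) days"
echo "weekly, keep 10: $(retention_window_days 7 10) days"
```

A weekly schedule retaining 10 snapshots covers 70 days, which is where the "over two months" figure comes from.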

Near Continuous Data Protection (N-CDP)

What all this boils down to is a feature we in the storage industry refer to as continuous data protection or CDP. True CDP solutions allow you to recover to any prior point in time at the granularity of seconds. So if you wanted to see what a storage volume looked like at 5:14am on Saturday you could look at a 'point-in-time' view of that storage volume at that exact moment. Storage systems that allow you to create a large number of snapshots, thereby giving you the ability to roll back or recover from a snapshot that was created perhaps every hour, are referred to as NCDP or "near continuous data protection" solutions, and that's exactly what QuantaStor is. This NCDP capability is achieved through snapshot schedules, so be sure to set one up to protect your critical volumes and network shares.

Managing Sessions

The list of active iSCSI sessions with the storage system can be found by selecting the 'Storage System' tree-tab in QuantaStor Manager then selecting the 'Sessions' tab in the center view. Here's a screenshot of a list of active sessions as shown in QuantaStor Manager.

Session List

Dropping Sessions

To drop an iSCSI session, just right-click on it and choose 'Drop Session' from the menu.

Drop Session Dialog

Keep in mind that some initiators will automatically re-establish a new iSCSI session if one is dropped by the storage system. To prevent this, just unassign the storage volume from the host so that the host cannot re-login.

Managing Storage Volumes

Each storage volume is a unique iSCSI device or 'LUN' as it is often referred to in the storage industry. The storage volume is essentially a disk drive on the network (the SAN) that you can assign to any host in your environment.

Creating Storage Volumes

Storage volumes can be provisioned 'thick' or 'thin' which indicates whether the storage for the volume should be fully reserved (thick) or not (thin). As an example, a 100GB storage volume in a 1TB storage pool will only use 4KB of disk space in the pool when it is initially created leaving .99TB of disk space left over for use with other volumes and additional volume provisioning. In contrast, if you choose 'thick' provisioning by unchecking the 'thin provisioning' option then the entire 100GB will be pre-reserved. The advantage there is that that volume can never run out of disk space due to low storage availability in the pool but since it is reserved up front you will have 900GB free in your 1TB storage pool after it has been allocated so you can end up using up your available disk space fairly rapidly using thick provisioning.

Deleting Storage Volumes

There are two separate dialogs in QuantaStor Manager for deleting storage volumes. If you press the "Delete Volume(s)" button in the ribbon bar you will be presented with a dialog that allows you to delete multiple volumes all at once, and you can even search for volumes based on a partial name match. This can save a lot of time when you're trying to delete multiple volumes. You can also right-click on a storage volume and choose 'Delete Volume', which brings up a dialog for deleting just that volume. If there are snapshots of the volume you are deleting, they are not deleted; rather, they are promoted. For example, if you have snapshots S1, S2 of volume A1 then the snapshots will become root/primary storage volumes after A1 is deleted. Once a storage volume is deleted all the data is gone, so use extreme caution when deleting your storage volumes to make sure you're deleting the right volumes. Technically, storage volumes are internally stored as files on an ext4 or btrfs filesystem so it is possible that you could use a filesystem file recovery tool to recover a lost volume, but generally speaking you would need to hire a company that specializes in data recovery to get this data back.

Resizing Storage Volumes

Creating Snapshots

QuantaStor snapshots are probably not like any snapshots you've used with any other storage vendor on the market. Some key features of QuantaStor volume snapshots include:

  • massive scalability
    • create hundreds of snapshots in just seconds
  • supports snapshots of snapshots
    • you can create snapshots of snapshots of snapshots, ad infinitum.
  • snapshots are R/W by default, read-only snapshots are also supported
  • snapshots perform extremely well even when large numbers exist
  • snapshots can be converted into primary storage volumes instantly
  • you can delete snapshots at any time and in any order
  • snapshots are 'thin', that is they are a copy of the meta-data associated with the original volume and not a full copy of all the data blocks.

All of these advanced snapshot capabilities make QuantaStor ideally suited for virtual desktop solutions, off-host backup, and near continuous data protection (NCDP). If you're looking to get NCDP functionality, just create a 'snapshot schedule' and snapshots can be created for your storage volumes as frequently as every hour.

To create a snapshot or a batch of snapshots you'll want to select the storage volume that you wish to snap, right-click on it and choose 'Snapshot Storage Volume' from the menu.

If you do not supply a name then QuantaStor will automatically choose a name for you by appending the suffix "_snap" to the end of the original's volume name. So if you have a storage volume named 'vol1' and you create a snapshot of it, you'll have a snapshot named 'vol1_snap000'. If you create many snapshots then the system will increment the number at the end so that each snapshot has a unique name.
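
As an illustration of the naming pattern, the sketch below generates names in the same style. The three-digit, zero-padded counter starting at 000 is inferred from the 'vol1_snap000' example above and is an assumption rather than a documented format. The helper name is hypothetical.

```shell
#!/bin/bash
# Build a default snapshot name in the style shown above: <volume>_snap<NNN>.
# The zero-padded three-digit counter is inferred from the 'vol1_snap000'
# example; it is an assumption, not a documented format.
default_snapshot_name() {
    local volume="$1" index="$2"
    printf '%s_snap%03d\n' "$volume" "$index"
}

# Names for the first three snapshots of a volume named vol1.
for i in 0 1 2; do
    default_snapshot_name vol1 "$i"
done
```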

Creating Clones

Clones represent complete copies of the data blocks in the original storage volume, and a clone can be created in any storage pool in your storage system whereas a snapshot can only be created within the same storage pool as the original. You can create a clone at any time and while the source volume is in use because QuantaStor creates a temporary snapshot in the background to facilitate the clone process. The temporary snapshot is automatically deleted once the clone operation completes. Note also that you cannot use a cloned storage volume until the data copy completes. You can monitor the progress of the cloning by looking at the Task bar at the bottom of the QuantaStor Manager screen. In contrast to clones, snapshots are created near instantly and do not involve data movement so you can use them immediately.

Restoring from Snapshots

If you've accidentally lost some data by inadvertently deleting files in one of your storage volumes, you can recover your data quickly and easily using the 'Restore Storage Volume' operation. To restore your original storage volume to a previous point in time, first select the original, then right-click on it and choose "Restore Storage Volume" from the pop-up menu. When the dialog appears you will be presented with all the snapshots of that original from which you can recover. Just select the snapshot that you want to restore to and press OK. Note that you cannot have any active sessions to the original or the snapshot storage volume when you restore; if you do, you'll get an error. This is to prevent the restore from taking place while the OS has the volume in use or mounted, as this would lead to data corruption.

WARNING: When you restore, the data in the original is replaced with the data in 
the snapshot.  As such, there's a possibility of losing data as everything that 
was written to the original since the time the snapshot was created will be lost.  
Remember, you can always create a snapshot of the original before you restore it 
to a previous point-in-time snapshot.

Converting a Snapshot into a Primary

A primary volume is simply a storage volume that's not a snapshot of any other storage volume. With QuantaStor you can take any snapshot and make it a primary storage volume very easily. Just select the storage volume in QuantaStor Manager, then right-click and choose 'Modify Storage Volume' from the pop-up menu. Once you're in the dialog, just un-check the box marked "Is Snapshot?". If the snapshot has snapshots of its own then those snapshots will be connected to the previous parent volume of the snapshot. This conversion of snapshot to primary does not involve data movement so it's near instantaneous. After the snapshot becomes a primary it will still have data blocks in common with the storage volume it was previously a snapshot of, but that relationship is cleared from a management perspective.

IO Tuning

QuantaStor has a number of tunable parameters in the /etc/quantastor.conf file that can be adjusted to better match the needs of your application. That said, we've spent a considerable amount of time tuning the system to efficiently support a broad set of application types so we do not recommend adjusting these settings unless you are a highly skilled Linux administrator. The default contents of the /etc/quantastor.conf configuration file are as follows:

[device]
nr_requests=2048
scheduler=deadline
read_ahead_kb=512

[mdadm]
chunk_size_kb=256
parity_layout=left-symmetric

[btrfs]
nodatasum=false

There are tunable settings for device parameters, md array chunk-size and parity configuration settings, as well as some settings for btrfs. These configuration settings are read from the configuration file dynamically each time one of the settings is needed so there's no need to restart the quantastor service. Simply edit the file and the changes will be applied to the next operation that utilizes them. For example, if you adjust the chunk_size_kb setting for mdadm then the next time a storage pool is created it will use the new chunk size. Other tunable settings like the device settings will automatically be applied within a minute or so of your changes because the system periodically checks the disk configuration and updates it to match the tunable settings. Also, you can delete the quantastor.conf file and it will automatically use the defaults that you see listed above.
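
Before and after editing the file it can be useful to confirm what value a given setting currently has. The following is a minimal sketch of a reader for this INI-style layout. The conf_get helper is hypothetical and not a QuantaStor tool, and it assumes the simple key=value format shown above, with no spaces around the equals sign.

```shell
#!/bin/bash
# Print the value of `key` inside `[section]` of an INI-style file.
# Arguments: file, section, key. Prints nothing if the key is absent.
conf_get() {
    local file="$1" section="$2" key="$3"
    awk -F'=' -v section="[$section]" -v key="$key" '
        /^\[/                   { in_section = ($0 == section); next }
        in_section && $1 == key { print $2; exit }
    ' "$file"
}

# Example: conf_get /etc/quantastor.conf mdadm chunk_size_kb
```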

Troubleshooting

Resetting the admin password

If you forget the admin password you can reset it by logging into the system via the console or via SSH and then run these commands:

# Become root
sudo -i
cd /opt/osnexus/quantastor/bin
# Stop the service, reset the admin password, then start the service again
service quantastor stop
./qs_service --reset-password=newpass
service quantastor start

In the above example the new password for the system is set to 'newpass' but you can change that to anything of your choice.

Storage pool creation fails at 16%

Many motherboards include onboard RAID support which in some cases can conflict with the software RAID mechanism QuantaStor utilizes. There's an easy fix for this: simply remove the driver using these two commands after logging in via the console as 'qadmin':

sudo apt-get remove dmraid
sudo update-initramfs -u

Here are a couple of articles that go into the problem in more detail here and here.

The two commands noted above remove the dmraid driver that Linux utilizes to communicate with the RAID chipset in your BIOS. Once it is removed the devices will no longer be locked down, so the software RAID mechanism we utilize (mdadm) is then able to use the disks.