ClustrixDB monitors the amount of space available within your cluster and proactively warns of potential capacity issues. The thresholds for determining a cluster’s capacity are configurable and described below.
Types of Storage
To manage device and database utilization, you first need to understand how ClustrixDB allocates disk space. ClustrixDB creates and allocates space in three files:
device1 (permanent storage)
Permanent storage is used for all database data, undo logs, temporary tables, binlogs, and ClustrixDB system usage. Such data resides in the device1 file. For new installations, the initial allocation size of device1 is configurable when ClustrixDB is installed. It can be extended after installation using ALTER CLUSTER RESIZE DEVICES. The size of the device1 file on disk cannot easily be decreased.
ClustrixDB expects the device1 file to be the same size on every node. On database startup, ClustrixDB automatically attempts to resize the device1 file on each node to match the largest device1 file present in the cluster, provided that the global variable device_auto_resize_to_largest is set to true (the default on new installations).
device1-temp (temporary storage)
Temporary storage is used for sorting and grouping of large query results. When required, temporary information is stored in the device1-temp file. By default, this is set to 5GB (per node), but it is configurable with the global variable device_temporary_space_limit_bytes. Increasing this value will take effect immediately, while decreasing it will only take effect at database start. ClustrixDB will attempt to make the device1-temp file the same size on every node.
device1-redo (write-ahead log)
The write-ahead log (WAL) is stored in the device1-redo file. The size of this file is 4GB and is not configurable.
Checking Storage Utilization
See how much space is in use by using the CLX Command-Line Administration Tool:
shell> /opt/clustrix/bin/clx space
nid | Hostname    | Status | Undo           | Perm           | WAL             | Temp      | Used           | DB Total | FS Free
----+-------------+--------+----------------+----------------+-----------------+-----------+----------------+----------+--------
16  | eukanuba003 | OK     | 321.8M (0.04%) | 674.7G (79.4%) | 1024.0M (0.12%) | 0 (0.00%) | 760.1G (89.4%) | 850.0G   | 113.9G
17  | karma183    | OK     | 313.5M (0.04%) | 664.6G (78.2%) | 1024.0M (0.12%) | 0 (0.00%) | 750.1G (88.2%) | 850.0G   | 113.9G
18  | eukanuba002 | OK     | 324.3M (0.04%) | 669.5G (78.8%) | 1024.0M (0.12%) | 0 (0.00%) | 755.0G (88.8%) | 850.0G   | 113.9G
19  | eukanuba001 | OK     | 339.7M (0.04%) | 671.0G (78.9%) | 1024.0M (0.12%) | 0 (0.00%) | 756.4G (89.0%) | 850.0G   | 113.9G
20  | eukanuba005 | OK     | 277.3M (0.03%) | 668.7G (78.7%) | 1024.0M (0.12%) | 0 (0.00%) | 754.1G (88.7%) | 850.0G   | 113.9G
21  | eukanuba004 | OK     | 420.3M (0.05%) | 678.6G (79.8%) | 1024.0M (0.12%) | 0 (0.00%) | 764.1G (89.9%) | 850.0G   | 113.9G
22  | eukanuba006 | OK     | 397.0M (0.05%) | 670.4G (78.9%) | 1024.0M (0.12%) | 0 (0.00%) | 755.9G (88.9%) | 850.0G   | 113.9G
23  | karma184    | OK     | 479.9M (0.06%) | 674.8G (79.4%) | 1024.0M (0.12%) | 0 (0.00%) | 760.3G (89.5%) | 850.0G   | 113.9G
----+-------------+--------+----------------+----------------+-----------------+-----------+----------------+----------+--------
                             2.8G (0.04%)   | 5.2T (79.0%)   | 8.0G (0.12%)    | 0 (0.00%) | 5.9T (89.1%)   | 6.6T     | 910.9G
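For scripted monitoring, the per-node Used percentage can be pulled out of this output. The following is a minimal Python sketch, assuming the pipe-delimited layout and column order shown in the sample above. The helper names parse_row and used_percent are hypothetical and are not part of the clx tool, and the layout may differ between ClustrixDB versions.

```python
import re

# One data row copied from the sample `clx space` output above.
SAMPLE_ROW = (
    " 16 | eukanuba003 | OK | 321.8M (0.04%) | 674.7G (79.4%) | "
    "1024.0M (0.12%) | 0 (0.00%) | 760.1G (89.4%) | 850.0G | 113.9G"
)

# Column order as it appears in the header of the sample output (an assumption
# about the layout, not a documented contract).
COLUMNS = ["nid", "Hostname", "Status", "Undo", "Perm", "WAL",
           "Temp", "Used", "DB Total", "FS Free"]

# Matches the "(89.4%)" part of a cell such as "760.1G (89.4%)".
PERCENT = re.compile(r"\(([\d.]+)%\)")


def parse_row(line):
    """Split one pipe-delimited data row into a dict keyed by column name."""
    cells = [cell.strip() for cell in line.split("|")]
    if len(cells) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} cells, got {len(cells)}")
    return dict(zip(COLUMNS, cells))


def used_percent(row):
    """Return the percentage shown in the Used column as a float."""
    match = PERCENT.search(row["Used"])
    if match is None:
        raise ValueError(f"no percentage found in {row['Used']!r}")
    return float(match.group(1))


row = parse_row(SAMPLE_ROW)
print(row["Hostname"], used_percent(row))
```

The header, separator, and totals lines have a different number of cells, so parse_row rejects them with a ValueError; a caller would skip those lines.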
The default values for these global variables are optimal for most workloads.
Device Configuration Variables
These variables control the uniformity of device1 size throughout the cluster and the size of the device1-temp space, respectively.
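device_auto_resize_to_largest: Automatically resize all (online) devices in the cluster to match the largest device. Default on new installations: true.
device_temporary_space_limit_bytes: Maximum number of bytes that may be used for temporary containers. Default: 5GB per node.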
Database Storage Thresholds
Global variables establish the database storage thresholds for a cluster. When the first threshold is exceeded, alerts are sent. If storage utilization continues to increase, user queries begin to fail once the next threshold is exceeded. Finally, if storage utilization continues to grow, system queries (including those for critical internal processes) are killed. Once the database is completely full, it may become inoperable. See Resolving Low Space Issues below for suggestions on freeing space.
The following variables are used to set thresholds for device1 utilization.
|Variable||Description||Default Value||Allowed Values|
|databasefull_message_interval_s||Database guard rail message interval in seconds.|| || |
|databasefull_user_warn_percentage||Warn about user queries when space usage surpasses this percentage.|| ||Maximum: databasefull_user_error_percentage - 1|
|databasefull_user_error_percentage||Fail user queries when space usage surpasses this percentage.|| ||Minimum: databasefull_user_warn_percentage + 1; Maximum: databasefull_system_warn_percentage - 1|
|databasefull_system_warn_percentage||Warn about system queries when space usage surpasses this percentage.|| ||Minimum: databasefull_user_error_percentage + 1; Maximum: databasefull_system_error_percentage - 1|
|databasefull_system_error_percentage||Fail system queries when space usage surpasses this percentage.|| ||Minimum: databasefull_system_warn_percentage + 1|
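Taken together, the minimum and maximum rules above mean the four percentage thresholds must be strictly increasing, from the user warning level up to the system error level. The following Python sketch checks a proposed set of values against that rule before they are applied. The numbers are hypothetical and chosen only for illustration (they are not the documented defaults), and the validate helper is not part of ClustrixDB.

```python
# Threshold variables in the order implied by the minimum/maximum rules above.
ORDER = [
    "databasefull_user_warn_percentage",
    "databasefull_user_error_percentage",
    "databasefull_system_warn_percentage",
    "databasefull_system_error_percentage",
]

# Hypothetical values for illustration only; NOT the documented defaults.
proposed = {
    "databasefull_user_warn_percentage": 80,
    "databasefull_user_error_percentage": 90,
    "databasefull_system_warn_percentage": 95,
    "databasefull_system_error_percentage": 98,
}


def validate(values):
    """Return a list of rule violations; an empty list means the values are consistent."""
    problems = []
    # Each threshold must be at most (next threshold - 1), i.e. strictly smaller.
    for lower, upper in zip(ORDER, ORDER[1:]):
        if values[lower] > values[upper] - 1:
            problems.append(
                f"{lower} ({values[lower]}) must be at most "
                f"{upper} - 1 ({values[upper] - 1})"
            )
    return problems


print(validate(proposed))
```

Checking values this way before changing a global catches a combination such as a user warning threshold set above the user error threshold.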
The following alerts are triggered when the corresponding global variable is exceeded. The thresholds are evaluated each time ClustrixDB allocates space, and any necessary alerts are sent every databasefull_message_interval_s seconds. If multiple alerts are detected, only the most critical appears. To learn more about ClustrixDB's Alerter, see Database Alerts.
|Global Variable Evaluated||Alert Triggered||Level||Description||Message Shown|
|databasefull_user_warn_percentage|| || ||Database space low||Database space is nn% used. Soon user queries will fail.|
|databasefull_user_error_percentage|| || ||Database space extreme||Database space is nn% used. User queries will now fail.|
|databasefull_system_warn_percentage|| || ||Database space critical||Database space is nn% used. User queries will fail, and soon system queries will fail.|
|databasefull_system_error_percentage|| || ||Database space exhausted||Database space is nn% used. User queries and system queries will now fail.|
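The rule that only the most critical alert appears can be sketched as a check from the highest threshold downward. This Python sketch is an illustration of that selection logic, not ClustrixDB's actual implementation. The threshold values are hypothetical (not the documented defaults), and the pairing of each description with a variable follows the order of the thresholds described above.

```python
# Alert descriptions paired with the variable they are evaluated against,
# ordered from most critical to least critical.
LEVELS = [
    ("databasefull_system_error_percentage", "Database space exhausted"),
    ("databasefull_system_warn_percentage", "Database space critical"),
    ("databasefull_user_error_percentage", "Database space extreme"),
    ("databasefull_user_warn_percentage", "Database space low"),
]

# Hypothetical thresholds for illustration only; NOT the documented defaults.
THRESHOLDS = {
    "databasefull_user_warn_percentage": 80,
    "databasefull_user_error_percentage": 90,
    "databasefull_system_warn_percentage": 95,
    "databasefull_system_error_percentage": 98,
}


def most_critical_alert(used_percent, thresholds=THRESHOLDS):
    """Return the description of the most critical alert, or None if no threshold is surpassed."""
    for variable, description in LEVELS:
        # "Surpasses" is read here as strictly greater than the threshold.
        if used_percent > thresholds[variable]:
            return description
    return None


for used in (50.0, 89.4, 96.0, 99.0):
    print(used, most_critical_alert(used))
```

With these example thresholds, a node at 89.4% used maps to "Database space low", because it has passed the user warning level but not the user error level.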
Resolving Low Space Issues
When you receive any of the alerts above, some action will be necessary to prevent the capacity of device1 from reaching the next threshold.
Some resolutions to consider:
Add nodes to the cluster by Expanding Your Cluster's Capacity - Flex Up.
Increase available space on the cluster by:
Removing temporary tables
Enlarge the size of the device1 file on all nodes by using ALTER CLUSTER RESIZE DEVICES.
Terminate and reschedule long-running operations such as ALTERs, backups, and long-running transactions. These halt garbage collection and cause the undo log to grow temporarily.
If you need assistance, please contact Clustrix Support.