ClustrixDB can export Probability Distributions (PDs), which, together with the schema (DDL), are useful for investigating the performance of a query. Typically, this procedure is performed at the request of Clustrix Support. The following section outlines how to export PDs using a Python script, sierra_stats.
sierra_stats is installed by default in the /opt/clustrix/bin/ directory. The required inputs are --host, --user, -p (prompt for password), and --file, which names the .tar.gz file to produce.
./sierra_stats -u username -H hostname_or_ip -p --db database -f filename [space_separated_list_of_tables]
Running sierra_stats requires read-only access to SYSTEM tables. If no list of tables is specified, sierra_stats exports all the tables in the one database specified. To export tables from multiple databases, provide a list of fully qualified table names.
sierra_stats requires Python version 2.7.
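As an illustration of the usage line above, the following minimal Python sketch assembles the sierra_stats argument list, first for a whole-database export and then for an export of selected tables. The helper name build_sierra_stats_args, the sample user, host, database, and table names, and the database.table form of a fully qualified name are assumptions made for illustration. Only the -u, -H, -p, --db, and -f options and the install path are taken from the text above, and the sketch keeps --db in both cases because the usage line shows it.

```python
def build_sierra_stats_args(user, host, db, outfile, tables=()):
    """Return the argument list for a sierra_stats export (hypothetical helper)."""
    args = [
        "/opt/clustrix/bin/sierra_stats",  # default install location
        "-u", user,
        "-H", host,
        "-p",                              # prompt for the password
        "--db", db,
        "-f", outfile,                     # name of the .tar.gz file to produce
    ]
    # Optional space-separated list of tables; omitted means all tables in db.
    args.extend(tables)
    return args


# All tables in one database (sample values are placeholders).
whole_db = build_sierra_stats_args("support_ro", "10.0.0.5", "sales", "sales_pds.tar.gz")
print(" ".join(whole_db))

# Selected tables, written here in an assumed database.table form.
selected = build_sierra_stats_args(
    "support_ro", "10.0.0.5", "sales", "mixed_pds.tar.gz",
    tables=["sales.orders", "inventory.items"],
)
print(" ".join(selected))
```

On a host where sierra_stats is installed, the resulting list could be passed to subprocess.call to run the export; the sketch only prints the command so that it can be reviewed first.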
The output of sierra_stats is a .tar.gz file consisting of:
Show help message and exit
Clustrix node hostname to connect to. If the host for the user in system.users is 127.0.0.1, that address must be used as the hostname.
Username to connect as
Database port (default: 3306)
Provide password on CLI (not recommended)
Prompt for password (recommended)
Database name to dump tables from
Filename for export output
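Because the export is written as a standard .tar.gz archive, its contents can be checked before the file is sent on. The following is a minimal sketch using Python's tarfile module; the function name list_export_members and the sample filename are assumptions for illustration, and nothing here is part of sierra_stats itself.

```python
import tarfile


def list_export_members(path):
    """Return the names of the files stored in a .tar.gz archive."""
    with tarfile.open(path, "r:gz") as archive:
        return archive.getnames()


# Example (assumes an export named sales_pds.tar.gz exists in the
# current directory):
#     for name in list_export_members("sales_pds.tar.gz"):
#         print(name)
```

Listing the member names only reads the archive and does not extract or modify it.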