
...


Obtaining clustrix_import

The tool should already be available on your cluster, located in /opt/clustrix/bin. If it is not available there, please contact Clustrix Support to get the appropriate version. Note that it may be necessary to add execute permissions:

shell> chmod a+x clustrix_import 

Sample Usage of clustrix_import

clustrix_import is highly configurable (see the options below), but the only required input parameter is the -i input file. Once you have generated a dump file and obtained clustrix_import, simply run it:

Local Invocation on Node

shell> clustrix_import -u importuser -p sekrit -i /clustrix/bigdump.sql 

The simplest use case is to run clustrix_import directly on a node, specifying just the dump file. In this case clustrix_import connects through the MySQL socket to obtain the IP of each node. 

Note

You should have already created a user with the proper permissions for importing, e.g.:

mysql> grant all on *.* to '<importuser>'@'%' identified by '<importuserpasswd>';

Compressed Dumps

shell> zcat bigdb.mysqldump.sql.gz | clustrix_import -i - -H clx -u importuser -p sekrit --scan-for-newlines=0 --initial-slices=12 --slice-down-threshold=2
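This example decompresses the dump on the fly with zcat and streams it into clustrix_import on standard input (-i - tells clustrix_import to read from stdin), so the compressed dump never has to be unpacked to disk first.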

...

shell> mysqldump -h mysqlhost -u root -pmysekrit somedb | tee /tmp/somedb.sql | clustrix_import -i - -H clx -D somedb -u importuser -p sekrit

This example demonstrates piping a mysqldump directly into clustrix_import. Using tee to copy the dump output into a file allows us to restart the import if it fails for some reason, without needing to completely redo the dump (we would need to make sure /tmp is large enough to accommodate the dump). If the import needed to be restarted, we could then use the -S option to start at the table where the failure occurred:

shell> clustrix_import -i /tmp/somedb.sql -D somedb -H clx -u importuser -p sekrit -S employees

Using the Dump Map Options

shell> clustrix_import -u importuser -p sekrit -i hugedump.sql --write-dump-map dump.map -H master_vip
shell> clustrix_import -u importuser -p sekrit -i hugedump.sql --read-dump-map dump.map -H slave_vip 
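In this illustrative pairing, the first invocation writes a map of the dump to dump.map while importing into the cluster reached at master_vip; the second invocation then reuses that map when importing the same dump into a second cluster at slave_vip.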

...

option flag(s) | description | default
--host/-H | hostname or IP to reach the cluster; may also specify a comma-delimited list to override automatic detection of IPs through the database (use if cluster is NAT'd) | localhost (mysql socket)
--database/-D | database to import into; must already exist, and assumes that the dump does not specify a database with a USE statement | none
--port/-P | MySQL port to connect to | 3306
--user/-u | username with which to connect to the database; must have sufficient privileges to CREATE and INSERT data (or just INSERT if --inserts-only) | logged-in linux user
--passwd/-p | password for the above user | none
--threads-per-node/-T | number of parallel threads to connect, per node | 8
--retry-limit | number of times to retry a failed statement | 30
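For illustration, several of these options can be combined in a single invocation; the host list, database name, and credentials below are hypothetical:

shell> clustrix_import -i /clustrix/bigdump.sql -H 10.0.0.11,10.0.0.12,10.0.0.13 -D somedb -u importuser -p sekrit -T 4 --retry-limit 10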

...

Note that a mysqldump taken from a ClustrixDB system will include SLICES=n within /* */ comments. clustrix_import will process these comments and set slicing accordingly. This overrides --initial-slices, and unless -A is specified, scan-ahead and slice-down may still occur.
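To check whether a dump already embeds slice counts before importing, one simple (illustrative) approach is to search the file for these comments; the dump filename below is hypothetical:

shell> grep -c 'SLICES=' /clustrix/bigdump.sql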

Note

The size of a slice must be taken into consideration when setting the number of slices for import. By default, a slice is 1GB (more precisely, 1048576 KB). This is the size at which the rebalancer splits slices, and it is set by the global variable rebalancer_split_threshold_kb. The recommended size is 8GB.
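As a sketch (assuming the standard mysql client syntax for inspecting and setting ClustrixDB globals), the current threshold can be checked and, if desired, raised to the recommended 8GB (8388608 KB):

mysql> show variables like 'rebalancer_split_threshold_kb';
mysql> set global rebalancer_split_threshold_kb = 8388608;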


INSERT Aggregation

Multi-row inserts are significantly more efficient for ClustrixDB to process. By default, mysqldump generates multi-row inserts of approximately 1MB each. However, in some cases (e.g. --skip-extended-insert), there is only one row per insert. This might also be the case if the import file is generated through a means other than mysqldump, such as pg_dump as part of a Postgres migration. 
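For example, a dump written with --skip-extended-insert contains one row per INSERT statement, while the default extended-insert form batches many rows into each statement; the table and values below are purely illustrative:

-- one row per INSERT (e.g. mysqldump --skip-extended-insert)
INSERT INTO employees VALUES (1,'Alice');
INSERT INTO employees VALUES (2,'Bob');

-- multi-row INSERT (mysqldump default), significantly more efficient to import
INSERT INTO employees VALUES (1,'Alice'),(2,'Bob');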

...