Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: Published by Scroll Versions from space ML1 and version 9.2

...

If you are currently using MySQL, you can use the following methods to validate your application's compatibility with ClustrixDB: 

  • To validate write statements, set up ClustrixDB as a slave using the SBR mode of replication to ensure the ClustrixDB slave can handle your application's write queries.

    • SBR provides the simplest option to validate compatibility, but for a production deployment, RBR is recommended. 

  • To validate read statements, SQL queries may be captured either using tcpdump utility or by enabling full query logging.
    • Once you have captured tcpdump output, Clustrix Support can help convert it into valid SQL, which you can then replay using MySQL client.
  • To understand how ClustrixDB differs from MySQL review the document General Differences from MySQL.

another database, see MySQL Compatibility and related pages to review differences. 

Prerequisites for a Successful Evaluation

  1. Please ensure that your cluster is installed according to the Best Practices for ClustrixDB Xpand Platform Configuration, and that sufficient resources have been allocated for your tests. Note that the minimum supported hardware configuration is adequate for testing compatibility, but is not suitable for load or performance testing. 
  2. Check ClustrixDB and system logs for any recurring errors. 
    1. These could include environmental issues, such as transient network errors or storage subsystem issues, which could introduce inconsistencies in performance or cause test failures.
  3. It is best to perform one aspect of performance testing at a time to obtain consistent, comparable numbers. For example, unless you are specifically measuring mixed workload behavior, perform analytic query tests separately from OLTP tests. 

...

It is important that your test includes sufficient concurrency (multiple threads) to engage all cluster resources and adequately model production load. Further, clients should distribute their thread connections evenly across all the nodes in a cluster. Some test harnesses (including sysbench) allow specification of a pool of hostnames/IPs to facilitate this. Otherwise, a front-end load balancer can be used, as described in Load Balancing ClustrixDB Xpand with HAProxy. You can confirm that load is evenly distributed during your test run by checking the output of The CLX Command-Line Administration Tool stat command, which shows TPS for each node, or with a query such as:

...

  • mysqldump --where flag can be used to extract a subset of production data, including the neat trick --where "1 LIMIT 1000"
  • Modify queries as necessary to operate on this subset of data (i.e. adjust a date range).

Please see Loading Data Onto ClustrixDBXpand for guidance on optimizing data import performance.

Query Set Used for Testing

The ClustrixGui Administration UI will indicate if your test workload is dominated by a single query. You can also look for such a problem by running SHOW PROCESSLIST or selecting from system.sessions while running your test, to identify any long-running queries. If you discover queries that are dominating your workload, you can optimize them as described in Optimizing Query Performance.

Testing

...

Query Performance

For OLAP testing, customers Customers are typically concerned with query response time for complex particular queries, rather than aggregate throughput. In this case, a simple framework that executes queries one at a time is reasonable, but consider the following guidelines:can be used for testing. The following guidlelines are helpful in determining query performance. 

Execute Queries at Least Twice

...

  • Start with a cluster that is larger than you need, and try to achieve desired target TPS/Latency. 
  • For very large tables (> 100 Gigs) that will keep growing, import the data into 8 Gig slices with slices with one slice per core using clustrix_import with the -m option where -m is the number of slicesthe -m option to set the number of slices.  If your table is bigger than (8 x number of nodes x number of cores), then use this formula for the number of slices: number of slices = (ceil(table size / 8GB / number of nodes)) * number of nodes.  Without the -m option clustrix_import ensures each table/index has at least once slice per node.                        

    shell> clustrix_import -u<user> -p<passwd> -i <dump_filename> -D <databasename> -m 48  
  • Create smaller tables with a slice count that is a common multiple of the cluster sizes you plan to test (e.g. 18 if you plan to test 3, 6, and 9 nodes).
  • Once your tests are running within acceptable limits, scale cluster down to the point where it can sustain the load at < 80% average CPU.

  • If for some reason you need to grow the cluster again, you may need to re-slice your tables (ALTER TABLE <table_name> SLICES=<number of nodes>) after growing the cluster.
  • Check that the Rebalancer is finished rebalancing and that system.layer_merges are complete, after reducing or expanding the cluster.

...