Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: Published by Scroll Versions from space ML1 and version 9.2

Table of Contents
maxLevel2

 

Info

Before partitioning your tables, please contact Clustrix Support to review your partitioning scheme, including the maximum number of planned partitions.

Why Partition a Table?

ClustrixDB’s distributed database architecture automatically handles many use cases for which legacy RDBMS applications required table partitions. Let’s look at some typical scenarios for partitioned tables and see if and how they are handled by ClustrixDB.

...

Historical usage of partitioned tables

...

How this translates to ClustrixDB

...

Dropping partitions is more efficient than deleting data and ClustrixDB supports table partitioning by ranges to optimize removal of aging data.

 

The above is the only legitimate use case for partitioning tables within ClustrixDB. Here's why:

...

For better query performance from partition elimination or partition pruning. Only partitions that contain potential matching values for a query are accessed.

...

ClustrixDB distributes data and indexes into slices and uses slices to ensure that only relevant data is accessed. Table partitioning provides no additional query performance benefit for ClustrixDB.

...

Table partitioning to utilize multiple disks. Large tables can be spread across multiple partitions over multiple disks to reduce disk contention.

...

ClustrixDB automatically distributes data slices across multiple nodes of the cluster using its Rebalancer. The Rebalancer automatically arranges data throughout the cluster to lessen disk contention. Explicit table partitioning is not required to utilize multiple disks and does not provide any improvement to disk utilization.

...

Partitioning tables to provide parallel execution.

...

ClustrixDB leverages distributed data slices and distributed query fragments for parallel execution on multiple nodes. Partitioning tables within ClustrixDB will provide no additional parallelism.

...

ClustrixDB leverages data slicing to provide parallelism for fast backup and restore. There is no additional benefit to backup or restore performance by partitioning a table.

ClustrixDB Table Partitioning Syntax

Range Partitioning 

Code Block
languagesql
> CREATE TABLE <table_definition> PARTITION BY RANGE <partition_function> (<partition_definition>);

> ALTER TABLE <table_name> ADD | DROP | TRUNCATE PARTITION  <partition_name>;
 
> ALTER TABLE <table_name> REORGANIZE PARTITION <partition_name> into (<partition_definition>);

> ALTER TABLE <table_name> PARTITION BY RANGE   <partition_function>  (<partition_definition>);

> ALTER TABLE <table_name> REMOVE PARTITIONING;

<partition_function>

The partition_function defines the column(s) and process by which a table will be partitioned. It is generally a function that returns an integer such as: YEAR(column_name).

Functions supported to derive an integer result are:

  • CEILING()

  • FLOOR()

  • TO_DAYS()

  • UNIX_TIMESTAMP()

  • YEAR() (See Caveat regarding YEAR datatype) 

  • YEARWEEK()

<partition_definition>

The partition_definition defines the uppermost value for a table’s partition. Data is placed in a table partition when the result of the partition_function (above) falls between two table partition boundaries. Data must qualify for one and only one partition. It is always advisable to define the last partition of a table using the maximum integer value allowable by specifying the MAXVALUE.

The format for defining partition ranges follows this syntax:

Code Block
languagesql
 PARTITION <partition_name> VALUES LESS THAN (integer),
 [PARTITION <partition_name> VALUES LESS THAN (integer), ...]
 PARTITION <partition_name> VALUES LESS THAN MAXVALUE

<partition_name>

Table partition names must conform to the standard conventions for naming database objects and must be unique for a given table. It is advisable to also make them meaningful.

Important <partition_function> Rules

Include <partition_function> Columns in Unique Indexes

Info

Note: The columns used by the <partition_function> must be part of each unique key defined for a table, including the primary key.

Expand
titleClick here to view/hide errors that occur if this rule is not followed. (ERROR #12314 and #12298)
Code Block
languagesql
Column used for partitioning is not within primary key ...
> CREATE TABLE sample (
  col1 INT NOT NULL,
  col2 DATE NOT NULL,
  col3 INT NOT NULL,
  col4 INT NOT NULL,
  PRIMARY KEY(col1, col3),
  UNIQUE KEY(col2, col4)
 )
 PARTITION BY RANGE (YEAR(col2)) (
 PARTITION p2000 VALUES LESS THAN (2000),
 PARTITION p2010 VALUES LESS THAN (2010),
 PARTITION pmax VALUES LESS THAN MAXVALUE);
 ERROR 1 (HY000): [12314] Invalid partitioning scheme: Partition function column "col2" is not part of primary key. All partition function columns must be part of the primary key.

Column used for partitioning is not within unique key ...
> CREATE TABLE sample (
  col1 INT NOT NULL,
  col2 DATE NOT NULL,
  col3 INT NOT NULL,
  col4 INT NOT NULL,
  PRIMARY KEY(col1, col2),
  UNIQUE KEY sample_UK (col3, col4)
 )
 PARTITION BY RANGE (YEAR(col2)) (
 PARTITION p2000 VALUES LESS THAN (2000),
 PARTITION p2010 VALUES LESS THAN (2010),
 PARTITION pmax VALUES LESS THAN MAXVALUE);
 ERROR 1 (HY000): [12298] Invalid constraint specification: Index sample.sample_UK on a partitioned table must contain the partition column

User Privileges Needed for Partitioning Tables

A user must have the following privileges to manage partitions for a table:

  • CREATE TABLE
  • ALTER TABLE
  • DROP TABLE

SHOW Partition Specifics for a Table

Use the following SQL to display the partitioning characteristics defined for a table. Use the FULL parameter to view more extensive information. Additional database specifics regarding partitions can also be seen by querying the information_schema.partitions view.

Code Block
languagesql
> SHOW [FULL] CREATE TABLE <table_name>;

Examples

Create Table with Partitions

Code Block
languagesql
> CREATE TABLE sample (
  col1 INT NOT NULL,
  col2 DATE NOT NULL,
  col3 INT NOT NULL,
  col4 INT NOT NULL,
  PRIMARY KEY(col1, col2),
  UNIQUE KEY(col2, col4)
 )
 PARTITION BY RANGE (YEAR(col2)) (
 PARTITION p2000 VALUES LESS THAN (2000),
 PARTITION p2010 VALUES LESS THAN (2010),
 PARTITION pmax VALUES LESS THAN MAXVALUE);

Adding More Partitions to a Table

Before the Last Partition

The ranges specified for table partitions cannot overlap. Data must qualify for one and only one table partition. If your table has been defined as the sample shown above with a catch-all partition defined for the upper limit (using MAXVALUE), you must reorganize the top level table partition to enable a new partition to be inserted beneath it.

Code Block
languagesql
> ALTER TABLE sample REORGANIZE PARTITION pmax into (
      PARTITION p2015 VALUES LESS THAN (2015),
      PARTITION pmax VALUES LESS THAN MAXVALUE);

Attempting to add a table partition that conflicts with another partition will result in this error:

Code Block
languagesql
> ALTER TABLE sample ADD PARTITION (PARTITION p2015 VALUES LESS THAN (2015));
ERROR 1 (HY000): [12314] Invalid partitioning scheme: New partition "p2015" (2015) overlaps with existing partition "pmax" (MAXVALUE)

After the Last Partition

If your table is partitioned such that it can be expanded beyond the current upper limit, then SQL similar to the following can be used.

Code Block
languagesql
> ALTER TABLE diff_sample ADD PARTITION (PARTITION p2015 VALUES LESS THAN (2015));

Implement Partitioning for an Unpartitioned Table

Code Block
languagesql
> ALTER TABLE sample_unpartitioned
 PARTITION BY RANGE (YEAR(col2)) (
 PARTITION p2000 VALUES LESS THAN (2000),
 PARTITION p2010 VALUES LESS THAN (2010),
 PARTITION p2015 VALUES LESS THAN (2015),
 PARTITION pmax VALUES LESS THAN MAXVALUE);

Drop Specific Table Partition(s)

Dropping a table partition removes both the data and the physical partition. This is particularly useful to purge historical data if you are partitioning your table by dates. Note that the DROP PARTITION command differs from TRUNCATE PARTITION which purges the data but leaves the partition available for reuse. Multiple table partitions can be dropped at one time.

Code Block
languagesql
> ALTER TABLE sample DROP PARTITION p2000, p2010;

Remove Table Partitioning Completely

Code Block
languagesql
> ALTER TABLE sample REMOVE PARTITIONING;

Modify Table Partition Ranges

To Expand Table Partitions (More Granularity)

To increase the number of partitions presently used for a table, emulate this SQL:

Code Block
languagesql
> ALTER TABLE sample REORGANIZE PARTITION p2015 into (
      PARTITION p2011 VALUES LESS THAN (2011),
      PARTITION p2012 VALUES LESS THAN (2012),
      PARTITION p2013 VALUES LESS THAN (2013),
      PARTITION p2014 VALUES LESS THAN (2014),
      PARTITION p2015 VALUES LESS THAN (2015));

To Consolidate Table Partitions (Less Granularity)

Table partitions can be consolidated provided they are consecutive and contiguous.

Code Block
languagesql
> ALTER TABLE sample REORGANIZE PARTITION 
 p2011, p2012, p2013, p2014, p2015 into (
      PARTITION p2015 VALUES LESS THAN (2015));

Rename Table Partitions

Code Block
languagesql
> ALTER TABLE sample REORGANIZE PARTITION pmax into (
        PARTITION new_name VALUES LESS THAN MAXVALUE);

Truncate Table Partitions

Truncating removes all data from a table partition resulting in an empty partition that can be reused. TRUNCATE PARTITION differs from DROP PARTITION which removes both a table partition and its contents. TRUNCATE is a DDL command and cannot be rolled back. DELETE can be used to remove the same data, but as a DML command, DELETE involves both row locking and logging for rollback and is slower as a result. TRUNCATE and DROP PARTITION are more efficient than DELETE.

ClustrixDB supports truncating a single table partition at a time.

Code Block
languagesql
ALTER TABLE sample TRUNCATE PARTITION P2000;

Unsupported in Partitioned Tables

This is the list of specific MySQL Partitioned Table features that are unsupported in ClustrixDB.

  • LIST, HASH, KEY, LINEAR KEY partitions

  • SUBPARTITIONS

  • Partitioning across functions such as: ABS(), DAYOFMONTH(), DAYOFWEEK(), DAYOFYEAR(), DATEDIFF(), EXTRACT(), MICROSECOND(), MINUTE(), MOD(), MONTH(), HOUR(), SECOND(), QUARTER(), TIME_TO_SEC(), WEEKDAY()

  • Temporary tables may not be partitioned

  • COALESCE PARTITION number

  • ANALYZE PARTITION {partition_names | ALL }

  • CHECK PARTITION {partition_names | ALL }

  • OPTIMIZE PARTITION {partition_names | ALL }

  • REBUILD PARTITION {partition_names | ALL }

  • REPAIR PARTITION {partition_names | ALL }

  • EXPLAIN PARTITION

  • SELECT FROM table_name PARTITION {partition_name | ALL}

Caveats of Using Partitioned Tables

  • To determine the number of slices utilized by a partition, SHOW FULL CREATE TABLE and count the number of HASHES.

  • A partition’s ordinal position (the consecutive sequence in which a partition was added to the database) is unavailable in ClustrixDB.

  • Adding data to a table that does not fit within any partition will cause this error - "ERROR 1 (HY000): [6144] No distribution for representation: No partition found for write".
  • Altering a partitioned table and modifying characteristics of a column used for partitioning requires that you respecify partitioning ranges for the table. For example:

    Code Block
    languagesql
    Given this table definition:
    > CREATE TABLE test (id INT KEY)
     PARTITION BY RANGE (id)
     (PARTITION p0 VALUES LESS THAN (1000),
     PARTITION p1 VALUES LESS THAN (2000));
     Query OK, 0 rows affected (0.19 sec)
    
    Changing characteristics of the partitioned column causes this ALTER to fail:
    > ALTER TABLE test
     MODIFY COLUMN id INT AUTO_INCREMENT;
     ERROR 1 (HY000): [12298] Invalid constraint specification: Index
     test.__idx_test__PRIMARY on a partitioned table must contain the partition column
    
    But respecifying the partitioning details allows this ALTER to succeed:
    > ALTER TABLE test
     MODIFY COLUMN id INT AUTO_INCREMENT
     PARTITION BY RANGE (id)
     (PARTITION p0 VALUES LESS THAN (1000),
     PARTITION p1 VALUES LESS THAN (2000));
     Query OK, 0 rows affected (3.98 sec)

...

If partitioning using a partition key that includes the YEAR datatype, you must also use the YEAR function as shown in this example:

Code Block
languagesql
 CREATE TABLE sample_year_partitioning (year_col YEAR PRIMARY KEY)
 PARTITION BY RANGE (YEAR(year_col))
 (PARTITION p0 VALUES LESS THAN (5),
 PARTITION p1 VALUES LESS THAN (10),
 PARTITION p2 VALUES LESS THAN MAXVALUE);

...

ClustrixDB automatically handles distributed query execution and data distribution without needing to configure table partitions. ClustrixDB supports RANGE partitioning, primarily to expedite removal. 

For more information on partitioned tables:

Why Partition a Table?

Excerpt Include+
scrollEditorUrl/display/ML1/.Why+Partition+a+Table+vFred
scrollEditorDisplayTitleWhy Partition a Table
nopaneltrue
scrollPageId0A5DD23E01695ED818DE65EC0208309D

Managing Partitions

Excerpt Include+
scrollEditorUrl/display/ML1/.Managing+Partitions+vFred
scrollEditorDisplayTitleManaging Partitions
nopaneltrue
scrollPageId0A5DD23E01695F71E013C7CD265803A6

Partitioned Tables Limitations

Excerpt Include+
scrollEditorUrl/display/ML1/.Partitioned+Tables+Limitations+vFred
scrollEditorDisplayTitlePartitioned Tables Limitations
nopaneltrue
scrollPageId0A5DD23E01695ECECF82A5B9175D21D8

Info

Before partitioning your tables, Clustrix Support can help review your partitioning scheme, including the maximum number of planned partitions.