Child pages
  • Bad Query Plan Logging

This is documentation for a previous version of ClustrixDB. Documentation for the latest version can be found here

Skip to end of metadata
Go to start of metadata

If a query reads significantly more data than it returns, its query plan might be faulty. Perhaps no index is being used or the chosen index doesn't constrain the data examination well. You can use the following logging features with the output from EXPLAIN to diagnose query performance and optimize SQL statements.

Options for bad plan query logging are as follows:

  1. session_log_bad_queries: (Boolean) To enable, specify 0. To disable, specify 1.
  2. session_log_bad_read_threshold: Specify the number of rows that a query must exceed for the event to be logged.
  3. session_log_bad_read_ratio: Specify the ratio of rows read to rows used

 

The following command logs a bad plan event if 20 or more rows are read:

SET GLOBAL session_log_bad_read_threshold = 20;

If a table is small enough to be scanned in memory, there might be no performance improvement obtained by adding an index to it. Recommendation: Set the log bad read threshold to 4000 to avoid logging small table scans, especially in production environments.

The session_log_bad_read_ratio variable controls the ratio of rows examined to rows used and logs any query returning results that exceed the specified limit. The following example sets the ratio to four.

SET GLOBAL session_log_bad_read_ratio = 4;

For example, selecting a single value from an unindexed table containing 25 rows requires the DBMS to scan all 25 rows. If log bad read threshold is set lower than 25 (as in the preceding example), a message like the following is added to the log file:

May 1 18:28:15 alpha011 real: INSTR SID:57105409 db=paulm ac=Y sql="select * from x where id = 10" [Ok: 1 rows selected] time 0.5ms; reads: 3; inserts: 1; deletes: 0; updates: 0; counts: 1; rows_read: 25; forwards: 4; rows_output: 3; semaphore_matches: 0


Creating an index on the id column eliminates the need to scan the entire table to satisfy the query. After the table is indexed, the single-value SELECT is no longer logged as having a bad query plan.


However, a simple index might not always solve the problem. For example, if you create and populate two tables as follows:

CREATE TABLE `x` (
   `id` int(11),
    KEY `idx_xid` (`id`)
);
 
-- Insert 25 rows into table x
 
CREATE TABLE `y` (
    `id` int(11),
    `x` int(11),
    KEY `xix` (`x`)
);
 
-- Insert 25 rows into table x


Then query them using a join as follows:

 SELECT * FROM x JOIN y ON x.id = y.x AND y.id = 2;

This query is logged as having a bad plan:

May 1 20:10:34 alpha011 real: INSTR SID:57365505 db=paulm ac=Y sql="select * from x join y on x.id = y.x and y.id = 2" [Ok: 1 rows selected] time 178.6ms; reads: 4;inserts: 1; deletes: 0; updates: 0; counts: 1; rows_read: 26; forwards: 5; e0

Twenty-six rows were read but only one was selected. The EXPLAIN command displays the following statistics:

EXPLAIN SELECT * FROM x JOIN y ON x.id = y.x AND y.id = 2;
 
+---------------------------------------+-----------+-----------+
| Operation                             | Est. Cost | Est. Rows |
+---------------------------------------+-----------+-----------+
| nljoin                                |    253.28 |     46.80 |
|   stream_combine                      |     45.20 |     18.00 |
|     filter (2.id = param(0))          |     14.13 |      6.00 |
|       index_scan 2 := y.__base_y      |     14.00 |      6.67 |
|   index_scan 1 := x.idx_xid, id = 2.x |     11.56 |      2.60 |
+---------------------------------------+-----------+-----------+
5 rows in set (0.00 sec)


Based on statistics, the query planner has decided that the join index x on table y is not very unique, so it scans to find the rows in y where id equals 2, then compares against values from x. An index on y.id enables the planner to use the constraint y.id = 2 to pre-filter rows from y, as follows:

CREATE INDEX ixy_id ON y(id);

When run against the newly-indexed table, the preceding SELECT is no longer logged as having a bad query plan. The EXPLAIN command now displays the following statistics:

EXPLAIN SELECT * FROM x JOIN y ON x.id = y.x AND y.id = 2;
 
+-------------------------------------------------+-----------+-----------+
| Operation                                       | Est. Cost | Est. Rows |
+-------------------------------------------------+-----------+-----------+
| nljoin                                          |     32.99 |      2.63 |
|   nljoin                                        |     21.31 |      1.01 |
|     index_scan 2 := y.ixy_id, id = param(0)     |     10.61 |      1.01 |
|     index_scan 2 := y.__base_y, rowid = 2.rowid |     10.60 |      1.00 |
|   index_scan 1 := x.idx_xid, id = 2.x           |     11.56 |      2.60 |
+-------------------------------------------------+-----------+-----------+
5 rows in set (0.01 sec)