Broadcasts occur when a node requests additional information from another node. This adds parallelism (a good thing) but can also add noise/congestion. Removing unnecessary or unhelpful broadcasts via schema or query changes can be beneficial to the system as a whole.
Index distribution that is not optimal can decrease performance by introducing unwanted broadcasts. Unnecessary broadcasts increase network traffic and resource contention, both of which degrade performance.
Xpand's Query Plan Cache (QPC) stores statistics for recently run queries. Use this query to identify statements that recently used extensive broadcasts.
sql> SELECT statement,
ORDER BY broadcasts DESC
Using ALLNODES to Reduce Broadcasts
Tables that have replicas on all nodes will not require broadcasts to other nodes to obtain data. This is especially helpful when small tables are joined to larger tables.
ALLNODES is best used for tables that meet the following criteria:
Relatively small (Table Size < 10MiB)
Written to infrequently (Write Frequency < 1K)
Read from frequently (Read Frequency > 1M)
Used frequently in joins to other, larger tables (e.g. metadata, lookup tables)
ALLNODES should not be used for:
Tables that take frequent writes
Identifying Potential ALLNODES Tables
sql> SELECT `database`,
FROM (SELECT `database`,
Sum(reads) AS reads,
Sum(inserts) + Sum(deletes) + Sum(replaces) AS writes,
Sum(bytes) AS bytes,
Count(DISTINCT( `index` )) AS paths,
Count(DISTINCT( slice )) AS slices,
Count(DISTINCT( replica )) AS replicas
FROM system.container_stats cs
JOIN system.table_replicas tr
ON cs.replica = tr.replica
GROUP BY 1,
2) AS a
WHERE bytes < 10 * 1024 * 1024
AND writes < 1000
AND reads > 1000000
AND slices > paths
ORDER BY 1, 2;
Modifying a Table to be ALLNODES
sql> ALTER TABLE foo.bar REPLICAS=ALLNODES;
Changing Index Distribution to Reduce Broadcasts
With multi-column indexes having distribution greater than one, there can be a benefit to reducing the distribution in some instances:
Distribution is > the number of columns - Change distribution to the number of columns (or to one)
Indexes that have mostly unique initial columns - Consider reducing distribution if unnecessary broadcasts are being performed.
EXPLAIN queries that use an index and look for stream_combine, which is usually necessary following a broadcast.
Clustrix recommends making these changes during off-peak hours or during a maintenance window.