MariaDB ColumnStore query tuning optimizes analytical queries through careful use of data types, joins, projection elimination, and WHERE clauses, with query tracing available for performance insights.
In MariaDB Enterprise ColumnStore 6, the ExeMgr process uses optimizer statistics in its query planning process.
ColumnStore uses the optimizer statistics to add support for queries that contain circular inner joins.
In Enterprise ColumnStore 5 and before, ColumnStore would raise the following error when a query containing a circular inner join was executed:
ERROR 1815 (HY000): Internal error: IDB-1003: Circular joins are not supported.
The optimizer statistics store each column's NDV (Number of Distinct Values), which can help the ExeMgr process choose the optimal join order for queries with circular joins. When Enterprise ColumnStore executes a query with a circular join, the query's execution can take longer if ColumnStore chooses a sub-optimal join order. When you collect optimizer statistics for your ColumnStore tables, the ExeMgr process is less likely to choose a sub-optimal join order.
Enterprise ColumnStore's optimizer statistics can be collected for ColumnStore tables by executing ANALYZE TABLE:
ANALYZE TABLE columnstore_tab;
Enterprise ColumnStore's optimizer statistics are not updated automatically. To update the optimizer statistics for a ColumnStore table, ANALYZE TABLE must be re-executed.
Enterprise ColumnStore does not implement an interface to show optimizer statistics.
This page is: Copyright © 2025 MariaDB. All rights reserved.
When tuning queries for MariaDB Enterprise ColumnStore, there are some important details to consider.
Enterprise ColumnStore only reads the columns that are necessary to resolve a query.
For example, the following query selects every column in the table:
SELECT * FROM tab;
Whereas the following query only selects two columns in the table, so it requires less I/O:
SELECT col1, col2 FROM tab;
For best performance, only select the columns that are necessary to resolve a query.
When Enterprise ColumnStore performs ORDER BY and LIMIT operations, the operations are performed in a single-threaded manner after the rest of the query processing has been completed and the full unsorted result set has been retrieved. For large data sets, the performance overhead can be significant.
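For example, in a query such as the following (table and column names are illustrative), the sort and limit are applied serially only after the complete unsorted result set has been assembled:

```sql
-- The full result set is retrieved first; the sort and
-- LIMIT are then applied in a single thread.
SELECT col1, col2
FROM columnstore_tab
ORDER BY col1 DESC
LIMIT 10;
```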
When Enterprise ColumnStore 5 performs aggregation operations (e.g., DISTINCT, GROUP BY, COUNT(*)), all of the aggregation work happens in memory by default. As a consequence, more complex aggregation operations require more memory in that version.
For example, the following query could require a lot of memory in Enterprise ColumnStore 5, since it has to calculate many distinct values in memory:
SELECT DISTINCT col1 FROM tab LIMIT 10000;
Whereas the following query could require much less memory in Enterprise ColumnStore 5, since it has to calculate fewer distinct values:
SELECT DISTINCT col1 FROM tab LIMIT 100;
In Enterprise ColumnStore 6, disk-based aggregations can be enabled.
For best performance, avoid excessive aggregations or enable disk-based aggregations.
For additional information, see "Configure Disk-Based Aggregations".
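As a sketch (the option name follows the Enterprise ColumnStore 6 configuration schema; verify it against your version's documentation), disk-based aggregation is enabled through the ColumnStore configuration rather than a server system variable:

```shell
# Allow aggregation operations to spill to disk when they
# would otherwise exceed available memory (assumes the
# RowAggregation/AllowDiskBasedAggregation option in
# Enterprise ColumnStore 6).
mcsSetConfig RowAggregation AllowDiskBasedAggregation Y
```

After changing the setting, the ColumnStore processes typically need to be restarted for it to take effect.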
When Enterprise ColumnStore evaluates built-in functions and aggregate functions, it can often evaluate the function in a distributed manner. Distributed evaluation of functions can significantly improve performance.
Enterprise ColumnStore supports distributed evaluation for some built-in functions. For other built-in functions, the function must be evaluated serially on the final result set.
Enterprise ColumnStore also supports distributed evaluation for user-defined functions developed with ColumnStore's User-Defined Aggregate Function (UDAF) C++ API. For functions developed with Enterprise Server's standard User-Defined Function (UDF) API, the function must be evaluated serially on the final result set.
For best performance, avoid non-distributed functions.
By default, Enterprise ColumnStore performs all joins as in-memory hash joins.
If the joined tables are very large, the in-memory hash join can require too much memory for the default configuration. There are a few options to work around this:
Enterprise ColumnStore can be configured to use more memory for in-memory hash joins.
Enterprise ColumnStore can be configured to use disk-based joins.
Enterprise ColumnStore can use optimizer statistics to better optimize the join order.
For additional information, see "Configure In-Memory Joins", "Configure Disk-Based Joins", and "Optimizer Statistics".
Enterprise ColumnStore uses extent elimination to optimize queries. Extent elimination uses the minimum and maximum values in the extent map to determine which extents can be skipped for a query.
When data is loaded into Enterprise ColumnStore, it is appended to the latest extent. When an extent reaches the maximum number of column values, Enterprise ColumnStore creates a new extent. As a consequence, if ordered data is loaded in its proper order, similar values will be clustered together in the same extent. This can improve query performance, because extent elimination performs best when similar values are clustered together.
For example, if you expect to query a table with a filter on a timestamp column, you should sort the data using the timestamp column before loading it into Enterprise ColumnStore. Later, when the table is queried with a filter on the timestamp column, Enterprise ColumnStore would be able to skip many extents using extent elimination.
For best performance, load ordered data in proper order.
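For example (table and column names are illustrative), data staged in another table can be inserted in timestamp order so that similar values land in the same extents:

```sql
-- Loading rows in timestamp order keeps each extent's
-- min/max range narrow, so extent elimination can skip
-- more extents when filtering on ts_col.
INSERT INTO columnstore_tab
SELECT * FROM staging_tab
ORDER BY ts_col;
```

The same principle applies to bulk loads: sort the input file on the filtered column before loading it with a tool such as cpimport.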
When Enterprise ColumnStore performs mathematical operations with very big values using the DECIMAL, NUMERIC, and FIXED data types, the operation can sometimes overflow ColumnStore's maximum precision or scale. The maximum precision and scale depend on the version of Enterprise ColumnStore:
In Enterprise ColumnStore 6, the maximum precision (M) is 38, and the maximum scale (D) is 38.
In Enterprise ColumnStore 5, the maximum precision (M) is 18, and the maximum scale (D) is 18.
In Enterprise ColumnStore 6, applications can configure Enterprise ColumnStore to check for decimal overflows by setting the columnstore_decimal_overflow_check system variable, but only when the column has a decimal precision of 18 or more:
SET SESSION columnstore_decimal_overflow_check=ON;
SELECT (big_decimal1 * big_decimal2) AS product
FROM columnstore_tab;
When decimal overflow checks are enabled, math operations have extra overhead.
When the decimal overflow check fails, MariaDB Enterprise ColumnStore raises an error with the ER_INTERNAL_ERROR error code, and it writes detailed information about the overflow check failure to the ColumnStore system logs.
MariaDB Enterprise ColumnStore supports Enterprise Server's standard User-Defined Function (UDF) API. However, UDFs developed using that API cannot be executed in a distributed manner.
To support distributed execution of custom SQL, MariaDB Enterprise ColumnStore supports a Distributed User Defined Aggregate Functions (UDAF) C++ API:
The Distributed User Defined Aggregate Functions (UDAF) C++ API allows anyone to create aggregate functions of arbitrary complexity for distributed execution in the ColumnStore storage engine.
These functions can also be used as Analytic (Window) functions just like any built-in aggregate function.
MariaDB ColumnStore's query plans and Optimizer Trace show how analytical queries run in parallel across its distributed, columnar architecture, aiding performance tuning.
The ColumnStore storage engine uses a ColumnStore Execution Plan (CSEP) to represent a query plan internally.
When the select handler receives the SELECT_LEX object, it transforms it into a CSEP as part of the query planning and optimization process. For additional information, see "MariaDB Enterprise ColumnStore Query Evaluation".
The CSEP for a given query can be viewed by performing the following:
Calling the calSetTrace(1) function:
SELECT calSetTrace(1);
Executing the query:
SELECT column1, column2
FROM columnstore_tab
WHERE column1 > '2020-04-01'
AND column1 < '2020-11-01';
Calling the calGetTrace() function:
SELECT calGetTrace();
When Enterprise ColumnStore executes a query, the ExeMgr process on the initiator/aggregator node translates the ColumnStore execution plan (CSEP) into a job list. A job list is a sequence of job steps.
Enterprise ColumnStore uses many different types of job steps that provide different scalability benefits:
Some types of job steps perform operations in a distributed manner, using multiple nodes to operate on different extents. Distributed operations provide horizontal scalability.
Some types of job steps perform operations in a multi-threaded manner using a thread pool. Performing multi-threaded operations provides vertical scalability.
As you increase the number of ColumnStore nodes or the number of cores on each node, Enterprise ColumnStore can use those resources to more efficiently execute job steps.
For additional information, see "MariaDB Enterprise ColumnStore Query Evaluation".
Enterprise ColumnStore defines a batch primitive step to handle many types of tasks, such as scanning/filtering columns, JOIN operations, aggregation, functional filtering, and projecting (putting values into a SELECT list).
In calGetTrace() output, a batch primitive step is abbreviated BPS.
Batch primitive steps are evaluated on multiple nodes in parallel. The PrimProc process on each node evaluates the batch primitive step to one extent at a time. The PrimProc process uses a thread pool to operate on individual blocks within the extent in parallel.
Enterprise ColumnStore defines a cross-engine step to perform cross-engine joins, in which a ColumnStore table is joined with a table that uses a different storage engine.
In calGetTrace() output, a cross-engine step is abbreviated CES.
Cross-engine steps are evaluated locally by the ExeMgr process on the initiator/aggregator node.
Enterprise ColumnStore can perform cross-engine joins when the mandatory utility user is properly configured.
For additional information, refer to "Mandatory Utility User Account".
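For example (table names are illustrative), a cross-engine join might combine a ColumnStore fact table with an InnoDB dimension table:

```sql
-- columnstore_fact uses the ColumnStore engine;
-- innodb_dim uses InnoDB. Reading the InnoDB side is
-- handled by a cross-engine step (CES), evaluated by
-- the ExeMgr process on the initiator/aggregator node.
SELECT f.amount, d.name
FROM columnstore_fact f
JOIN innodb_dim d ON f.dim_id = d.id;
```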
Enterprise ColumnStore defines a dictionary structure step to scan the dictionary extents that ColumnStore uses to store variable-length string values.
In calGetTrace() output, a dictionary structure step is abbreviated DSS.
Dictionary structure steps are evaluated on multiple nodes in parallel. The PrimProc process on each node evaluates the dictionary structure step to one extent at a time. It uses a thread pool to operate on individual blocks within the extent in parallel.
Dictionary structure steps can require a lot of I/O for a couple of reasons:
Dictionary structure steps do not support extent elimination, so all extents for the column must be scanned.
Dictionary structure steps must read the column extents to find each pointer and the dictionary extents to find each value, which doubles the number of extents to scan.
It is generally recommended to avoid queries that will cause dictionary scans.
For additional information, see "Avoid Creating Long String Columns".
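As a sketch of that recommendation (the exact length threshold at which ColumnStore moves string values into dictionary extents depends on the data type and version; check your version's documentation), keeping string columns short keeps values inline in the column extents:

```sql
-- Short string values can be stored inline in the column
-- extents, while long strings are stored in dictionary
-- extents, forcing dictionary structure steps (DSS) on scans.
CREATE TABLE tab (
    short_code VARCHAR(8),    -- likely stored inline
    long_note  VARCHAR(2000)  -- stored in dictionary extents
) ENGINE=ColumnStore;
```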
Enterprise ColumnStore defines a hash join step to perform a hash join between two tables.
In calGetTrace() output, a hash join step is abbreviated HJS.
Hash join steps are evaluated locally by the ExeMgr process on the initiator/aggregator node.
Enterprise ColumnStore performs hash joins in memory by default. If you perform large joins, you may be able to get better performance by changing some configuration defaults with mcsSetConfig:
Enterprise ColumnStore can be configured to use more memory for in-memory hash joins.
Enterprise ColumnStore can be configured to use disk-based joins.
For additional information, see "Configure In-Memory Joins" and "Configure Disk-Based Joins".
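For example (the parameter names are HashJoin options from the ColumnStore configuration; the values shown are illustrative), mcsSetConfig can raise the memory available for in-memory hash joins or enable disk-based joins:

```shell
# Increase the total memory available for in-memory
# hash joins (value is illustrative).
mcsSetConfig HashJoin TotalUmMemory '4G'

# Allow hash joins to spill to disk when they would
# otherwise exceed available memory.
mcsSetConfig HashJoin AllowDiskBasedJoin Y
```

After changing these settings, the ColumnStore processes typically need to be restarted for them to take effect.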
Enterprise ColumnStore defines a having step to evaluate a HAVING clause on a result set.
In calGetTrace() output, a having step is abbreviated HVS.
Enterprise ColumnStore defines a subquery step to evaluate a subquery.
In calGetTrace() output, a subquery step is abbreviated SQS.
Enterprise ColumnStore defines a tuple aggregation step to collect intermediate aggregation results prior to the final aggregation and evaluation of the results.
In calGetTrace() output, a tuple aggregation step is abbreviated TAS.
Tuple aggregation steps are primarily evaluated by the ExeMgr process on the initiator/aggregator node. However, the PrimProc process on each node also plays a role, since the PrimProc process on each node provides the intermediate aggregation results to the ExeMgr process on the initiator/aggregator node.
Enterprise ColumnStore defines a tuple annexation step to perform the final aggregation and evaluation of the results.
In calGetTrace() output, a tuple annexation step is abbreviated TNS.
Tuple annexation steps are evaluated locally by the ExeMgr process on the initiator/aggregator node.
Enterprise ColumnStore 5 performs aggregation operations in memory. As a consequence, more complex aggregation operations require more memory in that version.
In Enterprise ColumnStore 6, disk-based aggregations can be enabled.
For additional information, see "Configure Disk-Based Aggregations".
Enterprise ColumnStore defines a tuple union step to perform a union of two subqueries.
In calGetTrace() output, a tuple union step is abbreviated TUS.
Tuple union steps are evaluated locally by the ExeMgr process on the initiator/aggregator node.
Enterprise ColumnStore defines a tuple constant step to evaluate constant values.
In calGetTrace() output, a tuple constant step is abbreviated TCS.
Tuple constant steps are evaluated locally by the ExeMgr process on the initiator/aggregator node.
Enterprise ColumnStore defines a window function step to evaluate window functions.
In calGetTrace() output, a window function step is abbreviated WFS.
Window function steps are evaluated locally by the ExeMgr process on the initiator/aggregator node.