Discover MariaDB ColumnStore, the powerful columnar storage engine for analytical workloads. Learn about its architecture, features, and how it enables high-performance data warehousing and analytics.
MariaDB ColumnStore Quickstart Guides provide concise, Docker-friendly steps to quickly set up, configure, and explore the ColumnStore analytic engine.
Quickstart guide for MariaDB ColumnStore
MariaDB ColumnStore is a specialized columnar storage engine designed for high-performance analytical processing and big data workloads. Unlike traditional row-based storage engines, ColumnStore organizes data by columns, which is highly efficient for analytical queries that often access only a subset of columns across vast datasets.
MariaDB ColumnStore is a columnar storage engine that integrates with MariaDB Server. It employs a massively parallel distributed data architecture, making it ideal for processing petabytes of data with linear scalability. It was originally ported from InfiniDB and is released under the GPL license.
Exceptional Analytical Performance: Delivers superior performance for complex analytical queries (OLAP) due to its columnar nature, which minimizes disk I/O by reading only necessary columns.
High Data Compression: Columnar storage allows for much higher compression ratios compared to row-based storage, reducing disk space usage and improving query speed.
Massive Scalability: Designed to scale horizontally across multiple nodes, processing petabytes of data with ease.
Just-in-Time Projection: Only the required columns are processed and returned, further optimizing query execution.
Real-time Analytics: Capable of handling real-time analytical queries efficiently.
MariaDB ColumnStore utilizes a distributed architecture with different components working together:
User Module (UM): Handles incoming SQL queries, optimizes them for columnar processing, and distributes tasks.
Performance Module (PM): Manages data storage, compression, and execution of query fragments on the data segments.
Data Files: Data is stored in highly compressed column segments distributed across the nodes.
MariaDB ColumnStore is installed as a separate package that integrates with MariaDB Server. The exact installation steps vary depending on your operating system and desired deployment type (single server or distributed cluster).
General Steps (conceptual):
Install MariaDB Server: Ensure you have a compatible MariaDB Server version installed (e.g., MariaDB 10.5.4 or later).
Install ColumnStore Package: Download and install the specific MariaDB ColumnStore package for your OS. This package includes the ColumnStore storage engine and its associated tools.
Linux (e.g., Debian/Ubuntu): You would typically add the MariaDB repository configured for ColumnStore and then install mariadb-plugin-columnstore.
Single Server vs. Distributed: For a single-server setup, you install all ColumnStore components on one machine. For a distributed setup, you install and configure components across multiple machines.
Configure MariaDB: After installation, you might need to adjust your MariaDB server configuration (my.cnf or equivalent) to properly load and manage the ColumnStore engine.
Initialize ColumnStore: Run a specific columnstore-setup or post-install script to initialize the ColumnStore environment.
Once MariaDB ColumnStore is installed and configured, you can create and interact with ColumnStore tables using standard SQL.
a. Create a ColumnStore Table:
Specify ENGINE=ColumnStore when creating your table. Note that ColumnStore tables do not support primary keys in the same way as InnoDB, as their primary focus is analytical processing.
CREATE TABLE sales_data (
sale_id INT,
product_name VARCHAR(255),
category VARCHAR(100),
sale_date DATE,
quantity INT,
price DECIMAL(10, 2)
) ENGINE=ColumnStore;
b. Insert Data:
You can insert data using standard INSERT statements. For large datasets, bulk loading utilities (e.g., LOAD DATA INFILE, shown below) are highly recommended for performance.
INSERT INTO sales_data (sale_id, product_name, category, sale_date, quantity, price) VALUES
(1, 'Laptop', 'Electronics', '2023-01-15', 1, 1200.00),
(2, 'Mouse', 'Electronics', '2023-01-15', 2, 25.00),
(3, 'Keyboard', 'Electronics', '2023-01-16', 1, 75.00);
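For larger datasets, a bulk load with LOAD DATA INFILE is typically much faster than individual INSERT statements. A minimal sketch, assuming a comma-separated file at the hypothetical server-side path /tmp/sales_data.csv whose columns match the table definition above:
-- Bulk load a CSV file into the ColumnStore table
-- (hypothetical file path; adjust the field and line terminators to your data)
LOAD DATA INFILE '/tmp/sales_data.csv'
INTO TABLE sales_data
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';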
c. Query Data:
Perform analytical queries. ColumnStore will efficiently process these, often leveraging its columnar nature and parallelism.
-- Get total sales per category
SELECT category, SUM(quantity * price) AS total_sales
FROM sales_data
WHERE sale_date BETWEEN '2023-01-01' AND '2023-01-31'
GROUP BY category
ORDER BY total_sales DESC;
-- Count distinct products
SELECT COUNT(DISTINCT product_name) FROM sales_data;
Quickstart guide for MariaDB ColumnStore hardware requirements
MariaDB ColumnStore is designed for analytical workloads and scales linearly with hardware resources. While the performance generally improves with more CPU cores, memory, and servers, understanding the minimum hardware specifications is crucial for successful deployment, especially in development and production environments.
MariaDB ColumnStore's performance directly benefits from additional hardware resources. More CPU cores enable greater parallel processing, increased memory allows for more data caching (reducing I/O), and more servers enable a larger distributed architecture.
The specifications differentiate between a basic development environment and a production-ready setup:
1. For Development Environments:
CPU: A minimum of 8 CPU cores.
Memory (RAM): A minimum of 32 GB.
Storage: Local disk storage is acceptable for development purposes.
2. For Production Environments:
CPU: A minimum of 64 CPU cores.
Note: This recommendation underscores the highly parallel nature of ColumnStore, which can effectively utilize a large number of cores for analytical processing.
Memory (RAM): A minimum of 128 GB.
Note: Adequate memory is critical for caching data and intermediate results, directly impacting query performance.
Storage: StorageManager (S3) is recommended.
Note: This implies leveraging cloud-object storage (like AWS S3 or compatible services) for scalable and durable data persistence in production.
Network: For multi-server ColumnStore deployments, a minimum of 1 Gigabit (1G) networking is recommended.
Note: This facilitates efficient data transfer between nodes via TCP/IP for replication and query processing across the distributed architecture. For optimal performance in heavy-load scenarios, higher bandwidth (e.g., 10G or more) is highly beneficial.
Adhering to these minimum specifications will provide a baseline for ColumnStore functionality. For specific workload requirements, it's always advisable to conduct performance testing and scale hardware accordingly.
MariaDB ColumnStore uses a shared-nothing, distributed architecture with separate modules for SQL and storage, enabling scalable, high-performance analytics.
MariaDB ColumnStore enhances MariaDB Enterprise Server with a columnar engine for OLAP and HTAP workloads, using MPP for scalability. It supports cross-engine JOINs, integrates with S3 storage, and provides high-speed bulk loading with multi-node management via REST API.
MariaDB ColumnStore is a columnar storage engine designed for distributed, massively parallel processing (MPP), such as big data analysis. Deployments can consist of a single MariaDB server or several, each running multiple subprocesses that work together to provide linear scalability and exceptional performance with real-time responses to analytical queries.
It provides a highly available, fault tolerant, and performant columnar storage engine for MariaDB Enterprise Server. MariaDB Enterprise ColumnStore is designed for data warehousing, decision support systems (DSS), online analytical processing (OLAP), and hybrid transactional-analytical processing (HTAP).
Columnar storage engine that enables MariaDB Enterprise Server to perform new workloads
Optimized for online analytical processing (OLAP) workloads including data warehousing, decision support systems, and business intelligence
Single-stack solution for hybrid transactional-analytical workloads to eliminate barriers and prevent data silos
Implements cross-engine JOINs to join Enterprise ColumnStore tables with tables using row-based storage engines, such as InnoDB
Smart storage engine that plans and optimizes its own queries using a custom select handler
Scalable query execution using massively parallel processing (MPP) strategies, parallel query execution, and distributed function evaluation
S3-compatible object storage can be used for highly available, low-cost, multi-regional, resilient, scalable, secure, and virtually limitless data storage
High availability and automatic failover by leveraging MariaDB MaxScale
REST API for multi-node administration with the Cluster Management API (CMAPI) server
Connectors for popular BI platforms such as Microsoft Power BI and Tableau
High-speed bulk data loading that bypasses the SQL layer and does not block concurrent read-only queries
MariaDB Enterprise ColumnStore supports multiple topologies. Several options are described below. MariaDB Enterprise ColumnStore can be deployed in other topologies. The topologies on this page are representative of basic product capabilities.
MariaDB products can be deployed to form other topologies that leverage advanced product capabilities and combine the capabilities of multiple topologies.
The MariaDB Enterprise ColumnStore topology with Object Storage delivers production analytics with high availability, fault tolerance, and limitless data storage by leveraging S3-compatible storage.
The topology consists of:
One or more MaxScale nodes
An odd number of ColumnStore nodes (minimum of 3) running ES, Enterprise ColumnStore, and CMAPI
The MaxScale nodes:
Monitor the health and availability of each ColumnStore node using the MariaDB Monitor (mariadbmon)
Accept client and application connections
Route queries to ColumnStore nodes using the Read/Write Split Router (readwritesplit)
The ColumnStore nodes:
Receive queries from MaxScale
Execute queries
Use S3-compatible object storage for data
Use Shared Local Storage for the Storage Manager directory.
The MariaDB Enterprise ColumnStore topology with Shared Local Storage delivers production analytics with high availability and fault tolerance by leveraging shared local storage, such as NFS.
The topology consists of:
One or more MaxScale nodes
An odd number of ColumnStore nodes (minimum of 3) running ES, Enterprise ColumnStore, and CMAPI
The MaxScale nodes:
Monitor the health and availability of each ColumnStore node using the MariaDB Monitor (mariadbmon)
Accept client and application connections
Route queries to ColumnStore nodes using the Read/Write Split Router (readwritesplit)
The ColumnStore nodes:
Receive queries from MaxScale
Execute queries
Use Shared Local Storage for the DB Root directories.
The main software components and their roles are:
MariaDB Enterprise ColumnStore: • Columnar storage engine • Query execution • Data storage
MariaDB Enterprise Server: • Enterprise-grade database server
ColumnStore storage engine plugin: • Storage engine plugin • Integrates MariaDB Enterprise ColumnStore into MariaDB Enterprise Server
Cluster Management API (CMAPI) server: • REST API • Used for administrative tasks
MariaDB MaxScale: • Database proxy • Accepts connections • Routes queries • Performs auto-failover
MariaDB Enterprise ColumnStore is the columnar storage engine that handles data storage and query optimization/execution.
MariaDB Enterprise ColumnStore is a columnar storage engine that is optimized for analytical or online analytical processing (OLAP) workloads, data warehousing, and DSS. MariaDB Enterprise ColumnStore can be used for hybrid transactional-analytical processing (HTAP) workloads when paired with a row-based storage engine, like InnoDB.
MariaDB Enterprise ColumnStore is built on top of MariaDB Enterprise Server. MariaDB Enterprise ColumnStore 5 is included with the standard MariaDB Enterprise Server 10.5 releases, while MariaDB Enterprise ColumnStore 6 is included with the standard MariaDB Enterprise Server 10.6 releases.
Enterprise ColumnStore interfaces with the Enterprise Server SQL engine through the ColumnStore storage engine plugin.
MariaDB has been continually improving the integration of MariaDB Enterprise ColumnStore with MariaDB Enterprise Server:
MariaDB ColumnStore required special custom-built releases of MariaDB Server.
MariaDB Enterprise ColumnStore was included with the standard MariaDB Enterprise Server 10.5 releases up to ES 10.5.5-3, which was the first release to replace the Operations/Administration/Maintenance (OAM) API with the more modern Cluster Management API (CMAPI), still in use today.
Starting with ES 10.5.6-4, MariaDB Enterprise ColumnStore is included with the standard MariaDB Enterprise Server 10.5 releases.
MariaDB Enterprise ColumnStore integrates with MariaDB Enterprise Server using the ColumnStore storage engine plugin. The ColumnStore storage engine plugin enables MariaDB Enterprise Server to interact with ColumnStore tables.
The ColumnStore storage engine plugin is a smart storage engine that implements a custom select handler to fully take advantage of Enterprise ColumnStore's capabilities, such as:
Using a custom query planner
Selecting data by column instead of by row
Parallel query evaluation
Distributed aggregations
Distributed functions
Extent elimination
As a smart storage engine, the ColumnStore storage engine plugin tightly integrates Enterprise ColumnStore with ES, but it has enough independence to efficiently execute analytical queries using a completely unique approach.
For additional information, see "ColumnStore Storage Engine".
The Cluster Management API (CMAPI) server provides a REST API that can be used to configure and manage Enterprise ColumnStore.
CMAPI must run on every ColumnStore node in a multi-node deployment but is not required in a single-node deployment.
The REST API can be used to perform multiple operations:
Add ColumnStore nodes
Remove ColumnStore nodes
Start Enterprise ColumnStore
Shutdown Enterprise ColumnStore
Check the status of Enterprise ColumnStore
MariaDB Enterprise ColumnStore leverages MariaDB MaxScale as an advanced database proxy and query router.
Multi-node Enterprise ColumnStore deployments must have one or more MaxScale nodes. MaxScale performs many different roles:
Routing write queries to the primary server
Load balancing read queries on replica servers
Monitoring node health
Performing automatic failover if a node fails
MariaDB Enterprise ColumnStore's storage architecture provides a columnar storage engine with high availability, fault tolerance, compression, and automatic partitioning for production analytics and data warehousing.
For additional information, see "MariaDB Enterprise ColumnStore Storage Architecture".
MariaDB Enterprise ColumnStore is a columnar storage engine for MariaDB Enterprise Server. MariaDB Enterprise ColumnStore enables ES to perform analytical workloads, including online analytical processing (OLAP), data warehousing, decision support systems (DSS), and hybrid transactional-analytical processing (HTAP) workloads.
Most traditional relational databases use row-based storage engines. In row-based storage engines, all columns for a table are stored contiguously. Row-based storage engines perform very well for transactional workloads, but are less performant for analytical workloads.
Columnar storage engines store each column separately. Columnar storage engines perform very well for analytical workloads. Analytical workloads are characterized by ad hoc queries on very large data sets by relatively few users.
MariaDB Enterprise ColumnStore automatically partitions each column into extents, which helps improve query performance without using indexes.
MariaDB Enterprise ColumnStore supports S3-compatible object storage.
S3-compatible object storage is optional, but highly recommended. If S3-compatible object storage is used, Enterprise ColumnStore requires the Storage Manager directory to use Shared Local Storage (such as NFS) for high availability.
S3-compatible object storage is:
Compatible: Many object storage services are compatible with the Amazon S3 API.
Economical: S3-compatible object storage is often very low cost.
Flexible: S3-compatible object storage is available for both cloud and on-premises deployments.
Limitless: S3-compatible object storage is often virtually limitless.
Resilient: S3-compatible object storage is often low maintenance and highly available, since many services use resilient cloud infrastructure.
Scalable: S3-compatible object storage is often highly optimized for read and write scaling.
Secure: S3-compatible object storage is often encrypted-at-rest.
Many S3-compatible object storage services exist. MariaDB Corporation cannot make guarantees about all S3-compatible object storage services, because different services provide different functionality.
If you have any questions about using specific S3-compatible object storage with MariaDB Enterprise ColumnStore, contact us.
MariaDB Enterprise ColumnStore can use shared local storage.
Shared local storage is required for high availability. The specific Shared Local Storage requirements depend on whether Enterprise ColumnStore is configured to use S3-compatible object storage:
When S3-compatible object storage is used, Enterprise ColumnStore requires the Storage Manager directory to use shared local storage for high availability.
When S3-compatible object storage is not used, Enterprise ColumnStore requires the DB Root directories to use shared local storage for high availability.
The most common shared local storage options for on-premises and cloud deployments are:
NFS (Network File System)
GlusterFS
The most common shared local storage options for AWS (Amazon Web Services) deployments are:
EBS (Elastic Block Store) Multi-Attach
EFS (Elastic File System)
The most common shared local storage option for GCP (Google Cloud Platform) deployments is:
Filestore
MariaDB Enterprise ColumnStore uses distributed query execution and massively parallel processing (MPP) techniques to achieve vertical and horizontal scalability for production analytics and data warehousing.
For additional information, see "MariaDB Enterprise ColumnStore Query Evaluation".
MariaDB Enterprise ColumnStore uses extent elimination to scale query evaluation as table size increases.
Most databases are row-based databases that use manually-created indexes to achieve high performance on large tables. This works well for transactional workloads. However, analytical queries tend to have very low selectivity, so traditional indexes are not typically effective for analytical queries.
Enterprise ColumnStore uses extent elimination to achieve high performance, without requiring manually created indexes. Enterprise ColumnStore automatically partitions all data into extents. Enterprise ColumnStore stores the minimum and maximum values for each extent in the extent map. Enterprise ColumnStore uses the minimum and maximum values in the extent map to perform extent elimination.
When Enterprise ColumnStore performs extent elimination, it compares the query's join conditions and filter conditions (i.e., WHERE clause) to the minimum and maximum values for each extent in the extent map. If the extent's minimum and maximum values fall outside the bounds of the query's conditions, Enterprise ColumnStore skips that extent for the query.
Extent elimination is automatically performed for every query. It can significantly decrease I/O for columns with clustered values. For example, extent elimination works effectively for series, ordered, patterned, and time-based data.
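For illustration, consider a range filter on time-based data: any extent whose stored minimum and maximum dates fall entirely outside the requested range is skipped without being read. A minimal sketch, reusing the hypothetical sales_data table from the quickstart:
-- Extent elimination: only extents whose min/max values for sale_date
-- overlap January 2023 are scanned; all other extents are skipped.
SELECT COUNT(*)
FROM sales_data
WHERE sale_date BETWEEN '2023-01-01' AND '2023-01-31';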
The ColumnStore storage engine plugin implements a custom select handler to fully take advantage of Enterprise ColumnStore's capabilities.
All storage engines interact with ES using an internal handler API, which is highly extensible. Storage engines can implement different features by implementing different methods within the handler API.
For select statements, the handler API transforms each query into a SELECT_LEX object, which is provided to the select handler.
The generic select handler is not optimal for Enterprise ColumnStore, because:
Enterprise ColumnStore selects data by column, but the generic select handler selects data by row
Enterprise ColumnStore supports parallel query evaluation, but the generic select handler does not
Enterprise ColumnStore supports distributed aggregations, but the generic select handler does not
Enterprise ColumnStore supports distributed functions, but the generic select handler does not
Enterprise ColumnStore supports extent elimination, but the generic select handler does not
Enterprise ColumnStore has its own query planner, but the generic select handler cannot use it
The ColumnStore storage engine plugin is known as a smart storage engine, because it implements a custom select handler. MariaDB Enterprise ColumnStore integrates with MariaDB Enterprise Server using the ColumnStore storage engine plugin. The ColumnStore storage engine plugin enables MariaDB Enterprise Server to interact with ColumnStore tables.
If a storage engine implements a custom select handler, it is known as a smart storage engine.
As a smart storage engine, the ColumnStore storage engine plugin tightly integrates Enterprise ColumnStore with ES, but it has enough independence to efficiently execute analytical queries using a completely unique approach.
The ColumnStore storage engine plugin is a smart storage engine, so MariaDB Enterprise ColumnStore is able to plan its own queries using the custom select handler.
MariaDB Enterprise ColumnStore's query planning is divided into two steps:
ES provides the query's SELECT_LEX object to the custom select handler. The custom select handler builds a ColumnStore Execution Plan (CSEP).
The custom select handler provides the CSEP to the ExeMgr process on the same node. The ExeMgr process performs extent elimination and creates a job list.
When Enterprise ColumnStore executes a query, the ExeMgr process on the initiator/aggregator node translates the ColumnStore execution plan (CSEP) into a job list. A job list is a sequence of job steps.
Enterprise ColumnStore uses many different types of job steps that provide different scalability benefits:
Some types of job steps perform operations in a distributed manner using multiple nodes to operate on different extents. Distributed operations provide horizontal scalability.
Some types of job steps perform operations in a multi-threaded manner using a thread pool. Performing multi-threaded operations provides vertical scalability.
As you increase the number of ColumnStore nodes or the number of cores on each node, Enterprise ColumnStore can use those resources to more efficiently execute job steps.
MariaDB Enterprise ColumnStore leverages common technologies to provide highly available production analytics with automatic failover:
S3-compatible object storage: • HA for data • Optional
Shared local storage: • With S3: HA for the Storage Manager directory • Without S3: HA for the DB Root directories
MariaDB Replication: • Schema replication (ColumnStore tables) • Schema and data replication (non-ColumnStore tables) • Database object replication
MariaDB MaxScale: • Monitoring • Automatic failover • Load balancing
Cluster Management API (CMAPI) server: • REST API • Administration • Add nodes • Remove nodes
MariaDB Enterprise ColumnStore can use shared local storage.
Shared local storage is required for high availability. The specific Shared Local Storage requirements depend on whether Enterprise ColumnStore is configured to use S3-compatible object storage:
When S3-compatible object storage is used, Enterprise ColumnStore requires the Storage Manager directory to use shared local storage for high availability.
When S3-compatible object storage is not used, Enterprise ColumnStore requires the DB Root directories to use Shared Local Storage for high availability.
The most common shared local storage options for on-premises and cloud deployments are:
NFS (Network File System)
GlusterFS
The most common shared local storage options for AWS (Amazon Web Services) deployments are:
EBS (Elastic Block Store) Multi-Attach
EFS (Elastic File System)
The most common shared local storage option for GCP (Google Cloud Platform) deployments is:
Filestore
MariaDB Enterprise ColumnStore requires MariaDB Replication to synchronize various database objects on multiple nodes for high availability.
MariaDB replication synchronizes:
The schemas for all ColumnStore tables on all nodes
The schemas and data for all non-ColumnStore tables on all nodes
All other database objects (stored procedures, stored functions, user accounts, and other objects) on all nodes
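As a sketch of what this means in practice (using a hypothetical page_views table): a CREATE TABLE statement issued on the primary node is written to the binary log and replayed on every replica, so the ColumnStore table's schema exists on all nodes, while the table data itself is managed by ColumnStore's own storage layer rather than by replication.
-- On the primary node: the schema change is replicated to all nodes
CREATE TABLE page_views (
    view_date DATE,
    url VARCHAR(255),
    visits INT
) ENGINE=ColumnStore;

-- On a replica node: the replicated schema is visible
SHOW CREATE TABLE page_views;

-- Check replication health on a replica
SHOW REPLICA STATUS\G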
MariaDB Enterprise ColumnStore requires MariaDB MaxScale to achieve high availability, automatic failover, and load balancing.
MariaDB Monitor (mariadbmon) in MaxScale monitors the health of each Enterprise ColumnStore node.
MaxScale provides load balancing by routing queries and/or connections to healthy nodes by:
Providing query-based routing using Read/Write Split Router (readwritesplit).
Providing connection-based routing using Read Connection Router (readconnroute).
When MaxScale's MariaDB Monitor notices the primary node fail, MariaDB Monitor performs automatic failover by:
Promoting a replica node to become the new primary node.
Re-configuring all replica nodes to replicate from the new primary node.
MariaDB Enterprise ColumnStore requires the Cluster Management API (CMAPI) server for high availability.
The CMAPI server provides a REST API that can be used to manage and configure Enterprise ColumnStore.
The CMAPI server has a role in automatic failover. After MaxScale performs automatic failover, the CMAPI server detects the topology change and automatically re-configures the roles of each Enterprise ColumnStore node.
MariaDB Enterprise ColumnStore performs bulk data loads very efficiently using a variety of mechanisms including the cpimport tool, specialized handling of certain SQL statements, and minimal locking during data import.
For additional information, see "MariaDB Enterprise ColumnStore Data Loading".
MariaDB Enterprise ColumnStore includes a bulk data loading tool called cpimport, which provides several benefits:
Bypasses the SQL layer to decrease overhead
Does not block read queries
Requires a write metadata lock on the table, which can be monitored with the METADATA_LOCK_INFO plugin.
Appends the new data to the table. While the bulk load is in progress, the newly appended data is temporarily hidden from queries. After the bulk load is complete, the newly appended data is visible to queries.
Inserts each row in the order the rows are read from the source file. Users can optimize data loads for Enterprise ColumnStore's automatic partitioning by loading presorted data files. For additional information, see "Load Ordered Data in Proper Order".
Supports parallel distributed bulk loads
Imports data from text files
Imports data from binary files
Imports data from standard input (stdin)
MariaDB Enterprise ColumnStore enables batch insert mode by default.
When batch insert mode is enabled, MariaDB Enterprise ColumnStore has special handling for LOAD DATA INFILE and INSERT INTO ... SELECT statements.
Enterprise ColumnStore uses the following rules:
If the statement is executed outside of a transaction, Enterprise ColumnStore loads the data using cpimport, which is a command-line utility that is designed to efficiently load data in bulk. Enterprise ColumnStore executes cpimport using a wrapper called cpimport.bin.
If the statement is executed inside of a transaction, Enterprise ColumnStore loads the data using the DML interface, which is slower.
Batch insert mode can be disabled by setting the columnstore_use_import_for_batchinsert system variable to OFF. When batch insert mode is disabled, Enterprise ColumnStore executes these statements using the DML interface, which is slower.
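The difference between the two code paths can be seen in a minimal sketch, assuming a hypothetical staging_sales table with the same structure as sales_data:
-- Outside a transaction (autocommit enabled): loaded internally through cpimport
INSERT INTO sales_data
SELECT * FROM staging_sales;

-- Inside an explicit transaction: loaded through the slower DML interface
START TRANSACTION;
INSERT INTO sales_data
SELECT * FROM staging_sales;
COMMIT;

-- Disable batch insert mode for the current session
SET SESSION columnstore_use_import_for_batchinsert = OFF;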
MariaDB Enterprise ColumnStore requires a write metadata lock (MDL) on the table when a bulk data load is performed with cpimport.
When a bulk data load is running:
Read queries will not be blocked.
Write queries and concurrent bulk data loads on the same table will be blocked until the bulk data load operation is complete, and the write metadata lock on the table has been released.
The write metadata lock (MDL) can be monitored with the METADATA_LOCK_INFO plugin.
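A minimal sketch of monitoring the lock, assuming the metadata_lock_info plugin library is available on the server:
-- Install the plugin once
INSTALL SONAME 'metadata_lock_info';

-- While a bulk load is running, the write metadata lock on the target
-- table appears in this Information Schema table
SELECT * FROM information_schema.METADATA_LOCK_INFO;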
MariaDB Enterprise ColumnStore supports backup and restore using well-known tools and methods.
The backup method depends on the component:
S3-compatible object storage: • S3 snapshot
Shared local storage (Storage Manager directory or DB Root directories): • File system snapshot • File copy
MariaDB Enterprise Server data directory: • MariaDB Enterprise Backup
For additional information, see "MariaDB Enterprise ColumnStore Backup and Restore".
MariaDB Enterprise ColumnStore can leverage S3 snapshots to backup S3-compatible object storage when it is used for Enterprise ColumnStore's data.
The S3-compatible object storage can be backed up by:
Locking the database on the primary node
Performing an S3 snapshot using the vendor's standard snapshot functionality
MariaDB Enterprise ColumnStore can leverage file system snapshots or file copy tools (such as rsync) to backup shared local storage when it is used for the Storage Manager directory or the DB Root directories.
The shared local storage can be backed up by:
Locking the database on the primary node
Performing a file system snapshot or using a file copy tool (such as rsync) to copy the contents of the Storage Manager directory and/or the DB Root directories.
MariaDB Enterprise ColumnStore can leverage the standard MariaDB Enterprise Backup utility to backup the Enterprise Server data directory.
The backup contains:
All ColumnStore schemas
All non-ColumnStore schemas and data
All other database objects
It does not contain:
ColumnStore data
Clients issue a query to the MariaDB Server, which has the ColumnStore storage engine installed. MariaDB Server parses the SQL, identifies the involved ColumnStore tables, and creates an initial logical query execution plan.
Using the ColumnStore storage engine interface (ha_columnstore), MariaDB Server converts involved table references into ColumnStore internal objects. These are then handed off to the ExeMgr, which is responsible for managing and orchestrating query execution across the cluster.
The ExeMgr analyzes the query plan and translates it into a distributed ColumnStore execution plan. It determines the necessary query steps and the execution order, including any required parallelization.
The ExeMgr then references the extent map to identify which PrimProc instances hold the relevant data segments. It applies extent elimination to exclude any PrimProc nodes whose extents do not match the query's filter criteria.
The ExeMgr dispatches commands to the selected PrimProc instances to perform data block I/O operations.
The PrimProc components perform operations such as:
Predicate filtering
Join processing
Initial aggregation
Data retrieval from local disk or external storage (e.g., S3 or cloud object storage)
They then return intermediate result sets back to the ExeMgr.
The ExeMgr handles:
Final-stage aggregation
Window function evaluation
Result-set sorting and shaping
The completed result set is returned to the MariaDB Server, which performs any remaining SQL operations like ORDER BY, LIMIT, or computed expressions in the SELECT list.
Finally, the MariaDB Server returns the result set to the client.
MariaDB Enterprise ColumnStore's storage architecture is designed to provide great performance for analytical queries.
MariaDB Enterprise ColumnStore is a columnar storage engine for MariaDB Enterprise Server. MariaDB Enterprise ColumnStore enables ES to perform analytical workloads, including online analytical processing (OLAP), data warehousing, decision support systems (DSS), and hybrid transactional-analytical processing (HTAP) workloads.
Most traditional relational databases use row-based storage engines. In row-based storage engines, all columns for a table are stored contiguously. Row-based storage engines perform very well for transactional workloads, but are less performant for analytical workloads.
Columnar storage engines store each column separately. Columnar storage engines perform very well for analytical workloads. Analytical workloads are characterized by ad hoc queries on very large data sets by relatively few users.
MariaDB Enterprise ColumnStore automatically partitions each column into extents, which helps improve query performance without using indexes.
MariaDB Enterprise ColumnStore enables MariaDB Enterprise Server to perform analytical or online analytical processing (OLAP) workloads.
OLAP workloads are generally characterized by ad hoc queries on very large data sets. Some other typical characteristics are:
Each query typically reads a subset of columns in the table
Most activity typically consists of read-only queries that perform aggregations, window functions, and various calculations
Analytical applications typically require only a few concurrent queries
Analytical applications typically require scalability of large, complex queries
Analytical applications typically require efficient bulk loads of new data
OLAP workloads are typically required for:
Business intelligence (BI)
Health informatics
Historical data mining
Row-based storage engines have a disadvantage for OLAP workloads. Indexes are not usually very useful for OLAP workloads, because the large size of the data set and the ad hoc nature of the queries preclude the use of indexes to optimize queries.
Columnar storage engines are much better suited for OLAP workloads. MariaDB Enterprise ColumnStore is a columnar storage engine that is designed for OLAP workloads:
When a query reads a subset of columns in the table, Enterprise ColumnStore is able to reduce I/O by reading those columns and ignoring all others, because each column is stored separately
When most activity consists of read-only queries that perform aggregations, window functions, and various calculations, Enterprise ColumnStore is able to efficiently execute those queries using extent elimination, distributed query execution, and massively parallel processing (MPP) techniques
When only a few concurrent queries are required, Enterprise ColumnStore is able to maximize the use of system resources by using multiple threads and multiple nodes to perform work for each query
When scalability of large, complex queries is required, Enterprise ColumnStore is able to achieve horizontal and vertical scalability using distributed query execution and massively parallel processing (MPP) techniques
When efficient bulk loads of new data are required, Enterprise ColumnStore is able to bulk load new data without affecting existing data using automatic partitioning with the extent map
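For instance, the kind of read-mostly aggregation and window-function query that benefits from this design might look like the following sketch, reusing the hypothetical sales_data table from the quickstart:
-- Rank categories by revenue for a month; only the referenced columns
-- are read from disk, and the aggregation can be distributed across nodes
SELECT
    category,
    SUM(quantity * price) AS revenue,
    RANK() OVER (ORDER BY SUM(quantity * price) DESC) AS revenue_rank
FROM sales_data
WHERE sale_date BETWEEN '2023-01-01' AND '2023-01-31'
GROUP BY category
ORDER BY revenue_rank;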
MariaDB Enterprise Server has had excellent performance for transactional or online transactional processing (OLTP) workloads since the beginning.
OLTP workloads are generally characterized by a fixed set of queries using a relatively small data set. Some other typical characteristics are:
Each query typically reads and/or writes many columns in the table
Most activity typically consists of small transactions that only read and/or write a small number of rows
Transactional applications typically require many concurrent transactions
Transactional applications typically require a fast response time and low latency
Transactional applications typically require ACID properties to protect data
OLTP workloads are typically required for:
Financial transactions performed by financial institutions and e-commerce sites
Store inventory changes performed by brick-and-mortar stores and e-commerce sites
Account metadata changes performed by many sites that store personal data
Row-based storage engines have several advantages for OLTP workloads:
When a query reads and/or writes many columns in the table, row-based storage engines can find all columns on a single page, so the I/O costs of the operation are low
When a transaction reads/writes a small number of rows, row-based storage engines can use an index to find the page for each row without a full table scan
When many concurrent transactions are operating, row-based storage engines can implement transactional isolation by storing multiple versions of changed rows
When a fast response time and low latency are required, row-based storage engines can use indexes to optimize the most common queries
When ACID properties are required, row-based storage engines can implement consistency and durability with fewer performance trade-offs, since each row's columns are stored contiguously
InnoDB is ES's default storage engine, and it is a highly performant row-based storage engine.
MariaDB Enterprise ColumnStore enables MariaDB Enterprise Server to function as a single-stack solution for Hybrid transactional-analytical processing (HTAP) workloads.
Hybrid workloads are characterized by a mix of transactional and analytical queries. Hybrid workloads are also known as "Smart Transactions", "Augmented Transactions", "Translytical", or "Hybrid Operational-Analytical Processing (HOAP)".
Hybrid workloads are typically required for applications that require real-time analytics that lead to immediate action:
Financial institutions use transactional queries to handle financial transactions and analytical queries to analyze the transactions for business intelligence
Insurance companies use transactional queries to accept/process claims and analytical queries to analyze those claims for business opportunities or risks
Health providers use transactional queries to track electronic health records (EHR) and analytical queries to analyze the EHRs to discover health trends or prevent adverse drug interactions
MariaDB Enterprise Server provides multiple components to perform hybrid workloads:
For analytical queries, the Enterprise ColumnStore storage engine can be used.
For transactional queries, row-based storage engines, such as InnoDB, can be used.
For queries that reference both analytical and transactional data, ES's cross-engine join functionality can be used to join Enterprise ColumnStore tables with InnoDB tables.
MariaDB MaxScale is a high-performance database proxy, which can dynamically route analytical queries to Enterprise ColumnStore and transactional queries to the transactional storage engine.
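A minimal sketch of such a hybrid setup, using hypothetical customers (InnoDB) and sales_facts (ColumnStore) tables; note that Enterprise ColumnStore deployments may require cross-engine support to be configured before cross-engine joins can be executed:
-- Transactional table on a row-based storage engine
CREATE TABLE customers (
    customer_id INT PRIMARY KEY,
    customer_name VARCHAR(100),
    region VARCHAR(50)
) ENGINE=InnoDB;

-- Analytical fact table on the columnar storage engine
CREATE TABLE sales_facts (
    customer_id INT,
    sale_date DATE,
    amount DECIMAL(10, 2)
) ENGINE=ColumnStore;

-- Cross-engine join between the InnoDB and ColumnStore tables
SELECT c.region, SUM(s.amount) AS total_sales
FROM sales_facts s
JOIN customers c ON c.customer_id = s.customer_id
GROUP BY c.region;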
MariaDB Enterprise ColumnStore supports multiple storage types:
S3-Compatible Object Storage: • Optional, but recommended. • Enterprise ColumnStore can use S3-compatible object storage to store data. • With multi-node Enterprise ColumnStore, the Storage Manager directory should use shared local storage for high availability.
Shared Local Storage: • Required for multi-node Enterprise ColumnStore with high availability. • Enterprise ColumnStore can use shared local storage to store data and metadata. • If S3-compatible storage is used for data, the shared local storage will only be used for the Storage Manager directory.
Non-Shared Local Storage: • Appropriate for single-node Enterprise ColumnStore. • Enterprise ColumnStore can use non-shared local storage to store data and metadata.
MariaDB Enterprise ColumnStore supports S3-compatible object storage.
S3-compatible object storage is optional, but highly recommended. If S3-compatible object storage is used, Enterprise ColumnStore requires the Storage Manager directory to use Shared Local Storage (such as NFS) for high availability.
S3-compatible object storage is:
Compatible: Many object storage services are compatible with the Amazon S3 API.
Economical: S3-compatible object storage is often very low cost.
Flexible: S3-compatible object storage is available for both cloud and on-premises deployments.
Limitless: S3-compatible object storage is often virtually limitless.
Resilient: S3-compatible object storage is often low maintenance and highly available, since many services use resilient cloud infrastructure.
Scalable: S3-compatible object storage is often highly optimized for read and write scaling.
Secure: S3-compatible object storage is often encrypted-at-rest.
Many S3-compatible object storage services exist. MariaDB Corporation cannot make guarantees about all S3-compatible object storage services, because different services provide different functionality.
If you have any questions about using specific S3-compatible object storage with MariaDB Enterprise ColumnStore, contact us.
MariaDB Enterprise ColumnStore can use any object store that is compatible with the Amazon S3 API.
Many object storage services are compatible with the Amazon S3 API, and compatible object storage services are available for cloud deployments and on-premises deployments, so vendor lock-in is not a concern.
MariaDB Enterprise ColumnStore's Storage Manager enables remote S3-compatible object storage to be efficiently used. The Storage Manager uses a persistent local disk cache for read/write operations, so that network latency has minimal performance impact on Enterprise ColumnStore. In some cases, it will even perform better than local disk operations.
Enterprise ColumnStore only uses the Storage Manager when S3-compatible storage is used for data.
Storage Manager is configured using storagemanager.cnf.
MariaDB Enterprise ColumnStore's Storage Manager directory is at the following path by default:
/var/lib/columnstore/storagemanager
To enable high availability when S3-compatible object storage is used, the Storage Manager directory should use Shared Local Storage and be mounted on every ColumnStore node.
When you want to use S3-compatible storage for Enterprise ColumnStore, you must configure Enterprise ColumnStore's S3 Storage Manager to use S3-compatible storage.
To configure Enterprise ColumnStore to use S3-compatible storage, edit /etc/columnstore/storagemanager.cnf:
[ObjectStorage]
…
service = S3
…
[S3]
region = your_columnstore_bucket_region
bucket = your_columnstore_bucket_name
endpoint = your_s3_endpoint
aws_access_key_id = your_s3_access_key_id
aws_secret_access_key = your_s3_secret_key
# iam_role_name = your_iam_role
# sts_region = your_sts_region
# sts_endpoint = your_sts_endpoint
# ec2_iam_mode=enabled
# port_number = your_port_number
[Cache]
cache_size = your_local_cache_size
path = your_local_cache_path
The S3-compatible object storage options are configured under [S3]:
The bucket option must be set to the name of the bucket.
The endpoint option must be set to the endpoint for the S3-compatible object storage.
The aws_access_key_id and aws_secret_access_key options must be set to the access key ID and secret access key for the S3-compatible object storage.
To use a specific IAM role, you must uncomment and set iam_role_name, sts_region, and sts_endpoint.
To use the IAM role assigned to an EC2 instance, you must uncomment ec2_iam_mode=enabled.
To use a non-default port number, you must set port_number to the desired port.
The local cache options are configured under [Cache]:
The cache_size option is set to 2 GB by default.
The path option is set to /var/lib/columnstore/storagemanager/cache by default.
Ensure that the specified path has sufficient storage space for the specified cache size.
MariaDB Enterprise ColumnStore can use shared local storage.
Shared local storage is required for high availability. The specific Shared Local Storage requirements depend on whether Enterprise ColumnStore is configured to use S3-compatible object storage:
When S3-compatible object storage is used, Enterprise ColumnStore requires the Storage Manager directory to use shared local storage for high availability.
When S3-compatible object storage is not used, Enterprise ColumnStore requires the DB Root directories to use shared local storage for high availability.
The most common shared local storage options for on-premises and cloud deployments are:
NFS (Network File System)
GlusterFS
The most common shared local storage options for AWS (Amazon Web Services) deployments are:
EBS (Elastic Block Store) Multi-Attach
EFS (Elastic File System)
The most common shared local storage option for GCP (Google Cloud Platform) deployments is:
Filestore
The most common options for shared local storage are:
EBS (Elastic Block Store) Multi-Attach: • EBS is a high-performance block-storage service for AWS (Amazon Web Services). • EBS Multi-Attach allows an EBS volume to be attached to multiple instances in AWS. Only clustered file systems, such as GFS2, are supported. • For deployments in AWS, EBS Multi-Attach is a recommended option for the Storage Manager directory, and Amazon S3 storage is the recommended option for data.
EFS (Elastic File System): • EFS is a scalable, elastic, cloud-native NFS file system for AWS (Amazon Web Services). • For deployments in AWS, EFS is a recommended option for the Storage Manager directory, and Amazon S3 storage is the recommended option for data.
Filestore: • Filestore is high-performance, fully managed storage for GCP (Google Cloud Platform). • For deployments in GCP, Filestore is the recommended option for the Storage Manager directory, and Google Object Storage (S3-compatible) is the recommended option for data.
NFS (Network File System): • NFS is a distributed file system. • If NFS is used, the storage should be mounted with the sync option to ensure that each node flushes its changes immediately. • For on-premises deployments, NFS is the recommended option for the Storage Manager directory, and any S3-compatible storage is the recommended option for data.
GlusterFS: • GlusterFS is a distributed file system. • GlusterFS supports replication and failover.
Multi-node MariaDB Enterprise ColumnStore requires some directories to use shared local storage for high availability. The specific requirements depend on whether MariaDB Enterprise ColumnStore is configured to use S3-compatible object storage:
With S3-compatible object storage: the Storage Manager directory must use shared local storage.
Without S3-compatible object storage: the DB Root directories must use shared local storage.
For best results, MariaDB Corporation recommends the following storage options:
AWS: Amazon S3 storage for data, with EBS Multi-Attach or EFS for the Storage Manager directory
GCP: Google Object Storage (S3-compatible) for data, with Filestore for the Storage Manager directory
On-premises: Any S3-compatible object storage for data, with NFS for the Storage Manager directory
MariaDB Enterprise ColumnStore's storage format is optimized for analytical queries.
MariaDB Enterprise ColumnStore stores data in DB Root directories when S3-compatible object storage is not configured.
In multi-node Enterprise ColumnStore, each node has its own DB Root directory.
The DB Root directories are at the following path by default:
/var/lib/columnstore/dataN
The N in dataN represents a range of integers that starts at 1 and stops at the number of nodes in the deployment. For example, with a 3-node Enterprise ColumnStore deployment, this would refer to the following directories:
/var/lib/columnstore/data1
/var/lib/columnstore/data2
/var/lib/columnstore/data3
To enable high availability for the DB Root directories, each directory should be mounted on every ColumnStore node using Shared Local Storage.
Each column in a table is stored in units called extents.
By default, each extent contains the column values for 8 million rows. The physical size of each extent can range from 8 MB to 64 MB. When an extent reaches the maximum number of column values, Enterprise ColumnStore creates a new extent.
Each extent is stored in 8 KB blocks, and each block has a logical block identifier (LBID).
If a string column is wider than 8 characters, the values are stored in a separate dictionary file, and pointers to the values are stored in the extent.
A segment file is used to store Enterprise ColumnStore data within a DB Root directory.
By default, a segment file contains two extents. When a segment file reaches its maximum size, Enterprise ColumnStore creates a new segment file.
The relevant configuration option is:
ExtentsPerSegmentFile: • Configures the maximum number of extents that can be stored in each segment file. • Default value is 2.
For example, to configure Enterprise ColumnStore to store more extents in each segment file using the mcsSetConfig utility:
$ mcsSetConfig ExtentMap ExtentsPerSegmentFile 4
Enterprise ColumnStore automatically groups a column's segment files into column partitions.
On disk, each column partition is represented by a directory in the DB Root. The directory contains the segment files for the column partition.
By default, a column partition can contain four segment files, but you can configure Enterprise ColumnStore to store more segment files in each column partition. When a column partition reaches the maximum number of segment files, Enterprise ColumnStore creates a new column partition.
The relevant configuration option is:
FilesPerColumnPartition: • Configures the maximum number of segment files that can be stored in each column partition. • Default value is 4.
For example, to configure Enterprise ColumnStore to store more segment files in each column partition using the mcsSetConfig utility:
$ mcsSetConfig ExtentMap FilesPerColumnPartition 8
Enterprise ColumnStore maintains an Extent Map to determine which values are located in each extent.
The Extent Map identifies each extent using its logical block identifier (LBID) values, and it maintains the minimum and maximum values within each extent.
The Extent Map is used to implement a performance optimization called Extent Elimination.
The primary node has a master copy of the Extent Map. When Enterprise ColumnStore is started, the primary node copies the Extent Map to the replica nodes.
While Enterprise ColumnStore is running, each node maintains a copy of the Extent Map in its main memory, so that it can be accessed quickly without additional I/O.
If the Extent Map gets corrupted, the mcsRebuildEM utility can rebuild the Extent Map from the contents of the database file system. The mcsRebuildEM utility is available starting in MariaDB Enterprise ColumnStore 6.2.2.
Enterprise ColumnStore automatically compresses all data on disk using either snappy or LZ4 compression. See the columnstore_compression_type system variable for how to select the desired compression type.
Since Enterprise ColumnStore stores a single column's data in each segment file, the data in each segment file tends to be very similar. The similar data usually allows for excellent compressibility. However, the specific data compression ratio will depend on a lot of factors, such as the randomness of the data and the number of distinct values.
Enterprise ColumnStore's compression strategy is tuned to optimize the performance of I/O-bound queries, because the decompression rate is optimized to maximize read performance.
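A sketch of inspecting and changing the compression setting; the accepted values differ between ColumnStore versions (numeric codes in older releases, names such as an LZ4 setting in newer ones), so the literal below is an assumption to verify against your release:
-- Check the current compression setting
SHOW VARIABLES LIKE 'columnstore_compression_type';

-- Assumed value name; applies to tables created after the change
SET SESSION columnstore_compression_type = 'LZ4';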
Enterprise ColumnStore uses the version buffer to store blocks that are being modified.
The version buffer is used for multiple tasks:
It is used to rollback a transaction.
It is used for multi-version concurrency control (MVCC). With MVCC, Enterprise ColumnStore can implement read snapshots, which allows a statement to have a consistent view of the database, even if some of the underlying rows have changed. The snapshot for a given statement is identified by the system change number (SCN).
The version buffer is split between data structures that are in-memory and on-disk.
The in-memory data structures are hash tables that keep track of in-flight transactions. The hash tables store the LBIDs for each block that is being modified by a transaction. The in-memory hash tables start at 4 MB, and they grow as needed. The size of the hash tables increases as the number of modified blocks increases.
An on-disk version buffer file is stored in each DB Root. By default, the on-disk version buffer file is 1 GB, but you can configure Enterprise ColumnStore to use a different file size. The relevant configuration option is:
VersionBufferFileSize: • Configures the size of the on-disk version buffer file in each DB Root. • Default value is 1 GB.
For example, to configure Enterprise ColumnStore to use a larger on-disk version buffer file using the mcsSetConfig utility:
$ mcsSetConfig VersionBuffer VersionBufferFileSize 2GB
Using the Extent Map, ColumnStore can perform logical range partitioning and retrieve only the blocks needed to satisfy the query. This is done through Extent Elimination: the process of skipping extents that cannot satisfy the query's join and filter conditions, which reduces overall I/O.
In Extent Elimination, ColumnStore scans the columns in the join and filter conditions and extracts the logical horizontal partitioning information of each extent, along with the minimum and maximum values for the column. When a column scan involves a filter, that filter is compared to the minimum and maximum values stored in each extent for the column. If the filter value falls outside the extent's minimum and maximum value range, ColumnStore eliminates the extent.
This behavior is automatic and well suited for series, ordered, patterned and time-based data, where the data is loaded frequently and often referenced by time. Any column with clustered values is a good candidate for Extent Elimination.
When using ColumnStore, MariaDB Server creates a series of system databases used for operational purposes.
calpontsys: Maintains table metadata about ColumnStore tables.
infinidb_querystats: Maintains information about query performance. For more information, see Query Analysis.
columnstore_info: Contains the stored procedures used to retrieve information about ColumnStore usage. For more information, see the ColumnStore Information Schema tables.
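For example, the stored procedures in the columnstore_info database can report storage usage. The procedure names below (total_usage and table_usage) are the commonly documented ones; verify them against your ColumnStore version:
-- Total disk space used by ColumnStore data
CALL columnstore_info.total_usage();

-- Disk usage for a single table (hypothetical schema and table names)
CALL columnstore_info.table_usage('mydb', 'sales_data');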
MariaDB Enterprise ColumnStore minimizes locking for analytical workloads, bulk data loads, and online schema changes.
MariaDB Enterprise ColumnStore supports lockless reads.
MariaDB Enterprise ColumnStore requires a table lock for write operations.
MariaDB Enterprise ColumnStore requires a write metadata lock (MDL) on the table when a bulk data load is performed with cpimport.
When a bulk data load is running:
Read queries will not be blocked.
Write queries and concurrent bulk data loads on the same table will be blocked until the bulk data load operation is complete, and the write metadata lock on the table has been released.
The write metadata lock (MDL) can be monitored with the METADATA_LOCK_INFO plugin.
For additional information, see "MariaDB Enterprise ColumnStore Data Loading".
MariaDB Enterprise ColumnStore supports online schema changes, so that supported DDL operations can be performed without blocking reads. The supported DDL operations only require a write metadata lock (MDL) on the target table.
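As a sketch, adding a column is one such supported operation; reads against the hypothetical sales_data table from the quickstart continue while the change is applied:
-- Online schema change: add a column without blocking reads
ALTER TABLE sales_data ADD COLUMN discount DECIMAL(10, 2);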
MariaDB Enterprise ColumnStore is a smart storage engine designed to efficiently execute analytical queries using distributed query execution and massively parallel processing (MPP) techniques.
MariaDB Enterprise ColumnStore is designed to achieve vertical and horizontal scalability for production analytics using distributed query execution and massively parallel processing (MPP) techniques.
Enterprise ColumnStore evaluates each query as a sequence of job steps using sophisticated techniques to get the best performance for complex analytical queries. Some types of job steps are designed to scale with the system's resources. As you increase the number of ColumnStore nodes or the number of cores on each node, Enterprise ColumnStore can use those resources to more efficiently execute those types of job steps.
Enterprise ColumnStore stores each column on disk in extents. The storage format is designed to maintain scalability, even as the table grows. If an operation does not read parts of a large table, I/O costs are reduced. Enterprise ColumnStore uses a technique called extent elimination that compares the maximum and minimum values in the extent map to the query's conditions, and it avoids scanning extents that don't satisfy the conditions.
Enterprise ColumnStore provides exceptional scalability for analytical queries. Enterprise ColumnStore's design supports targeted scale-out to address increased workload requirements, whether it is a larger query load or increased storage and query processing capacity.
MariaDB Enterprise ColumnStore provides horizontal scalability by executing some types of job steps in a distributed manner using multiple nodes.
When Enterprise ColumnStore is evaluating a job step, the ExeMgr process or facility on the initiator/aggregator node requests the PrimProc process on each node to perform the job step on different extents in parallel. As more nodes are added, Enterprise ColumnStore can perform more work in parallel.
Enterprise ColumnStore also uses massively parallel processing (MPP) techniques to speed up some types of job steps. For some types of aggregation operations, each node can perform an initial local aggregation, and then the initiator/aggregator node only needs to combine the local results and perform a final aggregation. This technique can be very efficient for some types of aggregation operations, such as for queries that use the AVG(), COUNT(), or SUM() aggregate functions.
MariaDB Enterprise ColumnStore provides vertical scalability by executing some types of job steps in a multi-threaded manner using a thread pool.
When the PrimProc process on a node receives work, it executes the job step on an extent in a multi-threaded manner using a thread pool. Each thread operates on a different block within the extent. As more CPUs are added, Enterprise ColumnStore can work on more blocks in parallel.
MariaDB Enterprise ColumnStore uses extent elimination to scale query evaluation as table size increases.
Most databases are row-based databases that use manually-created indexes to achieve high performance on large tables. This works well for transactional workloads. However, analytical queries tend to have very low selectivity, so traditional indexes are not typically effective for analytical queries.
Enterprise ColumnStore uses extent elimination to achieve high performance, without requiring manually created indexes. Enterprise ColumnStore automatically partitions all data into extents. Enterprise ColumnStore stores the minimum and maximum values for each extent in the extent map. Enterprise ColumnStore uses the minimum and maximum values in the extent map to perform extent elimination.
When Enterprise ColumnStore performs extent elimination, it compares the query's join conditions and filter conditions (i.e., WHERE clause) to the minimum and maximum values for each extent in the extent map. If the extent's minimum and maximum values fall outside the bounds of the query's conditions, Enterprise ColumnStore skips that extent for the query.
Extent elimination is automatically performed for every query. It can significantly decrease I/O for columns with clustered values. For example, extent elimination works effectively for series, ordered, patterned, and time-based data.
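For example, with time-based data such as a hypothetical orders table loaded roughly in date order, a range filter lets Enterprise ColumnStore skip every extent whose minimum and maximum order_date values fall outside the requested range:
SELECT COUNT(*), SUM(total_price)
FROM orders
WHERE order_date BETWEEN '2023-01-01' AND '2023-03-31';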
The ColumnStore storage engine plugin implements a custom select handler to fully take advantage of Enterprise ColumnStore's capabilities.
All storage engines interact with ES using an internal handler API, which is highly extensible. Storage engines can implement different features by implementing different methods within the handler API.
For select statements, the handler API transforms each query into a SELECT_LEX object, which is provided to the select handler.
The generic select handler is not optimal for Enterprise ColumnStore, because:
Enterprise ColumnStore selects data by column, but the generic select handler selects data by row
Enterprise ColumnStore supports parallel query evaluation, but the generic select handler does not
Enterprise ColumnStore supports distributed aggregations, but the generic select handler does not
Enterprise ColumnStore supports distributed functions, but the generic select handler does not
Enterprise ColumnStore supports extent elimination, but the generic select handler does not
Enterprise ColumnStore has its own query planner, but the generic select handler cannot use it
The ColumnStore storage engine plugin enables MariaDB Enterprise Server (ES) to interact with ColumnStore tables. A storage engine that implements a custom select handler is known as a smart storage engine. As a smart storage engine, the ColumnStore storage engine plugin tightly integrates Enterprise ColumnStore with ES, while retaining enough independence to efficiently execute analytical queries using a completely different approach.
The ColumnStore storage engine can use either the custom select handler or the generic select handler. The select handler can be configured using the columnstore_select_handler system variable:
AUTO: Enterprise ColumnStore automatically chooses the best select handler for a given SELECT query. AUTO was added in Enterprise ColumnStore 6.
OFF: Enterprise ColumnStore uses the generic select handler for all SELECT queries. It is not recommended to use this value, unless recommended by MariaDB Support.
ON: Enterprise ColumnStore uses the custom select handler for all SELECT queries. ON is the default in Enterprise ColumnStore 5 and Enterprise ColumnStore 6.
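A brief sketch of inspecting and changing the variable (this assumes a version where AUTO is available and that the variable is set dynamically rather than in the configuration file):
SHOW GLOBAL VARIABLES LIKE 'columnstore_select_handler';
SET GLOBAL columnstore_select_handler = 'AUTO';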
MariaDB Enterprise ColumnStore performs join operations using hash joins.
By default, hash joins are performed in memory.
MariaDB Enterprise ColumnStore can be configured to allocate more memory for hash joins.
The relevant configuration options are:
HashJoin PmMaxMemorySmallSide: Configures the amount of memory available for a single join. Valid values are from 0 to 4 GB. Default value is 1 GB.
HashJoin TotalUmMemory: Configures the amount of memory available for all joins. Values can be specified as a percentage of total system memory or as a specific amount of memory. Valid percentage values are from 0 to 100%. Default value is 25%.
For example, to configure Enterprise ColumnStore to use more memory for hash joins using the mcsSetConfig utility:
$ mcsSetConfig HashJoin PmMaxMemorySmallSide 2G
$ mcsSetConfig HashJoin TotalUmMemory '40%'
MariaDB Enterprise ColumnStore can be configured to perform disk-based joins.
The relevant configuration options are:
HashJoin AllowDiskBasedJoin: Enables disk-based joins. Valid values are Y and N. Default value is N.
HashJoin TempFileCompression: Enables compression for temporary files used by disk-based joins. Valid values are Y and N. Default value is N.
SystemConfig SystemTempFileDir: Configures the directory used for temporary files used by disk-based joins and aggregations. Default value is /tmp/columnstore_tmp_files.
For example, to configure Enterprise ColumnStore to perform disk-based joins using the mcsSetConfig utility:
$ mcsSetConfig HashJoin AllowDiskBasedJoin Y
$ mcsSetConfig HashJoin TempFileCompression Y
$ mcsSetConfig SystemConfig SystemTempFileDir /mariadb/tmp
MariaDB Enterprise ColumnStore performs aggregation operations on all nodes in a distributed manner, and then all nodes send their results to a single node, which combines the results and performs the final aggregation.
By default, aggregation operations are performed in memory.
In Enterprise ColumnStore 5.6.1 and later, disk-based aggregations can be configured.
The relevant configuration options are:
Section
Option
Description
RowAggregation
AllowDiskBasedAggregation
• Enables disk-based joins • Valid values are Y and N • Default value is N.
RowAggregation
Compression
• Enables compression for temporary files used by disk-based joins. • Valid values are Y and N • Default value is N.
SystemConfig
SystemTempFileDir
Configures the directory used for temporary files used by disk-based joins and aggregations • Default value is /tmp/columnstore_tmp_files
For example, to configure Enterprise ColumnStore to perform disk-based aggregations using the mcsSetConfig utility:
$ mcsSetConfig RowAggregation AllowDiskBasedAggregation Y
$ mcsSetConfig RowAggregation Compression SNAPPY
$ mcsSetConfig SystemConfig SystemTempFileDir /mariadb/tmp
The ColumnStore storage engine plugin is a smart storage engine, so MariaDB Enterprise ColumnStore is able to plan its own queries using the custom select handler.
MariaDB Enterprise ColumnStore's query planning is divided into two steps:
ES provides the query's SELECT_LEX object to the custom select handler. The custom select handler builds a ColumnStore Execution Plan (CSEP).
The custom select handler provides the CSEP to the ExeMgr process or facility on the same node, which acts as the initiator/aggregator node for the query. ExeMgr performs extent elimination and creates a job list.
Starting with MariaDB Enterprise ColumnStore 22.08, the ExeMgr facility has been integrated into the PrimProc process, so it is no longer a separate process.
ExeMgr performs multiple tasks:
Performs extent elimination.
Views the optimizer statistics.
Transforms the CSEP to a job list, which consists of job steps.
Assigns distributed job steps to the PrimProc process on each node.
Evaluates non-distributed job steps itself.
Provides final query results to ES.
When Enterprise ColumnStore executes a query, it goes through the following process:
The client or application sends the query to MariaDB MaxScale's listener port.
The query is processed by the Read/Write Split Router (readwritesplit) service associated with the listener.
The service routes the query to the ES TCP port on a ColumnStore node.
MariaDB Enterprise Server (ES) evaluates the query using the handler interface.
The handler interface builds a SELECT_LEX object to represent the query.
The handler interface provides the SELECT_LEX object to the ColumnStore storage engine's select handler.
The select handler transforms the SELECT_LEX object into a ColumnStore Execution Plan (CSEP).
The select handler provides the CSEP to the ExeMgr facility on the same node, which will act as the initiator/aggregator node for the query.
ExeMgr transforms the CSEP into a job list, which consists of job steps.
ExeMgr evaluates each job step sequentially.
If it is a non-distributed job step, ExeMgr evaluates the job step itself.
If it is a distributed job step, ExeMgr provides the job step to the PrimProc process on each node. The PrimProc process on each node evaluates the job step in a multi-threaded manner using a thread pool. After the PrimProc process on each node evaluates its job step, the results are returned to ExeMgr on the initiator/aggregator node as a Row Group.
After all job steps are evaluated, ExeMgr returns the results to ES.
ES returns the results to MaxScale.
MaxScale returns the results to the client or application.
This page is: Copyright © 2025 MariaDB. All rights reserved.
MariaDB Enterprise ColumnStore integrates with MariaDB Enterprise Server using the ColumnStore storage engine plugin. The ColumnStore storage engine plugin enables MariaDB Enterprise Server to interact with ColumnStore tables.
For deployment instructions and available documentation, see "MariaDB Enterprise ColumnStore".
Feature Summary
The ColumnStore storage engine has the following features:
Storage Engine: ColumnStore
ACID-compliant: Yes
Main Memory Caching: Yes
To create a ColumnStore table, use the CREATE TABLE statement with the ENGINE=ColumnStore option:
CREATE DATABASE columnstore_db;
CREATE TABLE columnstore_db.analytics_test (
id INT,
str VARCHAR(50)
) ENGINE = ColumnStore;
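As a small follow-up sketch, a few rows can be inserted and queried to confirm the table works as expected (cpimport is the preferred tool for bulk loads; plain INSERT statements are shown here only for brevity):
INSERT INTO columnstore_db.analytics_test (id, str) VALUES
  (1, 'alpha'),
  (2, 'beta');

SELECT str, COUNT(*) AS cnt
FROM columnstore_db.analytics_test
GROUP BY str;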
To deploy a multi-node Enterprise ColumnStore deployment, a configuration similar to below is required:
[mariadb]
log_error = mariadbd.err
character_set_server = utf8
collation_server = utf8_general_ci
log_bin = mariadb-bin
log_bin_index = mariadb-bin.index
relay_log = mariadb-relay
relay_log_index = mariadb-relay.index
log_slave_updates = ON
gtid_strict_mode = ON
# This must be unique on each cluster node
server_id = 1
To configure the mandatory utility user account, use the mcsSetConfig command:
$ sudo mcsSetConfig CrossEngineSupport Host 127.0.0.1
$ sudo mcsSetConfig CrossEngineSupport Port 3306
$ sudo mcsSetConfig CrossEngineSupport User cross_engine
$ sudo mcsSetConfig CrossEngineSupport Password cross_engine_passwd
This page is: Copyright © 2025 MariaDB. All rights reserved.
Managing MariaDB ColumnStore involves setup, configuration, and tools like mcsadmin and cpimport for efficient analytics.
The following outlines the minimum recommended server specifications for development and production environments; they can be followed for both on-premises and cloud deployments:
Physical Server: 8 Core CPU, 32 GB Memory (development); 64 Core CPU, 128 GB Memory (production)
Storage: Local disk (development); StorageManager (S3) (production)
Network Interconnect: In a multi-server deployment, data will be passed between nodes via TCP/IP networking. At least a 1 Gigabit network is recommended.
These are minimum recommendations and in general the system will perform better with more hardware:
More CPU cores and servers will improve query processing response time.
More memory will allow the system to cache more data blocks in memory. We have users running systems with anywhere from 64 GB to 512 GB of RAM for UM nodes and 32 to 64 GB of RAM for PM nodes.
Faster network will allow data to flow faster between UM and PM nodes.
SSDs may be used; however, the system is optimized for block streaming, which may perform well enough with HDDs at lower cost.
Where it is an option, it is recommended to use bare metal servers for additional performance since ColumnStore will fully consume CPU cores and memory.
In general it makes more sense to use a higher core count / higher memory server for single server or 2 server combined deployments.
In a deployment with multiple UM nodes, the system will round-robin requests from the mysqld handling the query to any ExeMgr in the cluster for load balancing. A higher-bandwidth network, such as 10 GbE or 40 GbE, will benefit queries with large result sets.
For AWS our own internal testing generally uses m4.4xlarge instance types as a cost effective middle ground. The R4.8xlarge has also been tested and performs about twice as fast for about twice the price.
This page is licensed: CC BY-SA / Gnu FDL
Managing MariaDB ColumnStore means deploying its architecture, scaling modules, and maintaining performance through monitoring, optimization, and backups.
MariaDB ColumnStore automatically creates logical horizontal partitions across every column. For ordered or semi-ordered data fields, such as an order date, this results in a highly effective partitioning scheme based on that column. This allows for increased performance of queries filtering on that column, since partition elimination can be performed. It also allows for data lifecycle management, as data can be disabled or dropped by partition cheaply. Caution should be used when disabling or dropping partitions; dropping in particular is destructive and cannot be reversed.
It is important to understand that a Partition in ColumnStore terms is actually 2 extents (16 million rows) and that extents & partitions are created according to the following algorithm in 1.0.x:
Create 4 extents in 4 files
When these are filled up (after 32M rows), create 4 more extents in the 4 files created in step 1.
When these are filled up (after 64M rows), create a new partition.
Information about all partitions for a given column can be retrieved using the calShowPartitions stored procedure, which takes two or three parameters: [database_name], table_name, and column_name. If only two parameters are provided, the current database is assumed. For example:
select calShowPartitions('orders','orderdate');
+-----------------------------------------+
| calShowPartitions('orders','orderdate') |
+-----------------------------------------+
| Part# Min Max Status
0.0.1 1992-01-01 1998-08-02 Enabled
0.1.2 1998-08-03 2004-05-15 Enabled
0.2.3 2004-05-16 2010-07-24 Enabled |
+-----------------------------------------+
1 row in set (0.05 sec)
The calDisablePartitions stored procedure allows for disabling of one or more partitions. A disabled partition still exists on the file system (and can be enabled again at a later time) but will not participate in any query, DML, or import activity. The procedure takes two or three parameters: [database_name], table_name, and partition_numbers separated by commas. If only two parameters are provided, the current database is assumed. For example:
select calDisablePartitions('orders','0.0.1');
+----------------------------------------+
| calDisablePartitions('orders','0.0.1') |
+----------------------------------------+
| Partitions are disabled successfully. |
+----------------------------------------+
1 row in set (0.28 sec)
The result showing the first partition has been disabled:
select calShowPartitions('orders','orderdate');
+-----------------------------------------+
| calShowPartitions('orders','orderdate') |
+-----------------------------------------+
| Part# Min Max Status
0.0.1 1992-01-01 1998-08-02 Disabled
0.1.2 1998-08-03 2004-05-15 Enabled
0.2.3 2004-05-16 2010-07-24 Enabled |
+-----------------------------------------+
1 row in set (0.05 sec)
The calEnablePartitions stored procedure allows for enabling of one or more partitions. The procedure takes the same set of parameters as calDisablePartitions. For example:
select calEnablePartitions('orders', '0.0.1');
+----------------------------------------+
| calEnablePartitions('orders', '0.0.1') |
+----------------------------------------+
| Partitions are enabled successfully. |
+----------------------------------------+
1 row in set (0.28 sec)
The result showing the first partition has been enabled:
select calShowPartitions('orders','orderdate');
+-----------------------------------------+
| calShowPartitions('orders','orderdate') |
+-----------------------------------------+
| Part# Min Max Status
0.0.1 1992-01-01 1998-08-02 Enabled
0.1.2 1998-08-03 2004-05-15 Enabled
0.2.3 2004-05-16 2010-07-24 Enabled |
+-----------------------------------------+
1 row in set (0.05 sec)
The calDropPartitions stored procedure allows for dropping of one or more partitions. Dropping means that the underlying storage is deleted and the partition is completely removed. A partition can be dropped from either enabled or disabled state. The procedure takes the same set of parameters as calDisablePartitions. Extra caution should be used with this procedure since it is destructive and cannot be reversed. For example:
select calDropPartitions('orders', '0.0.1');
+--------------------------------------+
| calDropPartitions('orders', '0.0.1') |
+--------------------------------------+
| Partitions are enabled successfully |
+--------------------------------------+
1 row in set (0.28 sec)
The result showing the first partition has been dropped:
select calShowPartitions('orders','orderdate');
+-----------------------------------------+
| calShowPartitions('orders','orderdate') |
+-----------------------------------------+
| Part# Min Max Status
0.1.2 1998-08-03 2004-05-15 Enabled
0.2.3 2004-05-16 2010-07-24 Enabled |
+-----------------------------------------+
1 row in set (0.05 sec)
Information about a range of partitions for a given column can be retrieved using the calShowPartitionsByValue stored procedure. This procedure takes four or five parameters: [database_name], table_name, column_name, start_value, and end_value. If only four parameters are provided, the current database is assumed. Only casual partition column types (INTEGER, DECIMAL, DATE, DATETIME, CHAR up to 8 bytes, and VARCHAR up to 7 bytes) are supported for this function.
The function returns a list of partitions whose minimum and maximum values for the column 'col_name' fall completely within the range of 'start_value' and 'end_value'. For example:
select calShowPartitionsByValue('orders','orderdate', '1992-01-01', '2010-07-24');
+----------------------------------------------------------------------------+
| calShowPartitionsbyvalue('orders','orderdate', '1992-01-02', '2010-07-24') |
+----------------------------------------------------------------------------+
| Part# Min Max Status
0.0.1 1992-01-01 1998-08-02 Enabled
0.1.2 1998-08-03 2004-05-15 Enabled
0.2.3 2004-05-16 2010-07-24 Enabled |
+----------------------------------------------------------------------------+
1 row in set (0.05 sec)
The calDisablePartitionsByValue stored procedure allows for disabling of one or more partitions by value. A disabled partition still exists on the file system (and can be enabled again at a later time) but will not participate in any query, DML or import activity. The procedure takes the same set of arguments as calShowPartitionsByValue. A good practice is to use calShowPartitionsByValue to identify the partitions to be disabled and then the same argument values used to construct the calDisablePartitionsByValue call. For example:
select calDisablePartitionsByValue('orders','orderdate', '1992-01-01', '1998-08-02');
+---------------------------------------------------------------------------------+
| caldisablepartitionsbyvalue ('orders', 'o_orderdate','1992-01-01','1998-08-02') |
+---------------------------------------------------------------------------------+
| Partitions are disabled successfully |
+---------------------------------------------------------------------------------+
1 row in set (0.28 sec)
The result showing the first partition has been disabled:
select calShowPartitionsByValue('orders','orderdate', '1992-01-01', '2010-07-24');
+----------------------------------------------------------------------------+
| calShowPartitionsbyvalue('orders','orderdate', '1992-01-02','2010-07-24' ) |
+----------------------------------------------------------------------------+
| Part# Min Max Status
0.0.1 1992-01-01 1998-08-02 Disabled
0.1.2 1998-08-03 2004-05-15 Enabled
0.2.3 2004-05-16 2010-07-24 Enabled |
+----------------------------------------------------------------------------+
1 row in set (0.05 sec)
The calEnablePartitionsByValue stored procedure allows for enabling of one or more partitions by value. The procedure takes the same set of arguments as calShowPartitionsByValue. A good practice is to use calShowPartitionsByValue to identify the partitions to be enabled and then use the same argument values to construct the calEnablePartitionsByValue call. For example:
select calEnablePartitionsByValue('orders','orderdate', '1992-01-01', '1998-08-02');
+--------------------------------------------------------------------------------+
| calenablepartitionsbyvalue ('orders', 'o_orderdate','1992-01-01','1998-08-02') |
+--------------------------------------------------------------------------------+
| Partitions are enabled successfully |
+--------------------------------------------------------------------------------+
1 row in set (0.28 sec)
The result showing the first partition has been enabled:
select calShowPartitionsByValue('orders','orderdate', '1992-01-01', '2010-07-24');
+----------------------------------------------------------------------------+
| calShowPartitionsbyvalue('orders','orderdate', '1992-01-02','2010-07-24' ) |
+----------------------------------------------------------------------------+
| Part# Min Max Status
0.0.1 1992-01-01 1998-08-02 Enabled
0.1.2 1998-08-03 2004-05-15 Enabled
0.2.3 2004-05-16 2010-07-24 Enabled |
+----------------------------------------------------------------------------+
1 row in set (0.05 sec)
The calDropPartitionsByValue stored procedure allows for dropping of one or more partitions by value. Dropping means that the underlying storage is deleted and the partition is completely removed. A partition can be dropped from either enabled or disabled state. The procedure takes the same set of arguments as calShowPartitionsByValue. A good practice is to use calShowPartitionsByValue to identify the partitions to be dropped and then use the same argument values to construct the calDropPartitionsByValue call. Extra caution should be used with this procedure since it is destructive and cannot be reversed. For example:
select calDropPartitionsByValue('orders','orderdate', '1992-01-01', '1998-08-02');
+------------------------------------------------------------------------------+
| caldroppartitionsbyvalue ('orders', 'o_orderdate','1992-01-01','1998-08-02') |
+------------------------------------------------------------------------------+
| Partitions are enabled successfully. |
+------------------------------------------------------------------------------+
1 row in set (0.28 sec)
The result showing the first partition has been dropped:
select calShowPartitionsByValue('orders','orderdate', '1992-01-01', '2010-07-24');
+----------------------------------------------------------------------------+
| calShowPartitionsbyvalue('orders','orderdate', '1992-01-02','2010-07-24' ) |
+----------------------------------------------------------------------------+
| Part# Min Max Status
0.1.2 1998-08-03 2004-05-15 Enabled
0.2.3 2004-05-16 2010-07-24 Enabled |
+----------------------------------------------------------------------------+
1 row in set (0.05 sec)
Since the partitioning scheme is system maintained the min and max values are not directly specified but influenced by the order of data loading. If the goal is to drop a specific date range then additional deletes are required to achieve this. The following cases may occur:
For semi-ordered data, there may be overlap between min and max values between partitions.
As in the example above, the partition ranged from 1992-01-01 to 1998-08-02. Potentially it may be desirable to drop the remaining 1998 rows.
A bulk delete statement can be used to delete the remaining rows that do not fall exactly within partition ranges. The partition drops will be fastest; however, the system optimizes bulk delete statements to delete by block internally, so they are still relatively fast.
delete from orders where orderdate <= '1998-12-31';
This page is licensed: CC BY-SA / Gnu FDL
MariaDB server includes an optional Audit Plugin that enables logging and tracking of all user access and statements. This is included and can be enabled for ColumnStore.
To enable the audit plugin for the currently running instance (but not across restarts), run the following as mcsmysql with the default root account:
INSTALL PLUGIN server_audit SONAME 'server_audit.so';
To have this persist across restarts edit the ColumnStore my.cnf file (example shown for root install):
$ vi /usr/local/mariadb/columnstore/mysql/my.cnf
[mysqld]
...
plugin_load=server_audit=server_audit.so
For more details, see the audit plugin installation guide.
To enable audit logging the following global variable must be set to ON:
SET GLOBAL server_audit_logging=ON;
To ensure this persists across restarts edit the ColumnStore my.cnf file (example shown for root install):
$ vi /usr/local/mariadb/columnstore/mysql/my.cnf
[server]
...
server_audit_logging=ON
This will enable logging to the file /usr/local/mariadb/columnstore/mysql/db/server_audit.log. For example:
20170914 17:31:24,centos,root,localhost,11,114,QUERY,loans,'SELECT grade, AVG(loan_amnt) avg, FROM loanstats GROUP BY grade ORDER BY grade',0
To have the log entries written to syslog the global variable server_audit_output_type should be changed from 'file' to 'syslog'. In this case the 'syslog_info' entry contains the ColumnStore server instance name, for example:
Sep 14 17:46:51 centos mysql-server_auditing: columnstore-1 centos,root,localhost,11,117,QUERY,loans,'SELECT grade, AVG(loan_amnt) avg,FROM loanstats GROUP BY grade ORDER BY grade',0
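A minimal sketch of switching the output type at runtime (the variable is provided by the server_audit plugin; add the same setting to the my.cnf file to persist it across restarts):
SET GLOBAL server_audit_output_type = 'syslog';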
For additional configuration and customization options see the Audit Plugin documentation.
This page is licensed: CC BY-SA / Gnu FDL
In the case where an entry in the MariaDB ColumnStore's configuration needs to be updated and distributed, this can be done from the command line from Performance Module #1. All changes made to MariaDB ColumnStore's configuration file need to be applied on PM1.
NOTE: 'mcsadmin distributeconfigfile' only needs to be run if the system is Active. If the system is down, then just make the change and when the system is started, the update will get distributed.
Here is an example:
/usr/local/mariadb/columnstore/bin/configxml.sh setconfig SystemConfig SystemName mcs-1
mcsadmin distributeconfigfile
In cases where MariaDB ColumnStore's configuration file gets out of sync with the PM1 copy, it can be redistributed to the other nodes.
To do this run the following on PM1:
mcsadmin distributeconfigfile
This page is licensed: CC BY-SA / Gnu FDL
When new PM nodes are added to a running instance it may be desirable to redistribute the data in current nodes across all of the nodes. This is not strictly required as ongoing data ingestion will prioritize the new empty nodes for data loading to rebalance the system.
An important point is that the operation works at Partition granularity, so a minimal data set is 64M rows in a table for this to run.
A command redistributeData is available in the admin console to initiate a data distribution:
mcsadmin> redistributeData start
redistributedata Tue Dec 13 04:42:31 2016
redistributeData START
Source dbroots: 1 2
Destination dbroots: 1 2
WriteEngineServer returned status 1: Cleared.
WriteEngineServer returned status 2: Redistribute is started.
The command has 3 possible options:
Start : start a new redistribution to redistribute data equally amongst the current set of DBRoots in the system.
Stop : abort a redistribution leaving the system in a usable state.
Status : return status information on an active redistribution.
The start command can take an option of Remove. The Start Remove option is used to remove all data from the enumerated list of dbroots and redistribute the data to the remaining dbroots. This should be done before taking a dbroot out of service in order to preserve the data on that dbroot. The dbroot list is a space delimited list of integers representing the dbroots to be emptied.
As mentioned above, the operation works at partition granularity, which means that a minimal move is 64 million rows. Any table smaller than that will not be redistributed and there may be as much as one full Partition difference in the resulting balance. The redistribute logic does not currently consolidate individually deleted records.
Redistribute can take a long time. During this time, it is required that all data manipulation including bulk inserts are suspended. SuspendDatabaseWrites must be called before redistributedata and ResumeDatabaseWrites should be called when the redistribution is complete.
If "redistributeData stop" is called, all processing stops where it's at, but in a usable state. "redistributeData status" can be used to see how much has been done. A further "redistributeData start" will start over using the new state of the system. This may lead to a less optimal distribution, so stop-start sequences aren't recommended.
While the system is working, "redistributeData status" can be called to see what's happening. A -r option can be used on the status command line to repeat the call and act as a monitor.
To see how much data resides on any given DBRoot for a table, you can use a query like:
select count(*) from <table> where idbdbroot(<any column>)=<dbrootnum>;
This page is licensed: CC BY-SA / Gnu FDL
ColumnStore is designed to be somewhat self-managing and self-healing. The following two processes help achieve this:
ProcMon runs on each node and is responsible for ensuring that the other required ColumnStore processes are started and automatically restarted as appropriate on that server. This in turn is started and monitored by the run.sh shell script which ensures it is restarted should it be killed. The run.sh script is invoked and automatically started by the columnstore systemd service at bootup time. This can also be utilized to restart the service on an individual node though generally it is preferred to use the mcsadmin stop, shutdown, and start commands from the PM1 node.
ProcMgr runs on each PM node with only one taking an active role at a time, the others remaining in warm standby mode. This process manager is responsible for overall system health, resource monitoring, and PM node failover management.
To provide additional monitoring guarantees, an external monitoring tool should monitor the health of these three processes (run.sh, ProcMon, and ProcMgr), and potentially all ColumnStore processes. If the run.sh process fails, the system is at risk of not being able to self-heal.
A number of system configuration variables exist to allow fine tuning of the system monitoring capabilities. In general the default values will work relatively well for many cases.
The configuration parameters are maintained in the /usr/local/mariadb/columnstore/etc/Columnstore.xml file. In a multiple-server deployment, these should only be edited on the PM1 server, as changes will be automatically replicated to the other servers by the system. A system restart will be required for the configuration change to take effect.
Convenience utility programs getConfig and setConfig are available to safely update the Columnstore.xml without needing to be comfortable with editing XML files. The -h argument will display usage information. The section value will be SystemConfig for all settings in this document. For example:
# ./setConfig SystemConfig ModuleHeartbeatPeriod 5
# ./getConfig SystemConfig ModuleHeartbeatPeriod
5
Heartbeat monitoring occurs between modules (both UM and PM) to determine the module is up and functioning. The module heartbeat settings are the same for all modules.
ModuleHeartbeatPeriod refers to how often the heartbeat test is performed. For example, if you set the period to 5, then the heartbeat test is performed every 5 seconds. The initial default value is 1. To disable heartbeat monitoring set the value to -1.
ModuleHeartbeatCount refers to how many failures in a row must take place before a fault is processed. The initial default value is 3.
Thresholds can be set to trigger a local alert when file system usage crosses a specified percentage of a file system on a server. Critical, Major, or Minor thresholds can be set for the disk usage of each server. However, it is recommended to use an external system monitoring tool configured to monitor free disk space for proactive external alerting or paging. Actual ColumnStore data is stored within the data directories of the installation, and MariaDB db files are stored under the mysql/db directory.
ExternalMinorThreshold - Percentage threshold for when a minor local alarm is triggered. Default value is 70.
ExternalMajorThreshold - Percentage threshold for when a major local alarm is triggered. Default value is 80.
ExternalCriticalThreshold - Percentage threshold for when a critical local alarm is triggered. Default value is 90.
The value is a numeric percentage between 0 and 100. To disable a particular threshold alarm, set its value to 0.
A couple of mcsadmin commands provide convenience functions for monitoring memory utilization across nodes. getSystemMemory returns server level memory statistics and getSystemMemoryUsers shows the top 5 processes by server. The following examples are for a 2 server combined setup:
mcsadmin> getSystemMemory
getsystemmemory Tue Nov 29 11:14:21 2016
System Memory Usage per Module (in K bytes)
Module Mem Total Mem Used Cache Mem Usage % Swap Total Swap Used Swap Usage %
------ --------- -------- ------- ----------- ---------- --------- ------------
pm1 7979488 1014772 6438432 12 3145724 0 0
pm2 3850724 632712 1134324 16 3145724 0 0
mcsadmin> getSystemMemoryUsers
getsystemmemoryusers Tue Nov 29 11:41:10 2016
System Process Top Memory Users per Module
Module 'pm1' Top Memory Users (in bytes)
Process Memory Used Memory Usage %
----------------- ----------- --------------
mysqld 19621 3
PrimProc 18990 3
gnome-shell 10192 2
systemd-journald 4236 1
DDLProc 3004 1
Module 'pm2' Top Memory Users (in bytes)
Process Memory Used Memory Usage %
----------------- ----------- --------------
mysqld 19046 5
PrimProc 18891 5
ProcMon 2343 1
workernode 1806 1
WriteEngineServ 1507 1
To view the storage configuration, use the getStorageConfig command in mcsadmin, or simply use mcsadmin getStorageConfig from the operating system prompt. This will provide information on DBRoots and which PM they are assigned to, if any.
Example:
# mcsadmin getstorageconfig Wed Mar 28 10:40:34 2016
System Storage Configuration
Storage Type = internal
System DBRoot count = 6
DBRoot IDs assigned to 'pm1' = 1
DBRoot IDs assigned to 'pm2' = 2
DBRoot IDs assigned to 'pm3' = 3
DBRoot IDs assigned to 'pm4' = 4
DBRoot IDs assigned to 'pm5' = 5
DBRoot IDs assigned to 'pm6' = 6
An internal alarm system is used to keep track of internal notable events as a convenience or reference point. It is recommended to use a dedicated system monitoring tool for more proactive alerting of critical CPU, memory, or disk utilization issues for each of the servers.
Alarms are logged to the /var/log/mariadb/columnstore/alarm.log file and a summary is displayed in mcsadmin. The getActiveAlarms command in mcsadmin can be used to retrieve current alarm conditions.
For each module (PM and UM), the following resource monitoring parameters can be configured:
CPU thresholds: setModuleTypeConfig (module name) ModuleCPU(Clear/Minor/Major/Critical)Threshold n (where n = percentage of CPU usage)
Disk file system use threshold: setModuleTypeConfig (module name) ModuleDisk(Minor/Major/Critical)Threshold n (where n = percentage of disk system used)
Module swap thresholds: setModuleTypeConfig (module name) ModuleSwap(Minor/Major/Critical)Threshold n (where n = percentage of swap space used)
For an alarm, a threshold can be set for how many times the alarm can be triggered in 30 minutes. The default threshold is 100.
setAlarmConfig (alarmID#) Threshold n (where n = maximum number of times an alarm can be triggered in 30 minutes)
Example to change Alarm ID 22's threshold to 50:
# mcsadmin setAlarmConfig 22 Threshold 50
The resetAlarm command is used to clear and acknowledge the issue is resolved. The resetAlarm command can be invoked with the argument ALL to clear all outstanding local alarms.
ColumnStore by default has behavior that will restart a server should swap space utilization exceed the configured module swap major threshold (default is 80%). At this point the system will likely be near unusable and so this is an attempt to recover from very large queries or data loads. The behavior of this is configured by the SystemConfig section configuration variable SwapAction which contains the oam command to be run if the threshold is exceeded. The default value is 'restartSystem' but it can be set to 'none' to disable this behavior. The fact that this has happened can be determined by the following log entry:
Nov 01 11:23:13 [ServerMonitor] 13.306324 |0|0|0| C 09 CAL0000: Swap Space usage over Major threashold, perform OAM command restartSystem
There are five levels of logging in MariaDB ColumnStore.
Critical
Error
Warning
Info
Debug
Application log files are written to /var/log/mariadb/columnstore on each server and log rotation / archiving is configured to manage these automatically.
To get details about current logging configuration:
# mcsadmin getlogconfig
getlogconfig Wed Oct 19 06:58:47 2016
MariaDB Columnstore System Log Configuration Data
System Logging Configuration File being used: /etc/rsyslog.d/49-columnstore.conf
Module Configured Log Levels
------ ---------------------------------------
pm1 Critical Error Warning Info
The system logging configuration file referenced is a standard syslog configuration file and may be edited to enable and or disable specific levels, for example to disable debug logging and to only log at the specific level in each file:
# cat /etc/rsyslog.d/49-columnstore.conf
# MariaDb Columnstore Database Platform Logging
local1.=crit -/var/log/mariadb/columnstore/crit.log
local1.=err -/var/log/mariadb/columnstore/err.log
local1.=warning -/var/log/mariadb/columnstore/warning.log
local1.=info -/var/log/mariadb/columnstore/info.log
After making changes to this restart the syslog process, e.g:
# systemctl restart rsyslog
Log rotation and archiving are also configured by the installer and the settings for this may be found and managed similarly in the file /etc/logrotate.d/columnstore. If the current log files are manually deleted restart the syslog process to resume logging.
This page is licensed: CC BY-SA / Gnu FDL
MariaDB ColumnStore backup and restore manage distributed data using snapshots or tools like mariadb-backup, with restoration ensuring cluster sync via cpimport or file system recovery.
MariaDB Enterprise ColumnStore supports backup and restore.
Before you determine a backup strategy for your Enterprise ColumnStore deployment, it is a good idea to determine the system of record for your Enterprise ColumnStore data.
A system of record is the authoritative data source for a given piece of information. Organizations often store duplicate information in several systems, but only a single system can be the authoritative data source.
Enterprise ColumnStore is designed to handle analytical processing for OLAP, data warehousing, DSS, and hybrid workloads on very large data sets. Analytical processing does not generally happen on the system of record. Instead, analytical processing generally occurs on a specialized database that is loaded with data from the separate system of record. Additionally, very large data sets can be difficult to back up. Therefore, it may be beneficial to back up only the system of record.
If Enterprise ColumnStore is not acting as the system of record for your data, you should determine how the system of record affects your backup plan:
If your system of record is another database server, you should ensure that the other database server is properly backed up and that your organization has procedures to reload Enterprise ColumnStore from the other database server.
If your system of record is a set of data files, you should ensure that the set of data files is properly backed up and that your organization has procedures to reload Enterprise ColumnStore from the set of data files.
MariaDB Enterprise ColumnStore supports full backup and restore for all storage types. A full backup includes:
Enterprise ColumnStore's data and metadata
With S3: an S3 snapshot of the S3-compatible object storage and a file system snapshot or copy of the Storage Manager directory.
Without S3: a file system snapshot or copy of the DB Root directories.
The MariaDB data directory from the primary node
To see the procedure to perform a full backup and restore, choose the storage type:
This page is: Copyright © 2025 MariaDB. All rights reserved.
MariaDB Enterprise ColumnStore supports backup and restore. If Enterprise ColumnStore uses S3-compatible object storage for data and shared local storage for the Storage Manager directory, the S3 bucket, the Storage Manager directory, and the MariaDB data directory must be backed up separately.
MariaDB Enterprise ColumnStore supports multiple storage options.
This page discusses how to backup and restore Enterprise ColumnStore when it uses S3-compatible object storage for data and shared local storage (such as NFS) for the Storage Manager directory.
Any file can become corrupt due to hardware issues, crashes, power loss, and other reasons. If the Enterprise ColumnStore data or metadata were to become corrupt, Enterprise ColumnStore could become unusable, and data loss could occur.
If Enterprise ColumnStore is your system of record, it should be backed up regularly.
If Enterprise ColumnStore uses S3-compatible object storage for data and shared local storage for the Storage Manager directory, the following items must be backed up:
The MariaDB Data directory is backed up using mariadb-backup.
The S3 bucket must be backed up using the vendor's snapshot procedure.
The Storage Manager directory must be backed up.
See the instructions below for more details.
Use the following process to take a backup:
Determine which node is the primary server using curl to send the status command to the CMAPI Server:
$ curl -k -s https://mcs1:8640/cmapi/0.4.0/cluster/status \
--header 'Content-Type:application/json' \
--header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
| jq .
The output will show "dbrm_mode": "master" for the primary server:
{
"timestamp": "2020-12-15 00:40:34.353574",
"192.0.2.1": {
"timestamp": "2020-12-15 00:40:34.362374",
"uptime": 11467,
"dbrm_mode": "master",
"cluster_mode": "readwrite",
"dbroots": [
"1"
],
"module_id": 1,
"services": [
{
"name": "workernode",
"pid": 19202
},
{
"name": "controllernode",
"pid": 19232
},
{
"name": "PrimProc",
"pid": 19254
},
{
"name": "ExeMgr",
"pid": 19292
},
{
"name": "WriteEngine",
"pid": 19316
},
{
"name": "DMLProc",
"pid": 19332
},
{
"name": "DDLProc",
"pid": 19366
}
]
    }
}
Connect to the primary server using MariaDB Client as a user account that has privileges to lock the database:
$ mariadb --host=192.0.2.1 \
--user=root \
--password
Lock the database with the FLUSH TABLES WITH READ LOCK statement:
FLUSH TABLES WITH READ LOCK;
Ensure that the client remains connected to the primary server, so that the lock is held for the remaining steps.
Make a copy or snapshot of the Storage Manager directory. By default, it is located at /var/lib/columnstore/storagemanager.
For example, to make a copy of the directory with rsync:
$ sudo mkdir -p /backups/columnstore/202101291600/
$ sudo rsync -av /var/lib/columnstore/storagemanager /backups/columnstore/202101291600/
Use MariaDB Backup to back up the MariaDB data directory:
$ sudo mkdir -p /backups/mariadb/202101291600/
$ sudo mariadb-backup --backup \
--target-dir=/backups/mariadb/202101291600/ \
--user=mariadb-backup \
--password=mbu_passwd
Use MariaDB Backup to prepare the backup:
$ sudo mariadb-backup --prepare \
--target-dir=/backups/mariadb/202101291600/
Create a snapshot of the S3-compatible storage. Consult the storage vendor's manual for details on how to do this.
Ensure that all previous operations are complete.
In the original client connection to the primary server, unlock the database with the UNLOCK TABLES statement:
UNLOCK TABLES;
Use the following process to restore a backup:
Deploy Enterprise ColumnStore, so that you can restore the backup to an empty deployment.
Ensure that all services are stopped on each node:
$ sudo systemctl stop mariadb-columnstore-cmapi
$ sudo systemctl stop mariadb-columnstore
$ sudo systemctl stop mariadb
Restore the backup of the Storage Manager directory. By default, it is located at /var/lib/columnstore/storagemanager.
For example, to restore the backup with rsync:
$ sudo rsync -av /backups/columnstore/202101291600/storagemanager/ /var/lib/columnstore/storagemanager/
$ sudo chown -R mysql:mysql /var/lib/columnstore/storagemanager
Use MariaDB Backup to restore the backup of the MariaDB data directory:
$ sudo mariadb-backup --copy-back \
--target-dir=/backups/mariadb/202101291600/
$ sudo chown -R mysql:mysql /var/lib/mysql
Restore the snapshot of your S3-compatible storage to the new S3 bucket that you plan to use. Consult the storage vendor's manual for details on how to do this.
Update storagemanager.cnf to configure Enterprise ColumnStore to use the S3 bucket. By default, it is located at /etc/columnstore/storagemanager.cnf.
For example:
[ObjectStorage]
…
service = S3
…
[S3]
bucket = your_columnstore_bucket_name
endpoint = your_s3_endpoint
aws_access_key_id = your_s3_access_key_id
aws_secret_access_key = your_s3_secret_key
# iam_role_name = your_iam_role
# sts_region = your_sts_region
# sts_endpoint = your_sts_endpoint
[Cache]
cache_size = your_local_cache_size
path = your_local_cache_path
The default local cache size is 2 GB.
The default local cache path is /var/lib/columnstore/storagemanager/cache.
Ensure that the local cache path has sufficient storage space to store the local cache.
The bucket option must be set to the name of the bucket that you created from your snapshot in the previous step.
To use an IAM role, you must also uncomment and set iam_role_name, sts_region, and sts_endpoint.
Start the services on each node:
$ sudo systemctl start mariadb
$ sudo systemctl start mariadb-columnstore-cmapi
This page is: Copyright © 2025 MariaDB. All rights reserved.
MariaDB Enterprise ColumnStore supports backup and restore. If Enterprise ColumnStore uses shared local storage for the DB Root directories, the DB Root directories and the MariaDB data directory must be backed up separately.
MariaDB Enterprise ColumnStore supports multiple storage options.
This page discusses how to backup and restore Enterprise ColumnStore when it uses shared local storage (such as NFS) for the DB Root directories.
Any file can become corrupt due to hardware issues, crashes, power loss, and other reasons. If the Enterprise ColumnStore data or metadata were to become corrupt, Enterprise ColumnStore could become unusable, and data loss could occur.
If Enterprise ColumnStore is your system of record, it should be backed up regularly.
If Enterprise ColumnStore uses shared local storage for the DB Root directories, the following items must be backed up:
The MariaDB Data directory is backed up using MariaDB Backup
The Storage Manager directory must be backed up
Each DB Root directory must be backed up
See the instructions below for more details.
Use the following process to take a backup:
Determine which node is the primary server using curl to send the status command to the CMAPI Server:
$ curl -k -s https://mcs1:8640/cmapi/0.4.0/cluster/status \
--header 'Content-Type:application/json' \
--header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
| jq .
The output will show "dbrm_mode": "master" for the primary server:
{
"timestamp": "2020-12-15 00:40:34.353574",
"192.0.2.1": {
"timestamp": "2020-12-15 00:40:34.362374",
"uptime": 11467,
"dbrm_mode": "master",
"cluster_mode": "readwrite",
"dbroots": [
"1"
],
"module_id": 1,
"services": [
{
"name": "workernode",
"pid": 19202
},
{
"name": "controllernode",
"pid": 19232
},
{
"name": "PrimProc",
"pid": 19254
},
{
"name": "ExeMgr",
"pid": 19292
},
{
"name": "WriteEngine",
"pid": 19316
},
{
"name": "DMLProc",
"pid": 19332
},
{
"name": "DDLProc",
"pid": 19366
}
]
    }
}
Connect to the primary server using MariaDB Client as a user account that has privileges to lock the database:
$ mariadb --host=192.0.2.1 \
--user=root \
--password
Lock the database with the FLUSH TABLES WITH READ LOCK statement:
FLUSH TABLES WITH READ LOCK;
Ensure that the client remains connected to the primary server, so that the lock is held for the remaining steps.
Make a copy or snapshot of the Storage Manager directory. By default, it is located at /var/lib/columnstore/storagemanager.
For example, to make a copy of the directory with rsync:
$ sudo mkdir -p /backups/columnstore/202101291600/
$ sudo rsync -av /var/lib/columnstore/storagemanager /backups/columnstore/202101291600/
Make a copy or snapshot of the DB Root directories. By default, they are located at /var/lib/columnstore/dataN, where the N in dataN represents a range of integers that starts at 1 and stops at the number of nodes in the deployment.
For example, to make a copy of the directories with rsync in a 3-node deployment:
$ sudo rsync -av /var/lib/columnstore/data1 /backups/columnstore/202101291600/
$ sudo rsync -av /var/lib/columnstore/data2 /backups/columnstore/202101291600/
$ sudo rsync -av /var/lib/columnstore/data3 /backups/columnstore/202101291600/
Use MariaDB Backup to back up the MariaDB data directory:
$ sudo mkdir -p /backups/mariadb/202101291600/
$ sudo mariadb-backup --backup \
--target-dir=/backups/mariadb/202101291600/ \
--user=mariadb-backup \
--password=mbu_passwd
Use MariaDB Backup to prepare the backup:
$ sudo mariadb-backup --prepare \
--target-dir=/backups/mariadb/202101291600/
Ensure that all previous operations are complete.
In the original client connection to the primary server, unlock the database with the UNLOCK TABLES statement:
UNLOCK TABLES;
Use the following process to restore a backup:
Deploy Enterprise ColumnStore, so that you can restore the backup to an empty deployment.
Ensure that all services are stopped on each node:
$ sudo systemctl stop mariadb-columnstore-cmapi
$ sudo systemctl stop mariadb-columnstore
$ sudo systemctl stop mariadb
Restore the backup of the Storage Manager directory. By default, it is located at /var/lib/columnstore/storagemanager.
For example, to restore the backup with rsync:
$ sudo rsync -av /backups/columnstore/202101291600/storagemanager/ /var/lib/columnstore/storagemanager/
$ sudo chown -R mysql:mysql /var/lib/columnstore/storagemanager
Restore the backup of the DB Root directories. By default, they are located at /var/lib/columnstore/dataN, where the N in dataN represents a range of integers that starts at 1 and stops at the number of nodes in the deployment.
For example, to restore the backup with rsync in a 3-node deployment:
$ sudo rsync -av /backups/columnstore/202101291600/data1/ /var/lib/columnstore/data1/
$ sudo rsync -av /backups/columnstore/202101291600/data2/ /var/lib/columnstore/data2/
$ sudo rsync -av /backups/columnstore/202101291600/data3/ /var/lib/columnstore/data3/
$ sudo chown -R mysql:mysql /var/lib/columnstore/data1
$ sudo chown -R mysql:mysql /var/lib/columnstore/data2
$ sudo chown -R mysql:mysql /var/lib/columnstore/data3
Use MariaDB Backup to restore the backup of the MariaDB data directory:
$ sudo mariadb-backup --copy-back \
--target-dir=/backups/mariadb/202101291600/
$ sudo chown -R mysql:mysql /var/lib/mysql
Start the services on each node:
$ sudo systemctl start mariadb
$ sudo systemctl start mariadb-columnstore-cmapi
This page is: Copyright © 2025 MariaDB. All rights reserved.
MariaDB ColumnStore utilizes an Extent Map to manage data distribution across extents—logical blocks within physical segment files ranging from 8 to 64 MB. Each extent holds a consistent number of rows, with the Extent Map cataloging these extents, their corresponding block identifiers (LBIDs), and the minimum and maximum values for each column's data within the extent.
The primary node maintains the master copy of the Extent Map. Upon system startup, this map is loaded into memory and propagated to other nodes for redundancy and quick access. Corruption of the master Extent Map can render the system unusable and lead to data loss.
ColumnStore's extent map is a smart structure that underpins its performance. By providing a logical partitioning scheme, it avoids the overhead associated with indexing and other common row-based database optimizations.
The primary node in a ColumnStore cluster holds the master copy of the extent map. Upon system startup, this master copy is read into memory and then replicated to all other participating nodes for high availability and disaster recovery. Nodes keep the extent map in memory for rapid access during query processing. As data within extents is modified, these updates are broadcast to all participating nodes to maintain consistency.
If the master copy of the extent map becomes corrupted, the entire system could become unusable, potentially leading to data loss. Having a recent backup of the extent map allows for a much faster recovery compared to reloading the entire database in such a scenario.
To safeguard against potential Extent Map corruption, regularly back up the master copy:
Lock Table:
mariadb -e "FLUSH TABLES WITH READ LOCK;"
Save BRM:
save_brm
Create Backup Directory:
mkdir -p /extent_map_backup
Copy Extent Map:
cp -f /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_em /extent_map_backup
Unlock Tables:
mariadb -e "UNLOCK TABLES;"
Stop ColumnStore:
systemctl stop mariadb-columnstore
Rename Corrupted Map:
mv /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_em /tmp/BRM_saves_em.bad
Clear Versioning Files:
> /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_vbbm
> /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_vss
Restore Backup:
cp -f /extent_map_backup/BRM_saves_em /var/lib/columnstore/data1/systemFiles/dbrm/
Set Ownership:
chown -R mysql:mysql /var/lib/columnstore/data1/systemFiles/dbrm/
Start ColumnStore:
systemctl start mariadb-columnstore
Shutdown Cluster:
curl -s -X PUT https://127.0.0.1:8640/cmapi/0.4.0/cluster/shutdown \
  --header 'Content-Type:application/json' \
  --header 'x-api-key:your_api_key' \
  --data '{"timeout":60}' -k
Rename Corrupted Map:
mv /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_em /tmp/BRM_saves_em.bad
Clear Versioning Files:
> /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_vbbm
> /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_vss
Restore Backup:
cp -f /extent_map_backup/BRM_saves_em /var/lib/columnstore/data1/systemFiles/dbrm/
Set Ownership:
chown -R mysql:mysql /var/lib/columnstore/data1/systemFiles/dbrm
Start Cluster:
curl -s -X PUT https://127.0.0.1:8640/cmapi/0.4.0/cluster/start \
  --header 'Content-Type:application/json' \
  --header 'x-api-key:your_api_key' \
  --data '{"timeout":60}' -k
Incorporate the save_brm command into your data import scripts (e.g., those using cpimport) to automate Extent Map backups. This practice ensures regular backups without manual intervention. Refer to the MariaDB ColumnStore Backup Script for an example implementation.
This page is licensed: CC BY-SA / Gnu FDL
MariaDB ColumnStore uses MariaDB Server’s security—encryption, access control, auditing, and firewall—for secure analytics.
This page is about security vulnerabilities that have been fixed for or still affect MariaDB ColumnStore. In addition, links are included to fixed security vulnerabilities in MariaDB Server since MariaDB ColumnStore is based on MariaDB Server.
Sensitive security issues can be sent directly to the persons responsible for MariaDB security: security [AT] mariadb (dot) org.
CVE® stands for "Common Vulnerabilities and Exposures". It is a publicly available and free-to-use database of known software vulnerabilities maintained at cve.org.
The appropriate release notes listed here document CVEs fixed within a given release. Additional information can also be found at Security Vulnerabilities Fixed in MariaDB.
There are no known CVEs on ColumnStore-specific infrastructure outside of the MariaDB server at this time.
This page is licensed: CC BY-SA / Gnu FDL
Starting with MariaDB Enterprise ColumnStore 6.2.3, ColumnStore supports encryption for user passwords stored in Columnstore.xml:
Encryption keys are created with the cskeys utility
Passwords are encrypted using the cspasswd utility
MariaDB Enterprise ColumnStore 6
MariaDB Enterprise ColumnStore 22.08
MariaDB Enterprise ColumnStore 23.02
MariaDB Enterprise ColumnStore stores its password encryption keys in the plain-text file /var/lib/columnstore/.secrets.
The encryption keys are not created by default, but can be generated by executing the cskeys utility:
$ cskeys
In a multi-node Enterprise ColumnStore cluster, every ColumnStore node should have the same encryption keys. Therefore, it is recommended to execute cskeys on the primary server, and then copy /var/lib/columnstore/.secrets to every other ColumnStore node and fix the file's permissions:
$ scp 192.0.2.1:/var/lib/columnstore/.secrets /var/lib/columnstore/.secrets
$ sudo chown mysql:mysql /var/lib/columnstore/.secrets
$ sudo chmod 0400 /var/lib/columnstore/.secrets
To encrypt a password:
Generate an encrypted password using the cspasswd utility:
$ cspasswd util_user_passwd
If the --interactive command-line option is specified, cspasswd prompts for the password.
Set the encrypted password in Columnstore.xml using the mcsSetConfig utility:
$ sudo mcsSetConfig CrossEngineSupport Password util_user_encrypted_passwd
To decrypt a password, execute the cspasswd utility and specify the --decrypt command-line option:
$ cspasswd --decrypt util_user_encrypted_passwd
This page is: Copyright © 2025 MariaDB. All rights reserved.
MariaDB ColumnStore is ideal for real-time analytics and complex queries on large datasets across industries.
MariaDB ColumnStore is a columnar storage engine that utilizes a massively parallel distributed data architecture. It's a columnar storage system built by porting InfiniDB 4.6.7 to MariaDB and released under the GPL license.
Starting with MariaDB 10.5.4, ColumnStore is available as a storage engine for MariaDB Server. Before then, it was available only as a separate download.
It is designed for big data scaling to process petabytes of data, linear scalability, and exceptional performance with real-time response to analytical queries. It leverages the I/O benefits of columnar storage, compression, just-in-time projection, and horizontal and vertical partitioning to deliver tremendous performance when analyzing large data sets.
Links:
A Google Group exists for MariaDB ColumnStore that can be used to discuss ideas and issues and communicate with the community: Send email to mariadb-columnstore@googlegroups.com or use the forum interface
Bugs can be reported in MariaDB Jira: jira.mariadb.org (see Reporting Bugs). Please file bugs under the MCOL project and include the output from the support utility if possible.
MariaDB ColumnStore is released under the GPL license.
This page is licensed: CC BY-SA / Gnu FDL
MariaDB ColumnStore ensures high availability with multi-node setups and shared storage, while MaxScale adds monitoring and failover for continuous analytics.
A number of system configuration variables exist to allow fine tuning of the system to suit the physical hardware and query characteristics. In general the default values will work relatively well for many cases.
The configuration parameters are maintained in the /usr/local/mariadb/columnstore/etc/Columnstore.xml file. In a multiple-server deployment, these should only be edited on the PM1 server, as the changes are automatically replicated to the other servers by the system. A system restart is required for a configuration change to take effect.
Convenience utility programs getConfig and setConfig are available to safely update the Columnstore.xml without needing to be comfortable with editing XML files. The -h argument will display usage information.
The NumBlocksPct configuration parameter specifies the percentage of physical memory to utilize for disk block caching. For a Single Server or Combined Multi-Server deployment the default value is 50, to ensure enough physical memory for the UM, and for a non-combined multi-server deployment the default value is 70.
The TotalUmMemory configuration parameter specifies the percentage of physical memory to utilize for joins, intermediate results and set operations on the UM. This specifies an upper limit for small table results in joins rather than a pre-allocation of memory.
In a single server or combined deployment, the sum of NumBlocksPct and TotalUmMemory should typically not exceed 75% of physical memory. With very large memory servers this could be raised but the key point is to leave enough memory for other processes including mysqld.
With version 1.2.2 onwards these can be set to static numeric limits instead of percentages by entering a number with 'M' or 'G' at the end to signify MiB or GiB.
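As a sketch, these parameters could be inspected and adjusted with the convenience utilities described above. The section names DBBC and HashJoin are assumptions here; verify them against your Columnstore.xml before applying changes, and remember that a restart is required for them to take effect:
getConfig DBBC NumBlocksPct
setConfig DBBC NumBlocksPct 70
setConfig HashJoin TotalUmMemory 4G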
ColumnStore handles concurrent query execution by managing the rate of concurrent batch primitive steps from the UM to the PM. This is configured using the MaxOutstandingRequests parameter and has a default value of 20. Each batch primitive step is executed within the context of 1 extent column according to this high level process:
The UM issues up to MaxOutstandingRequests number of batch primitive steps.
The PM processes the request using many threads and returns its response. These generally take a fraction of a second up to a low number of seconds depending on the amount of Physical I/O and the performance of that storage.
The UM will issue new requests as prior requests complete maintaining the maximum number of outstanding requests.
This scheme allows for large queries to use all available resources when not otherwise being consumed and for smaller queries to execute with minimal delay. Lower values optimize for higher throughput of smaller queries while a larger value optimizes for response time of that query. The default value should work well under most circumstances however the value should be increased as the number of PM nodes is increased.
The number of queries currently running and the number of queries waiting in the queue can be checked with:
select calgetsqlcount();
ColumnStore maintains statistics for tables and utilizes them to determine which is the larger table of the two. This is based both on the number of blocks in each table and an estimation of the predicate cardinality. The first step is to apply any filters as appropriate to the smaller table and return this data set to the UM. The size of this data set is compared against the configuration parameter PmMaxMemorySmallSide, which has a default value of 64 (MB). This value can be set all the way up to 4GB. The default allows for approximately 1M rows on the small table side to be joined against billions (or trillions) on the large table side. If the size of the small data set is less than PmMaxMemorySmallSide, the data set will be sent to the PM for creation of a distributed hashmap; otherwise it is created on the UM. Thus this setting is important to the tuning of joins and determines whether the operation can be distributed or not. It should be set to support your largest expected small-table join size, up to available memory:
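For example, a hedged sketch of raising the small-side limit with setConfig, assuming the parameter lives in the HashJoin section of Columnstore.xml (verify with getConfig on your deployment):
setConfig HashJoin PmMaxMemorySmallSide 1G
As noted above, a system restart is required for the change to take effect.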
Although this will increase the size of data sent from the UM to PM to support the join, it means that the join and subsequent aggregates are pushed down, scaled out, and a smaller data set is returned back to the UM.
In a multiple PM deployment, the sizing should be based from available physical memory on the PM servers, how much memory to reserve for block caching, and the number of simultaneous join operations that can be expected to run times the average small table join data size.
The above logic for a single table join extrapolates out to multi-table joins, where the small table values are precalculated and performed as one single scan against the large table. This works well for the typical star schema case joining multiple dimension tables with a large fact table. For some join scenarios it may be necessary to sequence joins to create the intermediate datasets for joining; this would happen, for instance, with a snowflake schema structure. In some extreme cases it may be hard for the optimizer to determine the most optimal join path. In this case a hint is available to force a join ordering. The INFINIDB_ORDERED hint will force the first table in the FROM clause to be considered the largest table and override any statistics-based decision, for example:
select /*! INFINIDB_ORDERED */ r_regionkey
from region r, customer c, nation n
where r.r_regionkey = n.n_regionkey
and n.n_nationkey = c.c_nationkey
Note: INFINIDB_ORDERED is deprecated and no longer works in ColumnStore 1.2 and above. Use SET infinidb_ordered_only=ON; instead, and for ColumnStore 1.4 use SET columnstore_ordered_only=ON;.
When a join is very large and exceeds the PmMaxMemorySmallSide setting it is performed in memory in the UM server. For very large joins this could exceed the available memory in which case this is detected and a query error reported. A number of configuration parameters are available to enable and configure usage of disk overflow should this occur:
AllowDiskBasedJoin – Controls the option to use disk Based joins or not. Valid values are Y (enabled) or N (disabled). By default, this option is disabled.
TempFileCompression – Controls whether the disk join files are compressed or non-compressed. Valid values are Y (use compressed files) or N (use non-compressed files).
TempFilePath – The directory path used for the disk joins. By default, this path is the tmp directory for your installation (i.e., /usr/local/mariadb/columnstore/tmp). Files (named infinidb-join-data*) in this directory will be created and cleaned on an as-needed basis. The entire directory is removed and recreated by ExeMgr at startup. It is strongly recommended that this directory is stored on a dedicated partition.
A mariadb global or session variable is available to specify a memory limit at which point the query is switched over to disk based joins:
infinidb_um_mem_limit – Memory limit in MB per user (i.e., switch to a disk-based join if this limit is exceeded). By default, this limit is not set (value of 0).
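As an illustrative sketch only, disk-based joins might be enabled and a per-user memory limit set as follows. The HashJoin section name for these parameters is an assumption to verify against your Columnstore.xml, and the limit value is arbitrary:
setConfig HashJoin AllowDiskBasedJoin Y
setConfig HashJoin TempFileCompression Y
setConfig HashJoin TempFilePath /usr/local/mariadb/columnstore/tmp
mariadb -e "SET GLOBAL infinidb_um_mem_limit = 2048;"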
This page is licensed: CC BY-SA / Gnu FDL
MariaDB ColumnStore query tuning optimizes analytics using data types, joins, projection elimination, WHERE clauses, and EXPLAIN for performance insights.
In MariaDB Enterprise ColumnStore 6, the ExeMgr process uses optimizer statistics in its query planning process.
ColumnStore uses the optimizer statistics to add support for queries that contain circular inner joins.
In Enterprise ColumnStore 5 and before, ColumnStore would raise the following error when a query containing a circular inner join was executed:
ERROR 1815 (HY000): Internal error: IDB-1003: Circular joins are not supported.
The optimizer statistics store each column's NDV (Number of Distinct Values), which can help the ExeMgr process choose the optimal join order for queries with circular joins. When Enterprise ColumnStore executes a query with a circular join, the query's execution can take longer if ColumnStore chooses a sub-optimal join order. When you collect optimizer statistics for your ColumnStore tables, the ExeMgr process is less likely to choose a sub-optimal join order.
Enterprise ColumnStore's optimizer statistics can be collected for ColumnStore tables by executing ANALYZE TABLE:
ANALYZE TABLE columnstore_tab;
Enterprise ColumnStore's optimizer statistics are not updated automatically. To update the optimizer statistics for a ColumnStore table, ANALYZE TABLE must be re-executed.
Enterprise ColumnStore does not implement an interface to show optimizer statistics.
This page is: Copyright © 2025 MariaDB. All rights reserved.
When tuning queries for MariaDB Enterprise ColumnStore, there are some important details to consider.
Enterprise ColumnStore only reads the columns that are necessary to resolve a query.
For example, the following query selects every column in the table:
SELECT * FROM tab;
Whereas the following query only selects two columns in the table, so it requires less I/O:
SELECT col1, col2 FROM tab;
For best performance, only select the columns that are necessary to resolve a query.
When Enterprise ColumnStore performs ORDER BY and LIMIT operations, the operations are performed in a single-threaded manner after the rest of the query processing has been completed and the full unsorted result set has been retrieved. For large data sets, the performance overhead can be significant.
When Enterprise ColumnStore 5 performs aggregations (i.e., DISTINCT, GROUP BY, COUNT(*), etc.), all of the aggregation work happens in memory by default. As a consequence, more complex aggregation operations require more memory in that version.
For example, the following query could require a lot of memory in Enterprise ColumnStore 5, since it has to calculate many distinct values in memory:
SELECT DISTINCT col1 FROM tab LIMIT 10000;
Whereas the following query could require much less memory in Enterprise ColumnStore 5, since it has to calculate fewer distinct values:
SELECT DISTINCT col1 FROM tab LIMIT 100;
In Enterprise ColumnStore 6, disk-based aggregations can be enabled.
For best performance, avoid excessive aggregations or enable disk-based aggregations.
For additional information, see "Configure Disk-Based Aggregations".
When Enterprise ColumnStore evaluates built-in functions and aggregate functions, it can often evaluate the function in a distributed manner. Distributed evaluation of functions can significantly improve performance.
Enterprise ColumnStore supports distributed evaluation for some built-in functions. For other built-in functions, the function must be evaluated serially on the final result set.
Enterprise ColumnStore also supports distributed evaluation for user-defined functions developed with ColumnStore's User-Defined Aggregate Function (UDAF) C++ API. For functions developed with Enterprise Server's standard User-Defined Function (UDF) API, the function must be evaluated serially on the final result set.
For best performance, avoid non-distributed functions.
By default, Enterprise ColumnStore performs all joins as in-memory hash joins.
If the joined tables are very large, the in-memory hash join can require too much memory for the default configuration. There are a couple options to work around this:
Enterprise ColumnStore can be configured to use more memory for in-memory hash joins.
Enterprise ColumnStore can be configured to use disk-based joins.
Enterprise ColumnStore can use optimizer statistics to better optimize the join order.
For additional information, see "Configure In-Memory Joins", "Configure Disk-Based Joins", and "Optimizer Statistics".
Enterprise ColumnStore uses extent elimination to optimize queries. Extent elimination uses the minimum and maximum values in the extent map to determine which extents can be skipped for a query.
When data is loaded into Enterprise ColumnStore, it appends the data to the latest extent. When an extent reaches the maximum number of column values, Enterprise ColumnStore creates a new extent. As a consequence, if ordered data is loaded in its proper order, then similar values will be clustered together in the same extent. This can improve query performance, because extent elimination performs best when similar values are clustered together.
For example, if you expect to query a table with a filter on a timestamp column, you should sort the data using the timestamp column before loading it into Enterprise ColumnStore. Later, when the table is queried with a filter on the timestamp column, Enterprise ColumnStore would be able to skip many extents using extent elimination.
For best performance, load ordered data in proper order.
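For instance, a sketch of pre-sorting a CSV file on its timestamp column before loading it with cpimport. The file name, database/table names, column position, and delimiter below are hypothetical:
# Assumes the timestamp is the first column and sorts lexicographically (works for ISO-8601 timestamps)
sort -t ',' -k 1,1 /tmp/events.csv > /tmp/events_sorted.csv
cpimport -s ',' mydb events /tmp/events_sorted.csv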
When Enterprise ColumnStore performs mathematical operations with very big values using the DECIMAL, NUMERIC, and FIXED data types, the operation can sometimes overflow ColumnStore's maximum precision or scale. The maximum precision and scale depends on the version of Enterprise ColumnStore:
In Enterprise ColumnStore 6, the maximum precision (M) is 38, and the maximum scale (D) is 38.
In Enterprise ColumnStore 5, the maximum precision (M) is 18, and the maximum scale (D) is 18.
In Enterprise ColumnStore 6, applications can configure Enterprise ColumnStore to check for decimal overflows by setting the columnstore_decimal_overflow_check system variable, but only when the column has a decimal precision that is 18 or more:
SET SESSION columnstore_decimal_overflow_check=ON;
SELECT (big_decimal1 * big_decimal2) AS product
FROM columnstore_tab;
When decimal overflow checks are enabled, math operations have extra overhead.
When the decimal overflow check fails, MariaDB Enterprise ColumnStore raises an error with the ER_INTERNAL_ERROR error code, and it writes detailed information about the overflow check failure to the ColumnStore system logs.
MariaDB Enterprise ColumnStore supports Enterprise Server's standard User-Defined Function (UDF) API. However, UDFs developed using that API cannot be executed in a distributed manner.
To support distributed execution of custom aggregate functions, MariaDB Enterprise ColumnStore supports a Distributed User Defined Aggregate Functions (UDAF) C++ API:
The Distributed User Defined Aggregate Functions (UDAF) C++ API allows anyone to create aggregate functions of arbitrary complexity for distributed execution in the ColumnStore storage engine.
These functions can also be used as Analytic (Window) functions just like any built-in aggregate function.
This page is: Copyright © 2025 MariaDB. All rights reserved.
MariaDB ColumnStore's query plans and Optimizer Trace show how analytical queries run in parallel across its distributed, columnar architecture, aiding performance tuning.
The ColumnStore storage engine uses a ColumnStore Execution Plan (CSEP) to represent a query plan internally.
When the select handler receives the SELECT_LEX object, it transforms it into a CSEP as part of the query planning and optimization process. For additional information, see "MariaDB Enterprise ColumnStore Query Evaluation."
The CSEP for a given query can be viewed by performing the following:
Calling the calSetTrace(1) function:
SELECT calSetTrace(1);
Executing the query:
SELECT column1, column2
FROM columnstore_tab
WHERE column1 > '2020-04-01'
AND column1 < '2020-11-01';
Calling the calGetTrace() function:
SELECT calGetTrace();
This page is: Copyright © 2025 MariaDB. All rights reserved.
When Enterprise ColumnStore executes a query, the ExeMgr process on the initiator/aggregator node translates the ColumnStore execution plan (CSEP) into a job list. A job list is a sequence of job steps.
Enterprise ColumnStore uses many different types of job steps that provide different scalability benefits:
Some types of job steps perform operations in a distributed manner, using multiple nodes to operate to different extents. Distributed operations provide horizontal scalability.
Some types of job steps perform operations in a multi-threaded manner using a thread pool. Performing multi-threaded operations provides vertical scalability.
As you increase the number of ColumnStore nodes or the number of cores on each node, Enterprise ColumnStore can use those resources to more efficiently execute job steps.
For additional information, see "MariaDB Enterprise ColumnStore Query Evaluation."
Enterprise ColumnStore defines a batch primitive step to handle many types of tasks, such as scanning/filtering columns, JOIN operations, aggregation, functional filtering, and projecting (putting values into a SELECT list).
In calGetTrace() output, a batch primitive step is abbreviated BPS.
Batch primitive steps are evaluated on multiple nodes in parallel. The PrimProc process on each node evaluates the batch primitive step to one extent at a time. The PrimProc process uses a thread pool to operate on individual blocks within the extent in parallel.
Enterprise ColumnStore defines a cross-engine step to perform cross-engine joins, in which a ColumnStore table is joined with a table that uses a different storage engine.
In calGetTrace() output, a cross-engine step is abbreviated CES.
Cross-engine steps are evaluated locally by the ExeMgr process on the initiator/aggregator node.
Enterprise ColumnStore can perform cross-engine joins when the mandatory utility user is properly configured.
For additional information, refer to the "Mandatory Utility User Account."
Enterprise ColumnStore defines a dictionary structure step to scan the dictionary extents that ColumnStore uses to store variable-length string values.
In calGetTrace() output, a dictionary structure step is abbreviated DSS.
Dictionary structure steps are evaluated on multiple nodes in parallel. The PrimProc process on each node evaluates the dictionary structure step to one extent at a time. It uses a thread pool to operate on individual blocks within the extent in parallel.
Dictionary structure steps can require a lot of I/O for a couple of reasons:
Dictionary structure steps do not support extent elimination, so all extents for the column must be scanned.
Dictionary structure steps must read the column extents to find each pointer and the dictionary extents to find each value, so it doubles the number of extents to scan.
It is generally recommended to avoid queries that will cause dictionary scans.
For additional information, see "Avoid Creating Long String Columns".
Enterprise ColumnStore defines a hash join step to perform a hash join between two tables.
In calGetTrace() output, a hash join step is abbreviated HJS.
Hash join steps are evaluated locally by the ExeMgr process on the initiator/aggregator node.
Enterprise ColumnStore performs the hash join in memory by default. If you perform large joins, you may be able to get better performance by changing some configuration defaults with mcsSetConfig:
Enterprise ColumnStore can be configured to use more memory for in-memory hash joins.
Enterprise ColumnStore can be configured to use disk-based joins.
For additional information, see "Configure in-memory joins" and "Configure Disk-Based Joins".
Enterprise ColumnStore defines a having step to evaluate a HAVING clause on a result set.
In calGetTrace() output, a having step is abbreviated HVS.
Enterprise ColumnStore defines a subquery step to evaluate a subquery.
In calGetTrace() output, a subquery step is abbreviated SQS.
Enterprise ColumnStore defines a tuple aggregation step to collect intermediate aggregation results prior to the final aggregation and evaluation of the results.
In calGetTrace() output, a tuple aggregation step is abbreviated TAS.
Tuple aggregation steps are primarily evaluated by the ExeMgr process on the initiator/aggregator node. However, the PrimProc process on each node also plays a role, since the PrimProc process on each node provides the intermediate aggregation results to the ExeMgr process on the initiator/aggregator node.
Enterprise ColumnStore defines a tuple annexation step to perform the final aggregation and evaluation of the results.
In calGetTrace() output, a tuple annexation step is abbreviated TNS.
Tuple annexation steps are evaluated locally by the ExeMgr process on the initiator/aggregator node.
Enterprise ColumnStore 5 performs aggregation operations in memory. As a consequence, more complex aggregation operations require more memory in that version.
In Enterprise ColumnStore 6, disk-based aggregations can be enabled.
For additional information, see "Configure Disk-Based Aggregations".
Enterprise ColumnStore defines a tuple union step to perform a union of two subqueries.
In calGetTrace() output, a tuple union step is abbreviated TUS.
Tuple union steps are evaluated locally by the ExeMgr process on the initiator/aggregator node.
Enterprise ColumnStore defines a tuple constant step to evaluate constant values.
In calGetTrace() output, a tuple constant step is abbreviated TCS.
Tuple constant steps are evaluated locally by the ExeMgr process on the initiator/aggregator node.
Enterprise ColumnStore defines a window function step to evaluate window functions.
In calGetTrace() output, a window function step is abbreviated WFS.
Window function steps are evaluated locally by the ExeMgr process on the initiator/aggregator node.
This page is: Copyright © 2025 MariaDB. All rights reserved.
MariaDB ColumnStore supports standard MariaDB tools, BI connectors (e.g., Tableau, Power BI), data ingestion (cpimport, Kafka), and REST APIs for admin.
The ColumnStore StorageManager manages columnar data storage and retrieval, optimizing analytical queries.
IBM Cloud Object Storage (Formerly known as CleverSafe)
Due to the frequent code changes and deviation from the AWS standards, none are approved at this time.
This page is licensed: CC BY-SA / Gnu FDL
# Sample storagemanager.cnf
[ObjectStorage]
service = S3
object_size = 5M
metadata_path = /var/lib/columnstore/storagemanager/metadata
journal_path = /var/lib/columnstore/storagemanager/journal
max_concurrent_downloads = 21
max_concurrent_uploads = 21
common_prefix_depth = 3
[S3]
region = us-west-1
bucket = my_columnstore_bucket
endpoint = s3.amazonaws.com
aws_access_key_id = AKIAR6P77BUKULIDIL55
aws_secret_access_key = F38aR4eLrgNSWPAKFDJLDAcax0gZ3kYblU79
[LocalStorage]
path = /var/lib/columnstore/storagemanager/fake-cloud
fake_latency = n
max_latency = 50000
[Cache]
cache_size = 2g
path = /var/lib/columnstore/storagemanager/cache
This page is licensed: CC BY-SA / Gnu FDL
ColumnStore 5.5.2 adds a feature that allows you to use AWS IAM roles to connect to S3 buckets without explicitly entering credentials into the storagemanager.cnf config file.
You will need to modify the IAM role of your Amazon EC2 instance to allow for this. Please follow the AWS documentation before beginning this process.
It is important to note that you must update the AWS S3 endpoint based on your chosen region; otherwise, you might face endpoint propagation delays.
For a complete list of AWS service endpoints, please visit the AWS reference guide.
Edit your Storage Manager configuration file located at /etc/columnstore/storagemanager.cnf so that it looks similar to the example below (replacing the values in the [S3] section with your own):
[ObjectStorage]
service = S3
object_size = 5M
metadata_path = /var/lib/columnstore/storagemanager/metadata
journal_path = /var/lib/columnstore/storagemanager/journal
max_concurrent_downloads = 21
max_concurrent_uploads = 21
common_prefix_depth = 3
[S3]
ec2_iam_mode=enabled
bucket = my_mcs_bucket
region = us-west-2
endpoint = s3.us-west-2.amazonaws.com
[LocalStorage]
path = /var/lib/columnstore/storagemanager/fake-cloud
fake_latency = n
max_latency = 50000
[Cache]
cache_size = 2g
path = /var/lib/columnstore/storagemanager/cache
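After editing storagemanager.cnf, the ColumnStore processes typically need to be restarted for the change to take effect. On a systemd-based installation that might look like the following (service name as used elsewhere in this guide):
sudo systemctl restart mariadb-columnstore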
This page is licensed: CC BY-SA / Gnu FDL
Learn about data ingestion for MariaDB ColumnStore. This section covers various methods and tools for efficiently loading large datasets into your columnar database for analytical workloads.
ColumnStore provides several mechanisms to ingest data:
cpimport provides the fastest performance for inserting data and the ability to route data to particular PM nodes. Normally this should be the default choice for loading data.
LOAD DATA INFILE provides another means of bulk inserting data.
By default, with autocommit on, it will internally stream the data to an instance of the cpimport process. This requires some memory overhead on the UM server, so it may be less reliable than cpimport for very large imports.
In transactional mode, DML inserts are performed, which are significantly slower and also consume both binlog transaction files and ColumnStore version buffer files.
DML, i.e. INSERT, UPDATE, and DELETE, provide row level changes. ColumnStore is optimized towards bulk modifications and so these operations are slower than they would be in say InnoDB.
Currently ColumnStore does not support operating as a replication slave target.
Bulk DML operations will in general perform better than multiple individual statements.
INSERT INTO SELECT with autocommit behaves similarly to LOAD DATA INFILE in that internally it is mapped to cpimport for higher performance.
Bulk update operations based on a join with a small staging table can be relatively fast especially if updating a single column.
Using ColumnStore Bulk Write SDK or ColumnStore Streaming Data Adapters
cpimport is a high-speed bulk load utility that imports data into ColumnStore tables in a fast and efficient manner. It accepts as input any flat file containing data that contains a delimiter between fields of data (i.e. columns in a table). The default delimiter is the pipe (‘|’) character, but other delimiters such as commas may be used as well. The data values must be in the same order as the create table statement, i.e. column 1 matches the first column in the table and so on. Date values must be specified in the format 'yyyy-mm-dd'.
cpimport – performs the following operations when importing data into a MariaDB ColumnStore database:
Data is read from specified flat files.
Data is transformed to fit ColumnStore’s column-oriented storage design.
Redundant data is tokenized and logically compressed.
Data is written to disk.
It is important to note that:
The bulk loads are an append operation to a table so they allow existing data to be read and remain unaffected during the process.
The bulk loads do not write their data operations to the transaction log; they are not transactional in nature but are considered an atomic operation at this time. Information markers, however, are placed in the transaction log so the DBA is aware that a bulk operation did occur.
Upon completion of the load operation, a high water mark in each column file is moved in an atomic operation that allows for any subsequent queries to read the newly loaded data. This append operation provides for consistent read but does not incur the overhead of logging the data.
There are two primary steps to using the cpimport utility:
Optionally create a job file that is used to load data from a flat file into multiple tables.
Run the cpimport utility to perform the data import.
The simplest form of cpimport command is
cpimport dbName tblName [loadFile]
The full syntax is like this:
cpimport dbName tblName [loadFile]
[-h] [-m mode] [-f filepath] [-d DebugLevel]
[-c readBufferSize] [-b numBuffers] [-r numReaders]
[-e maxErrors] [-B libBufferSize] [-s colDelimiter] [-E EnclosedByChar]
[-C escChar] [-j jobID] [-p jobFilePath] [-w numParsers]
[-n nullOption] [-P pmList] [-i] [-S] [-q batchQty]
positional parameters:
dbName Name of the database to load
tblName Name of table to load
loadFile Optional input file name in current directory,
unless a fully qualified name is given.
If not given, input read from STDIN.
Options:
-b Number of read buffers
-c Application read buffer size(in bytes)
-d Print different level(1-3) debug message
-e Max number of allowable error per table per PM
-f Data file directory path.
Default is current working directory.
In Mode 1, -f represents the local input file path.
In Mode 2, -f represents the PM based input file path.
In Mode 3, -f represents the local input file path.
-l Name of import file to be loaded, relative to -f path. (Cannot be used with -p)
-h Print this message.
-q Batch Quantity, Number of rows distributed per batch in Mode 1
-i Print extended info to console in Mode 3.
-j Job ID. In simple usage, default is the table OID.
unless a fully qualified input file name is given.
-n NullOption (0-treat the string NULL as data (default);
1-treat the string NULL as a NULL value)
-p Path for XML job description file.
-r Number of readers.
-s The delimiter between column values.
-B I/O library read buffer size (in bytes)
-w Number of parsers.
-E Enclosed by character if field values are enclosed.
-C Escape character used in conjunction with 'enclosed by'
character, or as part of NULL escape sequence ('\N');
default is '\'
-I Import binary data; how to treat NULL values:
1 - import NULL values
2 - saturate NULL values
-P List of PMs ex: -P 1,2,3. Default is all PMs.
-S Treat string truncations as errors.
-m mode
1 - rows will be loaded in a distributed manner across PMs.
2 - PM based input files loaded onto their respective PM.
3 - input files will be loaded on the local PM.
Mode 1: you run cpimport from your primary node (mcs1). The source file is located on this primary node, and the data from cpimport is distributed across all the nodes. If no mode is specified, this is the default.
Example:
cpimport -m1 mytest mytable mytable.tbl
Mode 2: you run cpimport from your primary node (mcs1). The source data is in already-partitioned data files residing on the PMs. Each PM should have a source data file of the same name, containing that PM's partition of the data.
Example:
cpimport -m2 mytest mytable -l /home/mydata/mytable.tbl
Mode 3: you run cpimport from the individual nodes independently, which will import the source file that exists on that node. Concurrent imports can be executed on every node for the same table.
Example:
cpimport -m3 mytest mytable /home/mydata/mytable.tbl
Data can be loaded from STDIN into ColumnStore by simply not including the loadFile parameter.
Example:
cpimport db1 table1
Similarly the AWS cli utility can be utilized to read data from an s3 bucket and pipe the output into cpimport allowing direct loading from S3. This assumes the aws cli program has been installed and configured on the host:
Example:
aws s3 cp --quiet s3://dthompson-test/trades_bulk.csv - | cpimport test trades -s ","
For troubleshooting connectivity problems, remove the --quiet option, which suppresses client logging, including permission errors.
Standard input can also be used to directly pipe the output from an arbitrary SELECT statement into cpimport. The SELECT statement may select from non-ColumnStore tables such as MyISAM or InnoDB. In the example below, db2.source_table is selected from, using the -N flag to remove non-data formatting. The -q flag tells the mysql client not to cache results, which avoids possible timeouts causing the load to fail.
Example:
mariadb -q -e 'select * from source_table;' -N <source-db> | cpimport -s '\t' <target-db> <target-table>
Let's create a sample ColumnStore table:
CREATE DATABASE `json_columnstore`;
USE `json_columnstore`;
CREATE TABLE `products` (
`product_name` varchar(11) NOT NULL DEFAULT '',
`supplier` varchar(128) NOT NULL DEFAULT '',
`quantity` varchar(128) NOT NULL DEFAULT '',
`unit_cost` varchar(128) NOT NULL DEFAULT ''
) ENGINE=Columnstore DEFAULT CHARSET=utf8;
Now let's create a sample products.json file like this:
[{
"_id": {
"$oid": "5968dd23fc13ae04d9000001"
},
"product_name": "Sildenafil Citrate",
"supplier": "Wisozk Inc",
"quantity": 261,
"unit_cost": "$10.47"
}, {
"_id": {
"$oid": "5968dd23fc13ae04d9000002"
},
"product_name": "Mountain Juniperus Ashei",
"supplier": "Keebler-Hilpert",
"quantity": 292,
"unit_cost": "$8.74"
}, {
"_id": {
"$oid": "5968dd23fc13ae04d9000003"
},
"product_name": "Dextromethorphan HBR",
"supplier": "Schmitt-Weissnat",
"quantity": 211,
"unit_cost": "$20.53"
}]
We can then bulk load data from JSON into ColumnStore by first piping the data to jq and then to cpimport using a one-line command.
Example:
cat products.json | jq -r '.[] | [.product_name,.supplier,.quantity,.unit_cost] | @csv' | cpimport json_columnstore products -s ',' -E '"'
In this example, the JSON data is coming from a static JSON file, but this same method will work for output streamed from any data source that produces JSON, such as an API or NoSQL database. For more information on jq, please consult its manual.
There are two ways multiple tables can be loaded:
Run multiple cpimport jobs simultaneously. Tables per import should be unique, or, if using mode 3, the PMs for each import should be unique.
Use the colxml utility: colxml creates an XML job file for your database schema before you import data. Multiple tables may be imported either by importing all tables within a schema or by listing specific tables using the -t option in colxml. Then run cpimport using the job file generated by colxml. Here is an example of how to use colxml and cpimport to import data into all the tables in a database schema:
colxml mytest -j299
cpimport -m1 -j299
Usage: colxml [options] dbName
Options:
-d Delimiter (default '|')
-e Maximum allowable errors (per table)
-h Print this message
-j Job id (numeric)
-l Load file name
-n "name in quotes"
-p Path for XML job description file that is generated
-s "Description in quotes"
-t Table name
-u User
-r Number of read buffers
-c Application read buffer size (in bytes)
-w I/O library buffer size (in bytes), used to read files
-x Extension of file name (default ".tbl")
-E EnclosedByChar (if data has enclosed values)
-C EscapeChar
-b Debug level (1-3)
The following tables comprise a database named ‘tpch2’:
MariaDB [tpch2]> show tables;
+-----------------+
| Tables_in_tpch2 |
+-----------------+
| customer        |
| lineitem        |
| nation          |
| orders          |
| part            |
| partsupp        |
| region          |
| supplier        |
+-----------------+
8 rows in set (0.00 sec)
First, put the delimited input data file for each table in /usr/local/mariadb/columnstore/data/bulk/data/import. Each file should be named <tablename>.tbl.
Run colxml for the load job for the ‘tpch2’ database as shown here:
/usr/local/mariadb/columnstore/bin/colxml tpch2 -j500
Running colxml with the following parameters:
2015-10-07 15:14:20 (9481) INFO :
Schema: tpch2
Tables:
Load Files:
-b 0
-c 1048576
-d |
-e 10
-j 500
-n
-p /usr/local/mariadb/columnstore/data/bulk/job/
-r 5
-s
-u
-w 10485760
-x tbl
File completed for tables:
tpch2.customer
tpch2.lineitem
tpch2.nation
tpch2.orders
tpch2.part
tpch2.partsupp
tpch2.region
tpch2.supplier
Normal exit.
Now run cpimport using the job file generated by the colxml execution:
/usr/local/mariadb/columnstore/bin/cpimport -j 500
Bulkload root directory : /usr/local/mariadb/columnstore/data/bulk
job description file : Job_500.xml
2015-10-07 15:14:59 (9952) INFO : successfully load job file /usr/local/mariadb/columnstore/data/bulk/job/Job_500.xml
2015-10-07 15:14:59 (9952) INFO : PreProcessing check starts
2015-10-07 15:15:04 (9952) INFO : PreProcessing check completed
2015-10-07 15:15:04 (9952) INFO : preProcess completed, total run time : 5 seconds
2015-10-07 15:15:04 (9952) INFO : No of Read Threads Spawned = 1
2015-10-07 15:15:04 (9952) INFO : No of Parse Threads Spawned = 3
2015-10-07 15:15:06 (9952) INFO : For table tpch2.customer: 150000 rows processed and 150000 rows inserted.
2015-10-07 15:16:12 (9952) INFO : For table tpch2.nation: 25 rows processed and 25 rows inserted.
2015-10-07 15:16:12 (9952) INFO : For table tpch2.lineitem: 6001215 rows processed and 6001215 rows inserted.
2015-10-07 15:16:31 (9952) INFO : For table tpch2.orders: 1500000 rows processed and 1500000 rows inserted.
2015-10-07 15:16:33 (9952) INFO : For table tpch2.part: 200000 rows processed and 200000 rows inserted.
2015-10-07 15:16:44 (9952) INFO : For table tpch2.partsupp: 800000 rows processed and 800000 rows inserted.
2015-10-07 15:16:44 (9952) INFO : For table tpch2.region: 5 rows processed and 5 rows inserted.
2015-10-07 15:16:45 (9952) INFO : For table tpch2.supplier: 10000 rows processed and 10000 rows inserted.
If there are some differences between the input file and table definition then the colxml utility can be utilized to handle these cases:
Different order of columns in the input file from table order
Input file column values to be skipped / ignored.
Target table columns to be defaulted.
In this case, run the colxml utility (the -t argument can be useful for producing a job file for a single table if preferred) to produce the job XML file, use this as a template for editing, and then use the edited job file when running cpimport.
Consider the following simple table example:
create table emp (
emp_id int,
dept_id int,
name varchar(30),
salary int,
hire_date date) engine=columnstore;
This would produce a colxml file with the following table element:
<Table tblName="test.emp"
loadName="emp.tbl" maxErrRow="10">
<Column colName="emp_id"/>
<Column colName="dept_id"/>
<Column colName="name"/>
<Column colName="salary"/>
<Column colName="hire_date"/>
</Table>
If your input file had the data such that hire_date comes before salary then the following modification will allow correct loading of that data to the original table definition (note the last 2 Column elements are swapped):
<Table tblName="test.emp"
loadName="emp.tbl" maxErrRow="10">
<Column colName="emp_id"/>
<Column colName="dept_id"/>
<Column colName="name"/>
<Column colName="hire_date"/>
<Column colName="salary"/>
</Table>
The following example would ignore the last entry in the file and default salary to its default value (in this case null):
<Table tblName="test.emp"
loadName="emp.tbl" maxErrRow="10">
<Column colName="emp_id"/>
<Column colName="dept_id"/>
<Column colName="name"/>
<Column colName="hire_date"/>
<IgnoreField/>
<DefaultColumn colName="salary"/>
</Table>
IgnoreField instructs cpimport to ignore and skip the particular value at that position in the file.
DefaultColumn instructs cpimport to default the current table column and not move the column pointer forward to the next delimiter.
Both instructions can be used independently and as many times as makes sense for your data and table definition.
It is possible to import using a binary file instead of a CSV file using fixed length rows in binary data. This can be done using the '-I' flag which has two modes:
-I1 - binary mode with NULLs accepted. Numeric fields containing NULL will be treated as NULL unless the column has a default value.
-I2 - binary mode with NULLs saturated. NULLs in numeric fields will be saturated.
Example
cpimport -I1 mytest mytable /home/mydata/mytable.bin
The following table shows how to represent the data in the binary format:
| Datatype | Description |
|---|---|
| INT/TINYINT/SMALLINT/BIGINT | Little-endian format for the numeric data |
| FLOAT/DOUBLE | IEEE format native to the computer |
| CHAR/VARCHAR | Data padded with '\0' for the length of the field. An entry that is all '\0' is treated as NULL |
| DATE | Using the Date struct below |
| DATETIME | Using the DateTime struct below |
| DECIMAL | Stored using an integer representation of the DECIMAL without the decimal point. With precision/width of 2 or less 2 bytes should be used, 3-4 should use 3 bytes, 4-9 should use 4 bytes, and 10+ should use 8 bytes |
For NULL values the following table should be used:
| Datatype | Signed NULL | Unsigned NULL |
|---|---|---|
| BIGINT | 0x8000000000000000ULL | 0xFFFFFFFFFFFFFFFEULL |
| INT | 0x80000000 | 0xFFFFFFFE |
| SMALLINT | 0x8000 | 0xFFFE |
| TINYINT | 0x80 | 0xFE |
| DECIMAL | As equiv. INT | As equiv. INT |
| FLOAT | 0xFFAAAAAA | N/A |
| DOUBLE | 0xFFFAAAAAAAAAAAAAULL | N/A |
| DATE | 0xFFFFFFFE | N/A |
| DATETIME | 0xFFFFFFFFFFFFFFFEULL | N/A |
| CHAR/VARCHAR | Fill with '\0' | N/A |
struct Date
{
unsigned spare : 6;
unsigned day : 6;
unsigned month : 4;
unsigned year : 16;
};
The spare bits in the Date struct "must" be set to 0x3E.
struct DateTime
{
unsigned msecond : 20;
unsigned second : 6;
unsigned minute : 6;
unsigned hour : 6;
unsigned day : 6;
unsigned month : 4;
unsigned year : 16;
};
As of version 1.4, cpimport uses the /var/lib/columnstore/bulk folder for all work being done. This folder contains:
Logs
Rollback info
Job info
A staging folder
The log folder typically contains:
-rw-r--r--. 1 root root 0 Dec 29 06:41 cpimport_1229064143_21779.err
-rw-r--r--. 1 root root 1146 Dec 29 06:42 cpimport_1229064143_21779.log
A typical log might look like this:
2020-12-29 06:41:44 (21779) INFO : Running distributed import (mode 1) on all PMs...
2020-12-29 06:41:44 (21779) INFO2 : /usr/bin/cpimport.bin -s , -E " -R /tmp/columnstore_tmp_files/BrmRpt112906414421779.rpt -m 1 -P pm1-21779 -T SYSTEM -u388952c1-4ab8-46d6-9857-c44827b1c3b9 bts flights
2020-12-29 06:41:58 (21779) INFO2 : Received a BRM-Report from 1
2020-12-29 06:41:58 (21779) INFO2 : Received a Cpimport Pass from PM1
2020-12-29 06:42:03 (21779) INFO2 : Received a BRM-Report from 2
2020-12-29 06:42:03 (21779) INFO2 : Received a Cpimport Pass from PM2
2020-12-29 06:42:03 (21779) INFO2 : Received a BRM-Report from 3
2020-12-29 06:42:03 (21779) INFO2 : BRM updated successfully
2020-12-29 06:42:03 (21779) INFO2 : Received a Cpimport Pass from PM3
2020-12-29 06:42:04 (21779) INFO2 : Released Table Lock
2020-12-29 06:42:04 (21779) INFO2 : Cleanup succeed on all PMs
2020-12-29 06:42:04 (21779) INFO : For table bts.flights: 374573 rows processed and 374573 rows inserted.
2020-12-29 06:42:04 (21779) INFO : Bulk load completed, total run time : 20.3052 seconds
2020-12-29 06:42:04 (21779) INFO2 : Shutdown of all child threads Finished!!
Prior to version 1.4, this folder was located at /usr/local/mariadb/columnstore/bulk.
This page is licensed: CC BY-SA / Gnu FDL
The ColumnStore Bulk Data API enables the creation of higher-performance adapters for ETL integration and data ingestion. The Streaming Data Adapters are out-of-the-box adapters that use this API for specific data sources and use cases.
MaxScale CDC Data Adapter is integration of the MaxScale CDC streams into MariaDB ColumnStore.
Kafka Data Adapter is integration of the Kafka streams into MariaDB ColumnStore.
The MaxScale CDC Data Adapter has been deprecated.
The MaxScale CDC Data Adapter allows you to stream change data events (binary log events) from a MariaDB master hosting non-ColumnStore engines (InnoDB, MyRocks, MyISAM) to MariaDB ColumnStore. In other words, it replicates data from a MariaDB master to MariaDB ColumnStore. It acts as a CDC client for MaxScale and uses the events received from MaxScale as input to the MariaDB ColumnStore Bulk Data API to push the data to MariaDB ColumnStore.
It registers with MariaDB MaxScale as a CDC client using the MaxScale CDC Connector API, receiving change data records from MariaDB MaxScale (converted from binlog events received from the master on MariaDB TX) in JSON format. Then, using the MariaDB ColumnStore bulk write SDK, it converts the JSON data into API calls and streams the data to a MariaDB PM node. The adapter has options to insert all the events in the same schema as the source database table, or to insert each event with metadata as well as table data. The event metadata includes the event timestamp, the GTID, the event sequence, and the event type (insert, update, delete).
Download and install the MaxScale CDC Connector API.
Download and install the MariaDB ColumnStore bulk write SDK (see the ColumnStore Bulk Write SDK documentation).
sudo yum -y install epel-release
sudo yum -y install <data adapter>.rpm
sudo apt-get update
sudo dpkg -i <data adapter>.deb
sudo apt-get -f install
sudo echo "deb http://httpredir.debian.org/debian jessie-backports main contrib non-free" >> /etc/apt/sources.list
sudo apt-get update
sudo dpkg -i <data adapter>.deb
sudo apt-get -f install
Usage: mxs_adapter [OPTION]... DATABASE TABLE
-f FILE TSV file with database and table names to stream (must be in `database TAB table NEWLINE` format)
-h HOST MaxScale host (default: 127.0.0.1)
-P PORT Port number where the CDC service listens (default: 4001)
-u USER Username for the MaxScale CDC service (default: admin)
-p PASSWORD Password of the user (default: mariadb)
-c CONFIG Path to the Columnstore.xml file (default: '/usr/local/mariadb/columnstore/etc/Columnstore.xml')
-a Automatically create tables on ColumnStore
-z Transform CDC data stream from historical data to current data (implies -n)
-s Directory used to store the state files (default: '/var/lib/mxs_adapter')
-r ROWS Number of events to group for one bulk load (default: 1)
-t TIME Connection timeout (default: 10)
-n Disable metadata generation (timestamp, GTID, event type)
-i TIME Flush data every TIME seconds (default: 5)
-l FILE Log output to FILE instead of stdout
-v Print version and exit
-d Enable verbose debug output
To stream multiple tables, use the -f parameter to define a path to a TSV formatted file. The file must have one database and one table name per line. The database and table must be separated by a TAB character and the line must be terminated in a newline \n.
Here is an example file with two tables, t1 and t2 both in the test database.
test t1
test t2
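With such a file in place, the adapter might be started with the -f option. The host, port, credentials, and file path below are the documented defaults and a hypothetical path; depending on the adapter version, the DATABASE TABLE positional arguments may or may not still be required alongside -f:
mxs_adapter -f /etc/mxs_adapter/tables.tsv -h 127.0.0.1 -P 4001 -u admin -p mariadb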
You can have the adapter automatically create the tables on the ColumnStore instance with the -a option. In this case, the user used for cross-engine queries will be used to create the table (the values in Columnstore.CrossEngineSupport). This user will require CREATE privileges on all streamed databases and tables.
The -z option enables the data transformation mode. In this mode, the data is converted from historical, append-only data to the current version of the data. In practice, this replicates changes from a MariaDB master server to ColumnStore via the MaxScale CDC.
Download and install both MaxScale and ColumnStore.
Copy the Columnstore.xml file from /usr/local/mariadb/columnstore/etc/Columnstore.xml from one of the ColumnStore UM or PM nodes to the server where the adapter is installed.
Configure MaxScale according to the CDC tutorial.
Create a CDC user by executing the following MaxAdmin command on the MaxScale server. Replace the <service> with the name of the avrorouter service, and <user> and <password> with the credentials that are to be created.
maxadmin call command cdc add_user <service> <user> <password>
Then we can start the adapter by executing the following command.
mxs_adapter -u <user> -p <password> -h <host> -P <port> -c <path to Columnstore.xml> <database> <table>
The <database> and <table> arguments define the table that is streamed to ColumnStore. This table should exist on the master server where MaxScale is reading events from. If the table is not created on ColumnStore, the adapter will print instructions on how to define it in the correct way.
The <user> and <password> are the credentials created for the CDC user, <host> is the MaxScale address, and <port> is the port where the CDC service listener is listening.
The -c flag is optional if you are running the adapter on the server where ColumnStore is located.
The Kafka data adapter streams all messages published to Apache Kafka topics in Avro format to MariaDB AX automatically and continuously, enabling data from many sources to be streamed and collected for analysis without complex code. The Kafka adapter is built using librdkafka and the MariaDB ColumnStore bulk write SDK.
A tutorial for the Kafka adapter for ingesting Avro formatted data can be found in the kafka-to-columnstore-data-adapter document.
Starting with MariaDB ColumnStore 1.1.4, a data adapter for Pentaho Data Integration (PDI) / Kettle is available to import data directly into ColumnStore’s WriteEngine. It is built on MariaDB’s rapid-paced Bulk Write SDK.
The plugin was designed for the following software composition:
Operating system: Windows 10 / Ubuntu 16.04 / RHEL/CentOS 7+
MariaDB ColumnStore >= 1.1.4
MariaDB Java Database client* >= 2.2.1
Java >= 8
Pentaho Data Integration >= 7
+not officially supported by Pentaho.
*Only needed if you want to execute DDL.
The following steps are necessary to install the ColumnStore Data adapter (bulk loader plugin):
Extract the archive mariadb-columnstore-kettle-bulk-exporter-plugin-*.zip into your PDI installation directory $PDI-INSTALLATION/plugins.
Copy MariaDB's JDBC Client mariadb-java-client-2.2.x.jar into PDI's lib directory $PDI-INSTALLATION/lib.
Install the additional library dependencies
sudo apt-get install libuv1 libxml2 libsnappy1v5
sudo yum install epel-release
sudo yum install libuv libxml2 snappy
On Windows the installation of the Visual Studio 2015/2017 C++ Redistributable (x64) is required.
Each MariaDB ColumnStore Bulk Loader block needs to be configured. On the one hand, it needs to know how to connect to the underlying Bulk Write SDK to inject data into ColumnStore, and on the other hand, it needs to have a proper JDBC connection to execute DDL.
Both configurations can be set in each block’s settings tab.
The database connection configuration follows PDI’s default schema.
By default the plugin tries to use ColumnStore's default configuration /usr/local/mariadb/columnstore/etc/Columnstore.xml to connect to the ColumnStore instance through the Bulk Write SDK. In addition, individual paths or variables can be used too.
Information on how to prepare the Columnstore.xml configuration file can be found here.
Once a block is configured and all inputs are connected in PDI, the inputs have to be mapped to ColumnStore’s table format.
One can either choose “Map all inputs”, which sets target columns of adequate type, or choose a custom mapping based on the structure of the existing table.
The SQL button can be used to generate DDL based on the defined mapping and to execute it.
This plugin is a beta release. It can't handle blob data types, and it only supports multiple inputs to one block if the input field names are equal for all input sources.
This page is licensed: CC BY-SA / Gnu FDL
Learn how to import data into MariaDB ColumnStore. This section covers various methods and tools for efficiently loading large datasets into your columnar database for analytical workloads.
MariaDB Enterprise ColumnStore supports very efficient bulk data loads.
MariaDB Enterprise ColumnStore performs bulk data loads very efficiently using a variety of mechanisms, including the cpimport tool, specialized handling of certain SQL statements, and minimal locking during data import.
MariaDB Enterprise ColumnStore includes a bulk data loading tool called cpimport, which provides several benefits:
Bypasses the SQL layer to decrease overhead
Does not block read queries
Requires a write metadata lock on the table, which can be monitored with the METADATA_LOCK_INFO plugin
Appends the new data to the table. While the bulk load is in progress, the newly appended data is temporarily hidden from queries. After the bulk load is complete, the newly appended data is visible to queries.
Inserts each row in the order the rows are read from the source file. Users can optimize data loads for Enterprise ColumnStore's automatic partitioning by loading presorted data files. For additional information, see "Load Ordered Data in Proper Order".
Supports parallel distributed bulk loads
Imports data from text files
Imports data from binary files
Imports data from standard input (stdin)
MariaDB Enterprise ColumnStore enables batch insert mode by default.
When batch insert mode is enabled, MariaDB Enterprise ColumnStore has special handling for the following statements:
LOAD DATA [ LOCAL ] INFILE
Enterprise ColumnStore uses the following rules:
If the statement is executed outside of a transaction, Enterprise ColumnStore loads the data using cpimport, which is a command-line utility that is designed to efficiently load data in bulk. It executes cpimport using a wrapper called cpimport.bin.
If the statement is executed inside of a transaction, Enterprise ColumnStore loads the data using the DML interface, which is slower.
Batch insert mode can be disabled by setting the columnstore_use_import_for_batchinsert system variable to OFF. When batch insert mode is disabled, Enterprise ColumnStore executes the statements using the DML interface, which is slower.
MariaDB Enterprise ColumnStore requires a write metadata lock (MDL) on the table when a bulk data load is performed with cpimport.
When a bulk data load is running:
Read queries will not be blocked.
Write queries and concurrent bulk data loads on the same table will be blocked until the bulk data load operation is complete, and the write metadata lock on the table has been released.
The write metadata lock (MDL) can be monitored with the METADATA_LOCK_INFO plugin.
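As a rough sketch, the lock can be observed from another session while a load is running (this assumes the plugin is not yet installed; installing it requires sufficient privileges):
# Install the plugin once, then inspect metadata locks while the load runs
INSTALL SONAME 'metadata_lock_info';
SELECT * FROM information_schema.METADATA_LOCK_INFO;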
| Performance | Method | Interface | Format(s) | Location(s) | Benefits |
| --- | --- | --- | --- | --- | --- |
| Fastest | cpimport | Shell | • Text file • Binary file • Standard input (stdin) | • Server file system | • Lowest latency • Bypasses SQL layer • Non-blocking |
| Fast | columnstore_info.load_from_s3 | SQL | • Text file | • S3-compatible object storage | • Loads data from the cloud • Translates operation to cpimport command • Non-blocking |
| Fast | LOAD DATA [ LOCAL ] INFILE | SQL | • Text file | • Server file system • Client file system | • Translates operation to cpimport command • Non-blocking |
| Slow | INSERT INTO ... SELECT FROM ... | SQL | • Other table(s) | • Same MariaDB server | • Translates operation to cpimport command • Non-blocking |
MariaDB Enterprise ColumnStore includes a bulk data loading tool called cpimport, which bypasses the SQL layer to decrease the overhead of bulk data loading.
Refer to the cpimport modes and ColumnStore Bulk Data Loading for additional information.
The cpimport tool:
Bypasses the SQL layer to decrease overhead
Does not block read queries
Requires a write metadata lock on the table, which can be monitored with the METADATA_LOCK_INFO plugin
Appends the new data to the table. While the bulk load is in progress, the newly appended data is temporarily hidden from queries. After the bulk load is complete, the newly appended data is visible to queries.
Inserts each row in the order the rows are read from the source file. Users can optimize data loads for Enterprise ColumnStore's automatic partitioning by loading presorted data files.
Supports parallel distributed bulk loads
Imports data from text files
Imports data from binary files
Imports data from standard input (stdin)
You can load data using the cpimport tool in the following cases:
You are loading data into a ColumnStore table from a text file stored on the primary node's file system.
You are loading data into a ColumnStore table from a binary file stored on the primary node's file system.
You are loading data into a ColumnStore table from the output of a command running on the primary node.
MariaDB Enterprise ColumnStore requires a write metadata lock (MDL) on the table when a bulk data load is performed with cpimport.
When a bulk data load is running:
Read queries will not be blocked.
Write queries and concurrent bulk data loads on the same table will be blocked until the bulk data load operation is complete, and the write metadata lock on the table has been released.
The write metadata lock (MDL) can be monitored with the METADATA_LOCK_INFO plugin.
Before data can be imported into the tables, the schema must be created.
Connect to the primary server using MariaDB Client:
$ mariadb --host 192.168.0.100 --port 5001 \
--user db_user --password \
--default-character-set=utf8
After the command is executed, it will prompt you for a password.
For each database that you are importing, create the database with the CREATE DATABASE statement:
CREATE DATABASE inventory;
For each table that you are importing, create the table with the CREATE TABLE statement:
CREATE TABLE inventory.products (
product_name varchar(11) NOT NULL DEFAULT '',
supplier varchar(128) NOT NULL DEFAULT '',
quantity varchar(128) NOT NULL DEFAULT '',
unit_cost varchar(128) NOT NULL DEFAULT ''
) ENGINE=Columnstore DEFAULT CHARSET=utf8;
When MariaDB Enterprise ColumnStore performs a bulk data load, it appends data to the table in the order in which the data is read. Appending data reduces the I/O requirements of bulk data loads, so that larger data sets can be loaded very efficiently.
While the bulk load is in progress, the newly appended data is temporarily hidden from queries.
After the bulk load is complete, the newly appended data is visible to queries.
When MariaDB Enterprise ColumnStore performs a bulk data load, it appends data to the table in the order in which the data is read.
The order of data can have a significant effect on performance with Enterprise ColumnStore, so it can be helpful to sort the data in the input file prior to importing it.
For additional information, see "Load Ordered Data in Proper Order".
Before importing a file into MariaDB Enterprise ColumnStore, confirm that the field delimiter is not present in the data.
The default field delimiter for the cpimport tool is a pipe (|).
To use a different delimiter, you can set the field delimiter.
The cpimport tool can import data from a text file if a file is provided as an argument after the database and table name.
For example, to import the file inventory-products.txt into the products table in the inventory database:
$ sudo cpimport \
inventory products \
inventory-products.txt
The cpimport tool can import data from a binary file if the -I1 or -I2 option is provided and a file is provided as an argument after the database and table name.
For example, to import the file inventory-products.bin into the products table in the inventory database:
$ sudo cpimport -I1 \
inventory products \
inventory-products.bin
The -I1 and -I2 options allow two different binary import modes to be selected:

| Option | Description |
| --- | --- |
| -I1 | Numeric fields containing NULL will be treated as NULL unless the column has a default value |
| -I2 | Numeric fields containing NULL will be saturated |
The binary file should use the following format for data:

| Data Type(s) | Format |
| --- | --- |
| BIGINT | Little-endian integer format. Signed NULL: 0x8000000000000000ULL. Unsigned NULL: 0xFFFFFFFFFFFFFFFEULL |
| CHAR | String padded with '0' to match the length of the field. NULL: '0' for the full length of the field |
| DATE | Use the format represented by the struct Date. NULL: 0xFFFFFFFE |
| DATETIME | Use the format represented by the struct DateTime. NULL: 0xFFFFFFFFFFFFFFFEULL |
| DECIMAL | Use an integer representation of the value without the decimal point. Sizing depends on precision: 1-2: use 2 bytes; 3-4: use 3 bytes; 4-9: use 4 bytes; 10+: use 8 bytes. Signed and unsigned NULL: see equivalent-sized integer |
| DOUBLE | Native IEEE floating point format. NULL: 0xFFFAAAAAAAAAAAAAULL |
| FLOAT | Native IEEE floating point format. NULL: 0xFFAAAAAA |
| INT | Little-endian integer format. Signed NULL: 0x80000000. Unsigned NULL: 0xFFFFFFFE |
| SMALLINT | Little-endian integer format. Signed NULL: 0x8000. Unsigned NULL: 0xFFFE |
| TINYINT | Little-endian integer format. Signed NULL: 0x80. Unsigned NULL: 0xFE |
| VARCHAR | String padded with '0' to match the length of the field. NULL: '0' for the full length of the field |
In binary input files, the cpimport tool expects DATE columns to be in the following format:
struct Date
{
  unsigned spare : 6;
  unsigned day : 6;
  unsigned month : 4;
  unsigned year : 16;
};
In binary input files, the cpimport tool expects DATETIME columns to be in the following format:
struct DateTime
{
  unsigned msecond : 20;
  unsigned second : 6;
  unsigned minute : 6;
  unsigned hour : 6;
  unsigned day : 6;
  unsigned month : 4;
  unsigned year : 16;
};
The cpimport tool can import data from standard input (stdin) if no file is provided as an argument.
Importing from standard input is useful in many scenarios.
One scenario is when you want to import data from a remote database. You can use MariaDB Client to query the table using the SELECT statement, and then pipe the results into the standard input of the cpimport tool:
$ mariadb --quick \
--skip-column-names \
--execute="SELECT * FROM inventory.products" \
| cpimport -s '\t' inventory products
The cpimport tool can import data from a file stored in a remote S3 bucket.
You can use the AWS CLI to copy the file from S3, and then pipe the contents into the standard input of the cpimport tool:
$ aws s3 cp --quiet s3://columnstore-test/inventory-products.csv - \
| cpimport -s ',' inventory products
Alternatively, the columnstore_info.load_from_s3 stored procedure can import data from S3-compatible cloud object storage.
The default field delimiter for the cpimport tool is a pipe (|).
If your data file uses a different field delimiter, you can specify the field delimiter with the -s option.
For a TSV (tab-separated values) file:
$ sudo cpimport -s '\t' \
inventory products \
inventory-products.tsv
For a CSV (comma-separated values) file:
$ sudo cpimport -s ',' \
inventory products \
inventory-products.csv
By default, the cpimport tool does not expect fields to be quoted.
If your data file uses quotes around fields, you can specify the quote character with the -E option.
To load a TSV (tab-separated values) file that uses double quotes:
$ sudo cpimport -s '\t' -E '"' \
inventory products \
inventory-products.tsv
To load a CSV (comma-separated values) file that uses optional single quotes:
$ sudo cpimport -s ',' -E "'" \
inventory products \
inventory-products.csv
The cpimport tool writes logs to different directories, depending on the Enterprise ColumnStore version:
In Enterprise ColumnStore 5.5.2 and later, logs are written to /var/log/mariadb/columnstore/bulk/
In Enterprise ColumnStore 5 releases before 5.5.2, logs are written to /var/lib/columnstore/data/bulk/
In Enterprise ColumnStore 1.4, logs are written to /usr/local/mariadb/columnstore/bulk/
The cpimport tool requires column values to be in the same order in the input file as the columns in the table definition.
The cpimport tool requires DATE values to be specified in the format YYYY-MM-DD.
The cpimport tool does not write bulk data loads to the transaction log, so they are not transactional.
The cpimport tool does not write bulk data loads to the binary log, so they cannot be replicated using MariaDB replication.
When Enterprise ColumnStore uses object storage and the Storage Manager directory uses EFS in the default Bursting Throughput mode, the cpimport tool can have performance problems if multiple data load operations are executed consecutively. The performance problems can occur because the Bursting Throughput mode scales the rate relative to the size of the file system, so the burst credits for a small Storage Manager volume can be fully consumed very quickly.
When this problem occurs, some solutions are:
Avoid using burst credits by using Provisioned Throughput mode instead of Bursting Throughput mode
Monitor burst credit balances in AWS and run data load operations when burst credits are available
Increase the burst credit balance by increasing the file system size (for example, by creating a dummy file)
Additional information is available here.
This page is: Copyright © 2025 MariaDB. All rights reserved.
MariaDB Enterprise ColumnStore automatically translates INSERT INTO .. SELECT FROM .. statements into bulk data loads. By default, it translates the statement into a bulk data load that uses cpimport.bin, which is an internal wrapper around the cpimport tool.
You can load data using INSERT INTO .. SELECT FROM .. in the following cases:
You are loading data into a ColumnStore table by querying one or more local tables.
MariaDB Enterprise ColumnStore enables batch insert mode by default.
When batch insert mode is enabled, MariaDB Enterprise ColumnStore has special handling for INSERT INTO .. SELECT FROM .. statements.
Enterprise ColumnStore uses the following rules:
If the statement is executed outside of a transaction, Enterprise ColumnStore loads the data using cpimport, which is a command-line utility that is designed to efficiently load data in bulk. Enterprise ColumnStore executes cpimport using a wrapper called cpimport.bin.
If the statement is executed inside of a transaction, Enterprise ColumnStore loads the data using the DML interface, which is slower.
Batch insert mode can be disabled by setting the columnstore_use_import_for_batchinsert system variable to OFF. When batch insert mode is disabled, Enterprise ColumnStore executes the statements using the DML interface, which is slower.
MariaDB Enterprise ColumnStore requires a write metadata lock (MDL) on the table when a bulk data load is performed with cpimport.
When a bulk data load is running:
Read queries will not be blocked.
Write queries and concurrent bulk data loads on the same table will be blocked until the bulk data load operation is complete, and the write metadata lock on the table has been released.
The write metadata lock (MDL) can be monitored with the METADATA_LOCK_INFO plugin.
Before data can be imported into the tables, the schema must be created.
Connect to the primary server using MariaDB Client:
$ mariadb --host 192.168.0.100 --port 5001 \
--user db_user --password \
--default-character-set=utf8
After the command is executed, it will prompt you for a password.
For each database that you are importing, create the database with the CREATE DATABASE statement:
CREATE DATABASE inventory;
For each table that you are importing, create the table with the CREATE TABLE statement:
CREATE TABLE inventory.products (
product_name varchar(11) NOT NULL DEFAULT '',
supplier varchar(128) NOT NULL DEFAULT '',
quantity varchar(128) NOT NULL DEFAULT '',
unit_cost varchar(128) NOT NULL DEFAULT ''
) ENGINE=Columnstore DEFAULT CHARSET=utf8;
When MariaDB Enterprise ColumnStore performs a bulk data load, it appends data to the table in the order in which the data is read. Appending data reduces the I/O requirements of bulk data loads, so that larger data sets can be loaded very efficiently.
While the bulk load is in progress, the newly appended data is temporarily hidden from queries.
After the bulk load is complete, the newly appended data is visible to queries.
When MariaDB Enterprise ColumnStore performs a bulk data load, it appends data to the table in the order in which the data is read.
The order of data can have a significant effect on performance with Enterprise ColumnStore. If your data is not already sorted, it can be helpful to sort the query results using an ORDER BY clause.
For example:
# Perform import from other table
# and insert in sorted order
INSERT INTO inventory.orders
SELECT *
FROM innodb_inventory.orders
ORDER BY order_date;
For additional information, see "Load Ordered Data in Proper Order".
Before importing a table's data into MariaDB Enterprise ColumnStore, confirm that the field delimiter is not present in the data.
The field delimiter is determined by the columnstore_import_for_batchinsert_delimiter system variable. By default, Enterprise ColumnStore sets the field delimiter to the ASCII value 7, which corresponds to the BEL character.
To use a different delimiter, you can set the field delimiter.
When the data is passed to cpimport, each value is separated by a field delimiter. The field delimiter is determined by the columnstore_import_for_batchinsert_delimiter system variable.
By default, Enterprise ColumnStore sets the field delimiter to the ASCII value 7, which corresponds to the BEL character. In general, setting the field delimiter is only required if your data contains this value.
Set the field delimiter by setting the columnstore_import_for_batchinsert_delimiter system variable to the ASCII value for the desired delimiter character.
For example, if you want to use a comma (,) as the field delimiter, you would set columnstore_import_for_batchinsert_delimiter to 44:
# Set field delimiter
SET SESSION columnstore_import_for_batchinsert_delimiter=44;
# Perform import from other table
INSERT INTO inventory.products
SELECT *
FROM innodb_inventory.products;
When the data is passed to cpimport, each value is enclosed by a quote character. The quote character is determined by the columnstore_import_for_batchinsert_enclosed_by system variable.
By default, Enterprise ColumnStore sets the quote character to the ASCII value 17, which corresponds to the DC1 character. In general, setting the quote character is only required if your data contains this value.
Set the quote character by setting the columnstore_import_for_batchinsert_enclosed_by system variable to the ASCII value for the desired quote character.
For example, if you want to use a double quote (") as the quote character, you would set columnstore_import_for_batchinsert_enclosed_by to 34:
# Set quote character
SET SESSION columnstore_import_for_batchinsert_enclosed_by=34;
# Perform import from other table
INSERT INTO inventory.products
SELECT *
FROM innodb_inventory.products;
This page is: Copyright © 2025 MariaDB. All rights reserved.
MariaDB Enterprise ColumnStore automatically translates LOAD DATA [ LOCAL ] INFILE statements into bulk data loads. By default, it translates the statement into a bulk data load that uses cpimport.bin, which is an internal wrapper around the cpimport tool.
You can load data using the LOAD DATA INFILE statement in the following cases:
You are loading data into a ColumnStore table from a text file stored on the primary node's file system.
You are loading data into a ColumnStore table from a text file stored on the client's file system. In this case, LOAD DATA LOCAL INFILE must be used.
MariaDB Enterprise ColumnStore enables batch insert mode by default.
When batch insert mode is enabled, MariaDB Enterprise ColumnStore has special handling for LOAD DATA INFILE statements.
Enterprise ColumnStore uses the following rules:
If the statement is executed outside of a transaction, Enterprise ColumnStore loads the data using cpimport, which is a command-line utility that is designed to efficiently load data in bulk. Enterprise ColumnStore executes cpimport using a wrapper called cpimport.bin.
If the statement is executed inside of a transaction, Enterprise ColumnStore loads the data using the DML interface, which is slower.
Batch insert mode can be disabled by setting the columnstore_use_import_for_batchinsert system variable to OFF. When batch insert mode is disabled, Enterprise ColumnStore executes the statements using the DML interface, which is slower.
Starting with MariaDB Enterprise ColumnStore 6, an insert cache can be enabled to speed up LOAD DATA INFILE statements.
The insert cache is disabled by default, but it can be enabled by configuring columnstore_cache_inserts=ON:
[mariadb]
...
columnstore_cache_inserts=ON
The cache is flushed to ColumnStore in the following scenarios:
When the number of cached rows exceeds the value of columnstore_cache_flush_threshold
When a statement other than INSERT or LOAD DATA INFILE is executed
cpimport is used to flush the insert cache to ColumnStore when columnstore_cache_use_import=ON is configured.
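As an illustrative sketch, the cache-related settings listed above can be inspected at runtime (the output will vary by release and configuration):
# Show the insert cache settings currently in effect
SHOW GLOBAL VARIABLES LIKE 'columnstore_cache%';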
MariaDB Enterprise ColumnStore requires a write metadata lock (MDL) on the table when a bulk data load is performed with cpimport.
When a bulk data load is running:
Read queries will not be blocked.
Write queries and concurrent bulk data loads on the same table will be blocked until the bulk data load operation is complete, and the write metadata lock on the table has been released.
The write metadata lock (MDL) can be monitored with the METADATA_LOCK_INFO plugin.
Before data can be imported into the tables, the schema must be created.
Connect to the primary server using MariaDB Client:
$ mariadb --host 192.168.0.100 --port 5001 \
--user db_user --password \
--default-character-set=utf8
After the command is executed, it will prompt you for a password.
For each database that you are importing, create the database with the CREATE DATABASE statement:
CREATE DATABASE inventory;
For each table that you are importing, create the table with the CREATE TABLE statement:
CREATE TABLE inventory.products (
product_name varchar(11) NOT NULL DEFAULT '',
supplier varchar(128) NOT NULL DEFAULT '',
quantity varchar(128) NOT NULL DEFAULT '',
unit_cost varchar(128) NOT NULL DEFAULT ''
) ENGINE=Columnstore DEFAULT CHARSET=utf8;
When MariaDB Enterprise ColumnStore performs a bulk data load, it appends data to the table in the order in which the data is read. Appending data reduces the I/O requirements of bulk data loads, so that larger data sets can be loaded very efficiently.
While the bulk load is in progress, the newly appended data is temporarily hidden from queries.
After the bulk load is complete, the newly appended data is visible to queries.
When MariaDB Enterprise ColumnStore performs a bulk data load, it appends data to the table in the order in which the data is read.
The order of data can have a significant effect on performance with Enterprise ColumnStore, so it can be helpful to sort the data in the input file prior to importing it.
For additional information, see "Load Ordered Data in Proper Order".
Before importing a file into MariaDB Enterprise ColumnStore, confirm that the field delimiter is not present in the data.
The field delimiter is determined by the columnstore_import_for_batchinsert_delimiter system variable. By default, Enterprise ColumnStore sets the field delimiter to the ASCII value 7, which corresponds to the BEL character.
To use another delimiter, you can set the field delimiter.
If you are loading a file located on the client, you can use the LOAD DATA LOCAL INFILE statement. However, for this statement to work, the client must explicitly enable usage of a local infile.
If you are using MariaDB Client, the --local-infile option can be used:
$ mariadb --host 192.168.0.1 \
--user db_user --password \
--default-character-set=utf8 \
--local-infile
If you are using MariaDB Connector/C, the MYSQL_OPT_LOCAL_INFILE option can be set with the mysql_optionsv() function:
/* enable local infile */
unsigned int enable_local_infile = 1;
mysql_optionsv(mysql, MYSQL_OPT_LOCAL_INFILE, (void *) &enable_local_infile);
If you are using MariaDB Connector/J, the allowLocalInfile parameter can be set for the connection:
Connection connection = DriverManager.getConnection("jdbc:mariadb://192.168.0.1/test?user=test_user&password=myPassword&allowLocalInfile=true");
If you are using MariaDB Connector/Node.js, the permitLocalInfile parameter can be set for the connection:
mariadb.createConnection({
host: '192.168.0.1',
user:'test_user',
password: 'myPassword',
permitLocalInfile: 'true'
});
If you are using MariaDB Connector/Python, the local_infile parameter can be set for the connection:
import mariadb

# Enable LOAD DATA LOCAL INFILE for this connection
conn = mariadb.connect(
    user="test_user",
    password="myPassword",
    host="192.168.0.1",
    port=3306,
    local_infile=True)
Set the Field Delimiter
The default field delimiter for the LOAD DATA INFILE statement is a tab.
If your data file uses a different field delimiter, you can specify the field delimiter with the FIELDS TERMINATED BY clause.
To load a TSV (tab-separated values) file:
LOAD DATA INFILE 'inventory-products.tsv'
INTO TABLE inventory.products;
To load a CSV (comma-separated values) file:
LOAD DATA LOCAL INFILE 'inventory-products.csv'
INTO TABLE accounts.contacts
FIELDS TERMINATED BY ',';
By default, the LOAD DATA INFILE statement does not expect fields to be quoted.
If your data file uses quotes around fields, you can specify the quote character with the FIELDS [OPTIONALLY] ENCLOSED BY clause.
To load a TSV (tab-separated values) file that uses double quotes:
LOAD DATA INFILE 'inventory-products.tsv'
INTO TABLE inventory.products
FIELDS ENCLOSED BY '"';
To load a CSV (comma-separated values) file that uses optional single quotes:
LOAD DATA LOCAL INFILE 'inventory-products.csv'
INTO TABLE accounts.contacts
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\'';
MariaDB Enterprise ColumnStore ignores the ON DUPLICATE KEY clause.
Please ensure that duplicate data is removed prior to performing a bulk data load.
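One possible approach, shown here only as a hedged sketch, is to de-duplicate while loading from another table (staging.products is a hypothetical staging table, not part of the examples above):
# Load only distinct rows from a hypothetical staging table
INSERT INTO inventory.products
SELECT DISTINCT *
FROM staging.products;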
This page is: Copyright © 2025 MariaDB. All rights reserved.
MariaDB Enterprise ColumnStore includes a stored procedure called columnstore_info.load_from_s3, which can load data from a plain-text file containing delimiter-separated values (such as CSV or TSV) stored on S3-compatible cloud object storage.
The columnstore_info.load_from_s3 stored procedure is available starting with MariaDB Enterprise ColumnStore 23.02.
Before you import data with the columnstore_info.load_from_s3 stored procedure, the authentication credentials and the region can be configured using system variables:
columnstore_s3_key
columnstore_s3_secret
columnstore_s3_region
For example, the following statements show how to set the system variables for your current session:
SET columnstore_s3_key='S3_KEY';
SET columnstore_s3_secret='S3_SECRET';
SET columnstore_s3_region='S3_REGION';
To import data with the columnstore_info.load_from_s3 stored procedure, use the CALL statement:
CALL columnstore_info.load_from_s3('BUCKET_URL',
'FILE_NAME',
'DATABASE_NAME',
'TABLE_NAME',
'TERMINATED_BY',
'ENCLOSED_BY',
'ESCAPED_BY');
Replace 'BUCKET_URL' with the URL of your bucket. The protocol in the URL must be s3: for AWS S3 or gs: for Google Cloud Storage.
Replace 'FILE_NAME' with the file name to load from. The file must be a plain-text file containing delimiter-separated values, such as a comma-separated values (CSV) or tab-separated values (TSV) file. The supported file format is similar to the plain-text file formats supported by cpimport and LOAD DATA [LOCAL] INFILE. Please note that this stored procedure can't load dump files created by mariadb-dump.
Replace 'DATABASE_NAME' with the database to import into.
Replace 'TABLE_NAME' with the table name to import into.
Replace 'TERMINATED_BY' with the field terminator used in the file, similar to the -s command-line option for cpimport.
Replace 'ENCLOSED_BY' with the quotes used in the file, similar to the -E command-line option for cpimport.
Replace 'ESCAPED_BY' with the escape character used in the file, similar to the -C command-line option for cpimport.
All parameters are mandatory.
For example, to load a comma-separated values (CSV) file from AWS S3:
CALL columnstore_info.load_from_s3('s3://mariadb-columnstore-test-data/',
'test-data-db1-tab1.csv',
'db1',
'tab1',
',',
'"',
'\\');
When the stored procedure completes, it returns JSON containing the status of the operation. If the JSON shows an error or "success": false, check your table to see if some or all of your data was loaded, because many errors are non-fatal.
When the data file is stored in Amazon S3, the AWS user only requires the s3:GetObject action on the bucket.
For example, the AWS user can use a user policy like the following:
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "readBucket",
"Effect": "Allow",
"Action": [
"s3:GetObject"
],
"Resource": [
"arn:aws:s3:::my-bucket",
"arn:aws:s3:::my-bucket/*"
]
}
]
}
This page is: Copyright © 2025 MariaDB. All rights reserved.
MariaDB ColumnStore tutorials show how to set up, load data, and run fast SQL queries using columnar storage, often with Docker.
The MariaDB ColumnStore Reference is a key guide to its architecture, setup, SQL, and analytics use.
The COMMIT statement makes changes to a table permanent. You should only commit changes after you have verified the integrity of the changed data. Once data is committed, it cannot be undone with the ROLLBACK statement. To return the database to its former state, you must restore the data from backups.
This page is licensed: CC BY-SA / Gnu FDL
The DROP PROCEDURE statement deletes a stored procedure from ColumnStore.
The following statement drops the sp_complex_variable procedure:
DROP PROCEDURE sp_complex_variable;
This page is licensed: CC BY-SA / Gnu FDL
The RENAME TABLE statement renames one or more ColumnStore tables.
Notes:
You cannot currently use RENAME TABLE to move a table from one database to another.
See the ALTER TABLE syntax for an alternative to RENAME TABLE.
The following statement renames the orders table:
RENAME TABLE orders TO customer_order;
The following statement renames both the orders table and customer table:
RENAME TABLE orders TO customer_orders, customer TO customers;
You may also use RENAME TABLE to swap tables. This example swaps the customer and vendor tables (assuming the temp_table does not already exist):
RENAME TABLE customer TO temp_table, vendor TO customer, temp_table TO vendor;
This page is licensed: CC BY-SA / Gnu FDL
The ROLLBACK statement undoes transactions that have not been permanently saved to the database with the COMMIT statement. You cannot roll back changes to table properties, including ALTER, CREATE, or DROP TABLE statements.
This page is licensed: CC BY-SA / Gnu FDL
MariaDB ColumnStore has the ability to compress data and this is controlled through a compression mode. This compression mode may be set as a default for the instance or set at the session level.
To set the compression mode at the session level, the following command is used. Once the session has ended, any subsequent session will return to the default for the instance.
set infinidb_compression_type = n
where n is:
compression is turned off. Any subsequent table create statements run will have compression turned off for that table unless any statement overrides have been performed. Any alter statements run to add a column will have compression turned off for that column unless any statement override has been performed.
compression is turned on. Any subsequent table create statements run will have compression turned on for that table unless any statement overrides have been performed. Any alter statements run to add a column will have compression turned on for that column unless any statement override has been performed. ColumnStore uses snappy compression in this mode.
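As a brief sketch of session-level usage (the value 2 is an assumption for selecting snappy compression; check the values supported by your ColumnStore release):
# Check the compression mode currently in effect for this session
SHOW SESSION VARIABLES LIKE 'infinidb_compression_type';
# Turn compression on for subsequently created tables/columns
# (assumption: 2 selects snappy compression on this version)
set infinidb_compression_type = 2;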
This page is licensed: CC BY-SA / Gnu FDL
A condition is a combination of expressions and operators that return TRUE, FALSE, or NULL. The following syntax shows the conditions that can be used to return a TRUE, FALSE, or NULL condition.
filter:
column| literal| function [=|!=|<>|<|<=|>=|>] column| literal| function | select_statement
column| function [NOT] IN (select_statement | literal, literal,...)
column| function [NOT] BETWEEN (select_statement | literal, literal,...)
column| function IS [NOT] NULL
string_column|string_function [NOT] LIKE pattern
EXISTS (select_statement)
NOT (filter)
(filter|function) [AND|OR] (filter|function)
ColumnStore, unlike the MyISAM engine, is case sensitive for string comparisons used in filters. For the most accurate results, and to avoid confusing results, make sure string filter constants are no longer than the column width itself.
Pattern matching as described with the LIKE condition allows you to use "_" to match any single character and "%" to match an arbitrary number of characters (including zero characters). To test for literal instances of a wildcard character ("%" or "_"), precede it with the "\" character.
OR Processing has the following restrictions:
Only column comparisons against a literal are allowed in conjunction with an OR. The following query would be allowed since all comparisons are against literals:
SELECT count(*) FROM lineitem WHERE l_partkey < 100 OR l_linestatus = 'F';
ColumnStore binds ANDs more tightly than ORs, just like any other SQL parser, so you must enclose OR relations in parentheses:
SELECT count(*) FROM orders, lineitem
WHERE (lineitem.l_orderkey < 100 OR lineitem.l_linenumber > 10)
AND lineitem.l_orderkey = orders.o_orderkey;
The following syntax shows the conditions you can use when executing a condition against two columns. Note that the columns must be from the same table.
col_name_1 [=|!=|<>|<|<=|>=|>] col_name_2
The following syntax shows the conditions you can use when executing a join on two tables.
join_condition [AND join_condition]
join_condition:
[col_name_1|function_name_1] = [col_name_2|function_name_2]
Notes:
ColumnStore tables can be joined with non-ColumnStore tables in table mode only. See Operating Mode for information.
ColumnStore will require a join in the WHERE clause for each set of tables in the FROM clause. No cartesian product queries will be allowed.
ColumnStore requires that joins must be on the same datatype. In addition, numeric datatypes (INT variations, NUMERIC, DECIMAL) may be mixed in the join if they have the same scale.
Circular joins are not supported in ColumnStore. See the Troubleshooting section.
When the join memory limit is exceeded, a disk-based join will be used for processing if this option has been enabled.
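For illustration, a sketch of a two-table join with an explicit join predicate in the WHERE clause, using the orders and lineitem tables from the example above:
SELECT o.o_orderkey, count(*) AS line_count
FROM orders o, lineitem l
WHERE o.o_orderkey = l.l_orderkey
GROUP BY o.o_orderkey;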
This page is licensed: CC BY-SA / Gnu FDL
ColumnStore supports the following data types:
| Datatypes | Column Size | Description |
| --- | --- | --- |
| TINYINT | 1-byte | A very small integer. Numeric value with scale 0. Signed: -126 to +127. Unsigned: 0 to 253. |
| MEDIUMINT | 3-bytes | A medium integer. Signed: -8388608 to 8388607. Unsigned: 0 to 16777215. Supported starting with MariaDB ColumnStore 1.4.2. |
| INT / INTEGER | 4-bytes | A normal-size integer. Numeric value with scale 0. Signed: -2,147,483,646 to 2,147,483,647. Unsigned: 0 to 4,294,967,293. |
| BIGINT | 8-bytes | A large integer. Numeric value with scale 0. Signed: -9,223,372,036,854,775,806 to +9,223,372,036,854,775,807. Unsigned: 0 to +18,446,744,073,709,551,613. |
| DECIMAL / NUMERIC | 2, 4, or 8 bytes | A packed fixed-point number that can have a specific total number of digits and a set number of digits after the decimal. The maximum precision (total number of digits) that can be specified is 18. |
| FLOAT | 4 bytes | Stored in 32-bit IEEE-754 floating point format. As such, the number of significant digits is about 6 and the range of values is approximately +/- 1e38. The MySQL extension to specify precision and scale is not supported. |
| DOUBLE / REAL | 8 bytes | Stored in 64-bit IEEE-754 floating point format. As such, the number of significant digits is about 15 and the range of values is approximately +/- 1e308. The MySQL extension to specify precision and scale is not supported. "REAL" is a synonym for "DOUBLE". |
| Datatypes | Column Size | Description |
| --- | --- | --- |
| CHAR | 1, 2, 4, or 8 bytes | Holds letters and special characters of fixed length. Max length is 255. Default and minimum size is 1 byte. |
| VARCHAR | 1, 2, 4, or 8 bytes or 8-byte token | Holds letters, numbers, and special characters of variable length. Max length = 8000 bytes or characters and minimum length = 1 byte or character. |
| TINYTEXT | 255 bytes | Holds a small amount of letters, numbers, and special characters of variable length. Supported from version 1.1.0 onwards. |
| TINYBLOB | 255 bytes | Holds a small amount of binary data of variable length. Supported from version 1.1.0 onwards. |
| TEXT | 64 KB | Holds letters, numbers, and special characters of variable length. Supported from version 1.1.0 onwards. |
| MEDIUMTEXT | 16 MB | Holds a medium amount of letters, numbers, and special characters of variable length. Supported from version 1.1.0 onwards. |
| MEDIUMBLOB | 16 MB | Holds a medium amount of binary data of variable length. Supported from version 1.1.0 onwards. |
| LONGTEXT | 1.96 GB | Holds a large amount of letters, numbers, and special characters of variable length. Supported from version 1.1.0 onwards. |
| LONGBLOB | 1.96 GB | Holds a large amount of binary data of variable length. Supported from version 1.1.0 onwards. |
| Datatypes | Column Size | Description |
| --- | --- | --- |
| DATE | 4-bytes | Date has year, month, and day. The internal representation of a date is a string of 4 bytes. The first 2 bytes represent the year, .5 bytes the month, and .75 bytes the day in the following format: YYYY-MM-DD. Supported range is 1000-01-01 to 9999-12-31. |
| DATETIME | 8-bytes | A date and time combination. Supported range is 1000-01-01 00:00:00 to 9999-12-31 23:59:59. From version 1.2.0 microseconds are also supported. |
| TIME | 8-bytes | Holds hour, minute, second and optionally microseconds for time. Supported range is '-838:59:59.999999' to '838:59:59.999999'. Supported from version 1.2.0 onwards. |
| TIMESTAMP | 4-bytes | Values are stored as the number of seconds since 1970-01-01 00:00:00 UTC, and optionally microseconds. The max value is currently 2038-01-19 03:14:07 UTC. Supported starting with MariaDB ColumnStore 1.4.2. |
ColumnStore treats a zero-length string as a NULL value.
As with core MariaDB, ColumnStore employs “saturation semantics” on integer values. This means that if a value is inserted into an integer field that is too big/small for it to hold (i.e. it is more negative or more positive than the values indicated above), ColumnStore will “saturate” that value to the min/max value indicated above as appropriate. For example, for a SMALLINT column, if 32800 is attempted, the actual value inserted will be 32767.
ColumnStore's largest negative and positive numbers are 2 less than what MariaDB supports. ColumnStore reserves these values for its internal use and they cannot be used. For example, if there is a need to store -128 in a column, the TINYINT datatype cannot be used; the SMALLINT datatype must be used instead. If the value -128 is inserted into a TINYINT column, ColumnStore will saturate it to -126 (and issue a warning).
ColumnStore truncates rather than rounds decimal constants that have too many digits after the decimal point during bulk load and when running SELECT statements. For INSERT and UPDATE, however, the MariaDB parser will round such constants. You should verify that ETL tools used and any INSERT/UPDATE statements only specify the correct number of decimal digits to avoid potential confusion.
An optional display width may be added to the BIGINT, INTEGER/INT, SMALLINT & TINYINT columns. As with core MariaDB tables, this value does not affect the internal storage requirements of the column nor does it affect the valid value ranges.
All columns in ColumnStore are nullable and the default value for any column is NULL. You may optionally specify NOT NULL for any column and/or one with a DEFAULT value.
Unlike other MariaDB storage engines, the actual storage limit for LONGBLOB/LONGTEXT is 2,100,000,000 bytes instead of 4GB per entry. MariaDB's client API is limited to a row length of 1GB.
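As a small sketch of these rules (the table and column names are illustrative only), a ColumnStore table can mix nullable columns with NOT NULL columns that carry defaults:
CREATE TABLE measurements (
sensor_id INT NOT NULL DEFAULT 0,
reading DOUBLE, # nullable; defaults to NULL
taken_at DATETIME NOT NULL DEFAULT '1000-01-01 00:00:00'
) ENGINE=ColumnStore;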
This page is licensed: CC BY-SA / Gnu FDL
MariaDB ColumnStore has the ability to change intermediate decimal mathematical results from decimal type to double. The decimal type has approximately 17-18 digits of precision but a smaller maximum range. Whereas the double type has approximately 15-16 digits of precision, but a much larger maximum range. In typical mathematical and scientific applications, the ability to avoid overflow in intermediate results with double math is likely more beneficial than the additional two digits of precision. In banking applications, however, it may be more appropriate to leave in the default decimal setting to ensure accuracy to the least significant digit.
The columnstore_double_for_decimal_math variable is used to control the data type for intermediate decimal results. This decimal for double math may be set as a default for the instance, set at the session level, or at the statement level by toggling this variable on and off.
To enable/disable the use of the decimal to double math at the session level, the following command is used. Once the session has ended, any subsequent session will return to the default for the instance.
set columnstore_double_for_decimal_math = n
where n is:
off (disabled, default)
on (enabled)
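For example (a session-level sketch using the variable named above):
# Use double math for intermediate decimal results in this session
set columnstore_double_for_decimal_math = on;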
ColumnStore has the ability to support varied internal precision on decimal calculations. columnstore_decimal_scale is used internally by the ColumnStore engine to control how many significant digits to the right of the decimal point are carried through in suboperations on calculated columns. If, while running a query, you receive the message 'aggregate overflow,' try reducing columnstore_decimal_scale and running the query again. Note that, as you decrease columnstore_decimal_scale, you may see reduced accuracy in the least significant digit(s) of a returned calculated column. columnstore_use_decimal_scale is used internally by the ColumnStore engine to turn the use of this internal precision on and off. These two system variables may be set as a default for the instance or set at the session level.
To enable/disable the use of the decimal scale at the session level, the following command is used. Once the session has ended, any subsequent session will return to the default for the instance.
set columnstore_use_decimal_scale = n
where n is off (disabled) or on (enabled).
To set the decimal scale at the session level, the following command is used. Once the session has ended, any subsequent session will return to the default for the instance.
set columnstore_decimal_scale = n
where n is the amount of precision desired for calculations.
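For instance, a sketch combining both variables described above (the scale value 8 is arbitrary):
# Enable reduced internal precision for calculated columns
set columnstore_use_decimal_scale = on;
# Carry 8 digits to the right of the decimal point in suboperations
set columnstore_decimal_scale = 8;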
This page is licensed: CC BY-SA / Gnu FDL
MariaDB ColumnStore supports the following aggregate functions, these can be used in the SELECT, HAVING, and ORDER BY clauses of the SQL statement.
| Function | Description |
| --- | --- |
| AVG([DISTINCT] column) | Average value of a numeric (INT variations, NUMERIC, DECIMAL) column |
| CORR(ColumnY, ColumnX) | The correlation coefficient for non-null pairs in a group |
| COUNT(*, [DISTINCT] column) | The number of rows returned by a query or grouping. All datatypes are supported |
| COVAR_POP(ColumnY, ColumnX) | The population covariance for non-null pairs in a group |
| COVAR_SAMP(ColumnY, ColumnX) | The sample covariance for non-null pairs in a group |
| MAX([DISTINCT] column) | The maximum value of a column. All datatypes are supported |
| MIN([DISTINCT] column) | The minimum value of a column. All datatypes are supported |
| REGR_AVGX(ColumnY, ColumnX) | Average of the independent variable (sum(ColumnX)/N), where N is the number of rows processed by the query |
| REGR_AVGY(ColumnY, ColumnX) | Average of the dependent variable (sum(ColumnY)/N), where N is the number of rows processed by the query |
| REGR_COUNT(ColumnY, ColumnX) | The total number of input rows in which both column Y and column X are nonnull |
| REGR_INTERCEPT(ColumnY, ColumnX) | The y-intercept of the least-squares-fit linear equation determined by the (ColumnX, ColumnY) pairs |
| REGR_R2(ColumnY, ColumnX) | Square of the correlation coefficient for the least-squares-fit linear equation determined by the (ColumnX, ColumnY) pairs |
| REGR_SLOPE(ColumnY, ColumnX) | The slope of the least-squares-fit linear equation determined by the (ColumnX, ColumnY) pairs |
| REGR_SXX(ColumnY, ColumnX) | REGR_COUNT(y, x) * VAR_POP(x) for non-null pairs |
| REGR_SXY(ColumnY, ColumnX) | REGR_COUNT(y, x) * COVAR_POP(y, x) for non-null pairs |
| REGR_SYY(ColumnY, ColumnX) | REGR_COUNT(y, x) * VAR_POP(y) for non-null pairs |
| STD(), STDDEV(), STDDEV_POP() | The population standard deviation of a numeric (INT variations, NUMERIC, DECIMAL) column |
| STDDEV_SAMP() | The sample standard deviation of a numeric (INT variations, NUMERIC, DECIMAL) column |
| SUM([DISTINCT] column) | The sum of a numeric (INT variations, NUMERIC, DECIMAL) column |
| VARIANCE(), VAR_POP() | The population variance of a numeric (INT variations, NUMERIC, DECIMAL) column |
| VAR_SAMP() | The sample variance of a numeric (INT variations, NUMERIC, DECIMAL) column |
Regression functions (REGR_AVGX to REGR_SYY), CORR, COVAR_POP and COVAR_SAMP are supported for version 1.2.0 and higher.
An example group by query using aggregate functions is:
select year(o_orderdate) order_year,
avg(o_totalprice) avg_totalprice,
max(o_totalprice) max_totalprice,
count(*) order_count
from orders
group by order_year
order by order_year;
This page is licensed: CC BY-SA / Gnu FDL
ColumnStore supports the following functions. These functions can be specified in the projection (SELECT), WHERE and ORDER BY portions of the SQL statement and will be processed in a distributed manner.
This page is licensed: CC BY-SA / Gnu FDL
MariaDB ColumnStore Information Functions are selectable pseudo functions that return MariaDB ColumnStore specific “meta” information to ensure queries can be locally directed to a specific node. These functions can be specified in the projection (SELECT), WHERE, GROUP BY, HAVING and ORDER BY portions of the SQL statement and will be processed in a distributed manner.
| Function | Description |
| --- | --- |
| idbBlockId(column) | The Logical Block Identifier (LBID) for the block containing the physical row |
| idbDBRoot(column) | The DBRoot where the physical row resides |
| idbExtentId(column) | The Logical Block Identifier (LBID) for the first block in the extent containing the physical row |
| idbExtentMax(column) | The max value from the extent map entry for the extent containing the physical row |
| idbExtentMin(column) | The min value from the extent map entry for the extent containing the physical row |
| idbExtentRelativeRid(column) | The row id (1 to 8,388,608) within the column's extent |
| idbLocalPm() | The PM from which the query was launched. This function will return NULL if the query is launched from a standalone UM |
| idbPartition(column) | The three part partition id (Directory.Segment.DBRoot) |
| idbPm(column) | The PM where the physical row resides |
| idbSegmentDir(column) | The lowest level directory id for the column file containing the physical row |
| idbSegment(column) | The number of the segment file containing the physical row |
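For example, a quick sketch showing how the rows of a table are distributed across PM nodes (the orders table from the earlier examples is assumed):
SELECT idbPm(o_orderkey) AS pm, count(*) AS row_count
FROM orders
GROUP BY pm;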
MariaDB ColumnStore has four information schema tables that expose information about the table and column storage. The tables were added in version 1.0.5 of ColumnStore and were heavily modified for 1.0.6.
The first table is the INFORMATION_SCHEMA.COLUMNSTORE_TABLES. It contains information about the tables inside ColumnStore.
The INFORMATION_SCHEMA.COLUMNSTORE_COLUMNS table contains information about every single column inside ColumnStore.
This table displays the extent map in a user-consumable form. An extent is a collection of details about a section of data related to a columnstore column. A majority of columns in ColumnStore will have multiple extents, and the columns table above can be joined to this one to filter results by table or column. The table layout is as follows:
| Column | Description |
| --- | --- |
| OBJECT_ID | The object ID for the extent |
| OBJECT_TYPE | Whether this is a "Column" or "Dictionary" extent |
| LOGICAL_BLOCK_START | ColumnStore's internal start LBID for this extent |
| LOGICAL_BLOCK_END | ColumnStore's internal end LBID for this extent |
| MIN_VALUE | The minimum value stored in this extent |
| MAX_VALUE | The maximum value stored in this extent |
| WIDTH | The data width for the extent |
| DBROOT | The DBRoot number for the extent |
| PARTITION_ID | The partition ID for the extent |
| SEGMENT_ID | The segment ID for the extent |
| BLOCK_OFFSET | The block offset for the data file; each data file can contain multiple extents for a column |
| MAX_BLOCKS | The maximum number of blocks for the extent |
| HIGH_WATER_MARK | The last block committed to the extent (starting at 0) |
| STATE | The state of the extent (see below) |
| STATUS | The availability status for the column, which is either "Available", "Unavailable" or "Out of service" |
| DATA_SIZE | The uncompressed data size for the extent, calculated as (HWM + 1) * BLOCK_SIZE |
Notes:
The state is "Valid" for a normal state, "Invalid" if a cpimport has completed but the table has not yet been accessed (min/max values will be invalid), or "Updating" if there is a DML statement writing to the column
In ColumnStore the block size is 8192 bytes
By default, ColumnStore will write and create an extent file of 256 * 1024 * WIDTH bytes for the first partition; if this is too small, then for uncompressed data, it will create a file of the maximum size for the extent (MAX_BLOCKS * BLOCK_SIZE). Snappy compression always adds a header block.
Object IDs of less than 3000 are for internal tables and will not appear in any of the information schema tables
HWM is set to zero for the lower segments when there are multiple segments in an extent file; these can be observed when BLOCK_OFFSET > 0
When HWM is 0, the DATA_SIZE will show 0 instead of 8192 to avoid confusion when there are multiple segments in an extent file
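As a sketch, the extents for a specific table can be listed by joining COLUMNSTORE_EXTENTS to COLUMNSTORE_COLUMNS on the object ID (the inventory.products table from the earlier examples is assumed, and the COLUMNSTORE_COLUMNS column names used here are assumptions about that table's layout):
SELECT c.column_name, e.min_value, e.max_value, e.dbroot, e.state
FROM information_schema.columnstore_extents e
JOIN information_schema.columnstore_columns c ON c.object_id = e.object_id
WHERE c.table_schema = 'inventory'
AND c.table_name = 'products';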
The columnstore_files table provides information about each file associated with extents. Each extent can reuse a file at different block offsets, so this is not a 1:1 relationship to the columnstore_extents table.
| Column | Description |
| --- | --- |
| OBJECT_ID | The object ID for the extent |
| SEGMENT_ID | The segment ID for the extent |
| PARTITION_ID | The partition ID for the extent |
| FILENAME | The full path and filename for the extent file; multiple extents for the same column can point to this file with different BLOCK_OFFSETs |
| FILE_SIZE | The disk file size for the extent |
| COMPRESSED_DATA_SIZE | The amount of the compressed file used; NULL if this is an uncompressed file |
The total_usage() procedure gives a total disk usage summary for all the columns in ColumnStore except the columns used for internal maintenance. It is executed using the following query:
> call columnstore_info.total_usage();
The table_usage() procedure gives the total data disk usage, dictionary disk usage, and grand total disk usage per table. It can be called in several ways; the first gives a total for each table:
> call columnstore_info.table_usage(NULL, NULL);
Or for a specific table, my_table in my_schema in this example:
> call columnstore_info.table_usage('my_schema', 'my_table');
You can also request all tables for a specified schema:
> call columnstore_info.table_usage('my_schema', NULL);
The compression_ratio() procedure calculates the average compression ratio across all the compressed extents in ColumnStore. It is called using
> call columnstore_info.compression_ratio();
This page is licensed: CC BY-SA / Gnu FDL
This lists the different naming conventions enforced by ColumnStore, compared to the normal MariaDB naming conventions.
User names: 64 characters (MariaDB has 80)
Table and column names are restricted to alphanumeric and underscore only, i.e "A-Z a-z 0-9 _".
The first character of all table and column names should be an ASCII letter (a-z A-Z).
ColumnStore reserves certain words that MariaDB does not, such as SELECT, CHAR and TABLE, so even wrapped in backticks these cannot be used.
In addition to MariaDB Server reserved words, ColumnStore has additional reserved words that cannot be used as table names, column names or user defined variables, functions or stored procedure names.
Keyword
ACTION
ADD
ALTER
AUTO_INCREMENT
BIGINT
BIT
CASCADE
CHANGE
CHARACTER
CHARSET
CHECK
CLOB
COLUMN
COLUMNS
COMMENT
CONSTRAINT
CONSTRAINTS
CREATE
CURRENT_USER
DATETIME
DEC
DECIMAL
DEFERRED
DEFAULT
DEFERRABLE
DOUBLE
DROP
ENGINE
EXISTS
FOREIGN
FULL
IDB_BLOB
IDB_CHAR
IDB_DELETE
IDB_FLOAT
IDB_INT
IF
IMMEDIATE
INDEX
INITIALLY
INTEGER
KEY
MATCH
MAX_ROWS
MIN_ROWS
MODIFY
NO
NOT
NULL_TOK
NUMBER
NUMERIC
ON
PARTIAL
PRECISION
PRIMARY
REAL
REFERENCES
RENAME
RESTRICT
SESSION_USER
SET
SMALLINT
SYSTEM_USER
TABLE
TIME
TINYINT
TO
TRUNCATE
UNIQUE
UNSIGNED
UPDATE
USER
VARBINARY
VARCHAR
VARYING
WITH
ZONE
This page is licensed: CC BY-SA / Gnu FDL
ColumnStore supports all MariaDB functions that can be used in a post-processing manner where data is returned by ColumnStore first and then MariaDB executes the function on the data returned. The functions are currently supported only in the projection (SELECT) and ORDER BY portions of the SQL statement.
This page is licensed: CC BY-SA / Gnu FDL
ColumnStore has the ability to support full MariaDB query syntax through an operating mode. This operating mode may be set as a default for the instance or set at the session level. To set the operating mode at the session level, the following command is used. Once the session has ended, any subsequent session will return to the default for the instance.
set infinidb_vtable_mode = n
where n is:
0: a generic, highly compatible row-by-row processing mode. Some WHERE clause components can be processed by ColumnStore, but joins are processed entirely by mysqld using a nested-loop join mechanism.
1 (the default): query syntax is evaluated by ColumnStore for compatibility with distributed execution, and incompatible queries are rejected. Queries executed in this mode take advantage of distributed execution and typically result in higher performance.
2: auto-switch mode. ColumnStore will attempt to process the query internally; if it cannot, it will automatically switch the query to run in row-by-row mode.
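For example (a session-level sketch; the value 2 is assumed to correspond to the auto-switch mode listed last above):
# Let ColumnStore fall back to row-by-row processing automatically when needed
set infinidb_vtable_mode = 2;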
This page is licensed: CC BY-SA / Gnu FDL
Starting with MariaDB ColumnStore 1.1, the ability to create and use user defined aggregate and window functions is supported in addition to scalar functions. With Columnstore 1.2, multiple parameters are supported. A C++ SDK is provided as well as 3 reference examples that provide additional functions that may be of general use:
median - mathematical median, equivalent to percentile_cont(0.5)
avg_mode - mathematical mode, i.e. the most frequent value in the set
ssq - sum of squares, i.e. the sum of each individual number squared in the set
Similar to built-in functions, the SDK supports distributed aggregate execution, whereby as much of the calculation as possible is scaled out across PM nodes and then collected and finalized on the UM node. Window functions (due to the ordering requirement) are only executed at the UM level.
The reference examples above are included in the standard build of MariaDB ColumnStore and so can be used by registering them as user defined aggregate functions. The same can be done for new functions assuming the instance has the updated libraries included. From a mcsmysql prompt:
CREATE AGGREGATE FUNCTION median returns REAL soname 'libudf_mysql.so';
CREATE AGGREGATE FUNCTION avg_mode returns REAL soname 'libudf_mysql.so';
CREATE AGGREGATE FUNCTION ssq returns REAL soname 'libudf_mysql.so';
After this these may be used in the same way as any other aggregate or window function like sum:
SELECT grade,
AVG(loan_amnt) avg,
MEDIAN(loan_amnt) median
FROM loanstats
GROUP BY grade
ORDER BY grade;
This requires a MariaDB ColumnStore source tree and the necessary tools to compile C/C++ code. The SDK and reference examples are available in the utils/udfsdk directory of the source tree, which also contains the SDK documentation.
The implementation of the median and avg_mode functions will scale in memory consumption to the size of the set of unique values in the aggregation.
This page is licensed: CC BY-SA / Gnu FDL
MariaDB provides extensibility support through user defined functions. For more details on the MariaDB server framework see the user-defined-functions article. This documentation applies to MariaDB ColumnStore version 1.0.10 and above.
MariaDB ColumnStore provides scale out query processing and as such requires a separate distributed implementation of each SQL function. This allows for the function application to happen on each PM server node providing distributed scale out performance.
Thus, fully implementing a user defined function for MariaDB ColumnStore requires implementing two different APIs:
The MariaDB server UDF API: This allows utilization on all storage engines and is the implementation used if applied in the select list.
The ColumnStore distributed UDF API: This enables distributed execution of where clause and group by usage of the function and will be pushed down to PM nodes for execution where possible.
MariaDB ColumnStore supports user defined function implementations in C/C++. User defined aggregate and window functions are not supported in ColumnStore 1.0.
The development kit can be found under the utils/udfsdk directory of the mariadb-columnstore-engine source tree. Developing a user defined function requires you to set up a development environment and be comfortable with C++ development. To set up a ColumnStore development environment, please follow the instructions on dependencies in the ColumnStore server fork repository.
Three main files will need to be modified in order to add a new UDF:
udfmysql.cpp : mariadb server UDF implementation
udfsdk.h : class headers.
udfsdk.cpp : distributed columnstore UDF implementation.
Two reference implementations are provided to provide guidance on creating your own functions:
MCS_IsNull : this illustrates a simple one argument function providing the ability to return a Boolean if the expression parameter is null
MCS_Add: this illustrates a simple two-argument function that adds two values and returns the sum.
It is simplest to copy these and adapt to your needs. There are no system dependencies on the included reference implementations so these can be removed to simplify the class files if preferred.
Three functions are required to be implemented (for more details see user-defined-functions):
x_init : perform any parameter validation or setup such as memory allocation.
x : perform the actual function implementation.
x_deinit : perform any clean up tasks such as deallocating memory where 'x' is the function name.
The function name and class must be registered in order to be recognized and used by the ColumnStore primitive processor. This is done by adding a line to perform the registration in the UDFSDK::UDFMap() function in the file udfsdk.cpp:
FuncMap UDFSDK::UDFMap() const
{
FuncMap fm;
// first: function name
// second: Function pointer
// please use lower case for the function name. Because the names might be
// case-insensitive in MariaDB depending on the setting. In such case,
// the function names passed to the interface is always in lower case.
fm["mcs_add"] = new MCS_add();
fm["mcs_isnull"] = new MCS_isnull();
return fm;
}
For any new user defined functions add a new entry into the FuncMap object mapping the name to the udf class.
The UDF class should be defined in file udfsdk.h and implemented in file udfsdk.cpp. It is easiest to adapt the example classes for new instance but each class must implement the funcexp::Func C++ class definition:
constructor: any initialization necessary
destructor: any de-initialization.
getOperationType: this performs parameter validation and returns the result data type.
getVal : computes and returns the value of the user defined function for each given return datatype.
The code changes can be built using make within the utils/udfsdk directory; this creates the following libraries, containing the compiled code, in the same directory:
libudf_mysql.so.1.0.0
libudfsdk.so.1.0.0
The two libraries created above must be deployed to the /usr/local/mariadb/columnstore/lib directory (or the equivalent lib directory in a non-root install), replacing the existing files. Symbolic links in the MariaDB server directory point to these and should be validated. Run the following as root from the utils/udfsdk directory in the build tree (specifying a password to restartSystem if needed for a multi-server cluster):
$ cp libudf_mysql.so.1.0.0 libudfsdk.so.1.0.0 /usr/local/mariadb/columnstore/lib/
$ ls -l /usr/local/mariadb/columnstore/mysql/lib/plugin/libudf_mysql.so
lrwxrwxrwx. 1 root root 56 Jul 19 09:47 /usr/local/mariadb/columnstore/mysql/lib/plugin/libudf_mysql.so -> /usr/local/mariadb/columnstore/lib/libudf_mysql.so.1.0.0
Repeat this for each ColumnStore UM and PM node in the cluster and then restart ColumnStore to make the libraries available.
After restarting the system the UDF must be registered with the MariaDB server to be usable:
$ mcsmysql
> create function mcs_add returns integer soname 'libudf_mysql.so';
The function mcs_add can then be used. To verify correct installation, check that it can be used both in the select list and in the where clause:
MariaDB [test]> create function mcs_add returns integer soname 'libudf_mysql.so';
Query OK, 0 rows affected (0.01 sec)
MariaDB [test]> create table t1(i1 int, i2 int) engine=columnstore;
Query OK, 0 rows affected (0.58 sec)
MariaDB [test]> insert into t1 values (1,1), (2,2);
Query OK, 2 rows affected (0.24 sec)
Records: 2 Duplicates: 0 Warnings: 0
MariaDB [test]> select i1, i2, mcs_add(i1,i2) sum from t1;
+------+------+------+
| i1 | i2 | sum |
+------+------+------+
| 1 | 1 | 2 |
| 2 | 2 | 4 |
+------+------+------+
2 rows in set (0.05 sec)
MariaDB [test]> select i1, i2 from t1 where mcs_add(i1,i2) = 4;
+------+------+
| i1 | i2 |
+------+------+
| 2 | 2 |
+------+------+
1 row in set (0.02 sec)
This page is licensed: CC BY-SA / Gnu FDL
MariaDB ColumnStore Utility Functions are a set of simple functions that return useful information about the system, such as whether it is ready for queries.
mcsSystemReady(): Returns 1 if the system can accept queries, 0 if it is not ready yet.
mcsSystemReadOnly(): Returns 1 if ColumnStore is in a write-suspended mode, that is, a user executed SuspendDatabaseWrites. Returns 2 if the system is in a read-only state; ColumnStore puts itself into a read-only state if it detects a logic error that may have corrupted data, generally meaning a ROLLBACK operation failed. Returns 0 if the system is writable.
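For example, both functions can be called from any SQL client in a single query (a minimal sketch; the column aliases are illustrative only):
SELECT mcsSystemReady() AS ready, mcsSystemReadOnly() AS read_only;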
This page is licensed: CC BY-SA / Gnu FDL
MariaDB ColumnStore provides support for window functions broadly following the SQL 2003 specification. A window function allows for calculations relating to a window of data surrounding the current row in a result set. This capability provides for simplified queries in support of common business questions such as cumulative totals, rolling averages, and top 10 lists.
Aggregate functions can be used as window functions; however, they differ in behavior from a GROUP BY query because the rows remain ungrouped. This provides support for cumulative sums and rolling averages, for example.
Two key concepts for window functions are Partition and Frame:
A Partition is a group of rows, or window, that have the same value for a specific column, for example a Partition can be created over a time period such as a quarter or lookup values.
The Frame for each row is a subset of the row's Partition. The frame typically is dynamic allowing for a sliding frame of rows within the Partition. The Frame determines the range of rows for the windowing function. A Frame could be defined as the last X rows and next Y rows all the way up to the entire Partition.
Window functions are applied after joins, group by, and having clauses are calculated.
A window function is applied in the select clause using the following syntax:
function_name ([expression [, expression ... ]]) OVER ( window_definition )
where window_definition is defined as:
[ PARTITION BY expression [, ...] ]
[ ORDER BY expression [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...] ]
[ frame_clause ]
PARTITION BY:
Divides the window result set into groups based on one or more expressions.
An expression may be a constant, a column, or a non-window-function expression.
A query is not limited to a single partition by clause. Different partition clauses can be used across different window function applications.
The partition by columns do not need to be in the select list but do need to be available from the query result set.
If there is no PARTITION BY clause, all rows of the result set define the group.
ORDER BY
Defines the ordering of values within the partition.
Can be ordered by multiple keys which may be a constant, column or non window function expression.
The order by columns do not need to be in the select list but need to be available from the query result set.
Use of a select column alias from the query is not supported.
ASC (default) and DESC options allow for ordering ascending or descending.
NULLS FIRST and NULLS LAST options specify whether null values come first or last in the ordering sequence. NULLS FIRST is the default for ASC order, and NULLS LAST is the default for DESC order.
and the optional frame_clause is defined as:
{ RANGE | ROWS } frame_start
{ RANGE | ROWS } BETWEEN frame_start AND frame_end
and the optional frame_start and frame_end are defined as (value being a numeric expression):
UNBOUNDED PRECEDING
value PRECEDING
CURRENT ROW
value FOLLOWING
UNBOUNDED FOLLOWING
RANGE/ROWS:
Defines the windowing clause, that is, the set of rows used to calculate a given row's window function result.
Requires an ORDER BY clause to define the row order for the window.
ROWS specifies the window in physical units, i.e. result-set rows, and must be a constant or expression evaluating to a positive numeric value.
RANGE specifies the window as a logical offset. If the expression evaluates to a numeric value then the ORDER BY expression must be a numeric or DATE type. If the expression evaluates to an interval value then the ORDER BY expression must be a DATE data type.
UNBOUNDED PRECEDING indicates the window starts at the first row of the partition.
UNBOUNDED FOLLOWING indicates the window ends at the last row of the partition.
CURRENT ROW specifies that the window starts or ends at the current row or value.
If omitted, the default is ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.
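For illustration, the two window expressions below should be equivalent, since the second merely spells out that default frame (a sketch against a hypothetical sales table with saleDate and amount columns):
SELECT saleDate,
sum(amount) OVER (ORDER BY saleDate) runningTotalDefault,
sum(amount) OVER (ORDER BY saleDate ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) runningTotalExplicit
FROM sales;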
The following window functions are supported:
AVG(): The average of all input values.
COUNT(): Number of input rows.
CUME_DIST(): Calculates the cumulative distribution, or relative rank, of the current row relative to other rows in the same partition (number of peer or preceding rows / number of rows in the partition).
DENSE_RANK(): Ranks items in a group, leaving no gaps in the ranking sequence when there are ties.
FIRST_VALUE(): The value evaluated at the row that is the first row of the window frame (counting from 1); null if there is no such row.
LAG(): The value evaluated at the row that is offset rows before the current row within the partition; if there is no such row, returns default instead. Both offset and default are evaluated with respect to the current row. If omitted, offset defaults to 1 and default to null. LAG provides access to more than one row of a table at the same time without a self-join: given a series of rows returned from a query and a position of the cursor, LAG provides access to a row at a given physical offset prior to that position.
LAST_VALUE(): The value evaluated at the row that is the last row of the window frame (counting from 1); null if there is no such row.
LEAD(): Provides access to a row at a given physical offset beyond the current position. Returns the value evaluated at the row that is offset rows after the current row within the partition; if there is no such row, returns default instead. Both offset and default are evaluated with respect to the current row. If omitted, offset defaults to 1 and default to null.
MAX(): Maximum value of the expression across all input values.
MEDIAN(): An inverse distribution function that assumes a continuous distribution model. It takes a numeric or datetime value and returns the middle value, or an interpolated value that would be the middle value once the values are sorted. Nulls are ignored in the calculation. Not available in MariaDB ColumnStore 1.1.
MIN(): Minimum value of the expression across all input values.
NTH_VALUE(): The value evaluated at the row that is the nth row of the window frame (counting from 1); null if there is no such row.
NTILE(): Divides an ordered data set into the number of buckets indicated by expr and assigns the appropriate bucket number to each row. The buckets are numbered 1 through expr; the expr value must resolve to a positive constant for each partition. Returns an integer ranging from 1 to the argument value, dividing the partition as equally as possible.
PERCENT_RANK(): Relative rank of the current row: (rank - 1) / (total rows - 1).
PERCENTILE_CONT(): An inverse distribution function that assumes a continuous distribution model. It takes a percentile value and a sort specification, and returns an interpolated value that would fall into that percentile value with respect to the sort specification. Nulls are ignored in the calculation. Not available in MariaDB ColumnStore 1.1.
PERCENTILE_DISC(): An inverse distribution function that assumes a discrete distribution model. It takes a percentile value and a sort specification and returns an element from the set. Nulls are ignored in the calculation. Not available in MariaDB ColumnStore 1.1.
RANK(): Rank of the current row with gaps; same as the ROW_NUMBER of its first peer.
REGR_COUNT(ColumnY, ColumnX): The total number of input rows in which both column Y and column X are non-null.
REGR_SLOPE(ColumnY, ColumnX): The slope of the least-squares-fit linear equation determined by the (ColumnX, ColumnY) pairs.
REGR_INTERCEPT(ColumnY, ColumnX): The y-intercept of the least-squares-fit linear equation determined by the (ColumnX, ColumnY) pairs.
REGR_R2(ColumnY, ColumnX): Square of the correlation coefficient of the (ColumnX, ColumnY) pairs for the linear model.
REGR_SXX(ColumnY, ColumnX): REGR_COUNT(y, x) * VAR_POP(x) for non-null pairs.
REGR_SXY(ColumnY, ColumnX): REGR_COUNT(y, x) * COVAR_POP(y, x) for non-null pairs.
REGR_SYY(ColumnY, ColumnX): REGR_COUNT(y, x) * VAR_POP(y) for non-null pairs.
ROW_NUMBER(): Number of the current row within its partition, counting from 1.
STDDEV(), STDDEV_POP(): Computes the population standard deviation and returns the square root of the population variance.
STDDEV_SAMP(): Computes the cumulative sample standard deviation and returns the square root of the sample variance.
SUM(): Sum of the expression across all input values.
VARIANCE(), VAR_POP(): Population variance of the input values (square of the population standard deviation).
VAR_SAMP(): Sample variance of the input values (square of the sample standard deviation).
The examples are all based on the following simplified sales opportunity table:
create table opportunities (
id int,
accountName varchar(20),
name varchar(128),
owner varchar(7),
amount decimal(10,2),
closeDate date,
stageName varchar(11)
) engine=columnstore;
Some example values are (thanks to www.mockaroo.com for sample data generation):
id | accountName | name | owner | amount | closeDate | stageName
1 | Browseblab | Multi-lateral executive function | Bob | 26444.86 | 2016-10-20 | Negotiating
2 | Mita | Organic demand-driven benchmark | Maria | 477878.41 | 2016-11-28 | ClosedWon
3 | Miboo | De-engineered hybrid groupware | Olivier | 80181.78 | 2017-01-05 | ClosedWon
4 | Youbridge | Enterprise-wide bottom-line Graphic Interface | Chris | 946245.29 | 2016-07-02 | ClosedWon
5 | Skyba | Reverse-engineered fresh-thinking standardization | Maria | 696241.82 | 2017-02-17 | Negotiating
6 | Eayo | Fundamental well-modulated artificial intelligence | Bob | 765605.52 | 2016-08-27 | Prospecting
7 | Yotz | Extended secondary infrastructure | Chris | 319624.20 | 2017-01-06 | ClosedLost
8 | Oloo | Configurable web-enabled data-warehouse | Chris | 321016.26 | 2017-03-08 | ClosedLost
9 | Kaymbo | Multi-lateral web-enabled definition | Bob | 690881.01 | 2017-01-02 | Developing
10 | Rhyloo | Public-key coherent infrastructure | Chris | 965477.74 | 2016-11-07 | Prospecting
The schema, sample data, and queries are available as an attachment to this article.
Window functions can be used to achieve cumulative / running calculations on a detail report. In this case a won opportunity report for a 7 day period adds columns to show the accumulated won amount as well as the current highest opportunity amount in preceding rows.
select owner,
accountName,
CloseDate,
amount,
sum(amount) over (order by CloseDate rows between unbounded preceding and current row) cumeWon,
max(amount) over (order by CloseDate rows between unbounded preceding and current row) runningMax
from opportunities
where stageName='ClosedWon'
and closeDate >= '2016-10-02' and closeDate <= '2016-10-09'
order by CloseDate;
with example results:
owner | accountName | CloseDate | amount | cumeWon | runningMax
Bill | Babbleopia | 2016-10-02 | 437636.47 | 437636.47 | 437636.47
Bill | Thoughtworks | 2016-10-04 | 146086.51 | 583722.98 | 437636.47
Olivier | Devpulse | 2016-10-05 | 834235.93 | 1417958.91 | 834235.93
Chris | Linkbridge | 2016-10-07 | 539977.45 | 2458738.65 | 834235.93
Olivier | Trupe | 2016-10-07 | 500802.29 | 1918761.20 | 834235.93
Bill | Latz | 2016-10-08 | 857254.87 | 3315993.52 | 857254.87
Chris | Avamm | 2016-10-09 | 699566.86 | 4015560.38 | 857254.87
The above example can be partitioned, so that the window functions are applied over a particular field grouping, such as owner, and accumulate within that grouping. This is achieved by adding a "partition by" clause to the window function definition.
select owner,
accountName,
CloseDate,
amount,
sum(amount) over (partition by owner order by CloseDate rows between unbounded preceding and current row) cumeWon,
max(amount) over (partition by owner order by CloseDate rows between unbounded preceding and current row) runningMax
from opportunities
where stageName='ClosedWon'
and closeDate >= '2016-10-02' and closeDate <= '2016-10-09'
order by owner, CloseDate;
with example results:
owner | accountName | CloseDate | amount | cumeWon | runningMax
Bill | Babbleopia | 2016-10-02 | 437636.47 | 437636.47 | 437636.47
Bill | Thoughtworks | 2016-10-04 | 146086.51 | 583722.98 | 437636.47
Bill | Latz | 2016-10-08 | 857254.87 | 1440977.85 | 857254.87
Chris | Linkbridge | 2016-10-07 | 539977.45 | 539977.45 | 539977.45
Chris | Avamm | 2016-10-09 | 699566.86 | 1239544.31 | 699566.86
Olivier | Devpulse | 2016-10-05 | 834235.93 | 834235.93 | 834235.93
Olivier | Trupe | 2016-10-07 | 500802.29 | 1335038.22 | 834235.93
The rank window function allows for ranking, or assigning a numeric order value, based on the window function definition. The Rank() function assigns the same value to ties (equal values) and skips the next rank value. The Dense_Rank() function behaves similarly, except that the next consecutive number is used after a tie rather than skipped. The Row_Number() function provides a unique ordering value. The example query shows the Rank() function being applied to rank sales reps by the number of won opportunities for Q4 2016.
select owner,
wonCount,
rank() over (order by wonCount desc) rank
from (
select owner,
count(*) wonCount
from opportunities
where stageName='ClosedWon'
and closeDate >= '2016-10-01' and closeDate < '2016-12-31'
group by owner
) t
order by rank;
with example results (note: the query is technically incorrect in using closeDate < '2016-12-31'; however, this creates a tie scenario for illustrative purposes):
owner | wonCount | rank
Bill | 19 | 1
Chris | 15 | 2
Maria | 14 | 3
Bob | 14 | 3
Olivier | 10 | 5
If the dense_rank function is used the rank values would be 1,2,3,3,4 and for the row_number function the values would be 1,2,3,4,5.
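To see all three numbering behaviors side by side, the ranking query above can be extended as follows (a sketch; the extra column aliases are illustrative):
select owner,
wonCount,
rank() over (order by wonCount desc) rank,
dense_rank() over (order by wonCount desc) denseRank,
row_number() over (order by wonCount desc) rowNum
from (
select owner,
count(*) wonCount
from opportunities
where stageName='ClosedWon'
and closeDate >= '2016-10-01' and closeDate < '2016-12-31'
group by owner
) t
order by rank;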
The first_value and last_value functions allow determining the first and last values of a given range. Combined with a group by this allows summarizing opening and closing values. The example shows a more complex case where detailed information is presented for first and last opportunity by quarter.
select a.year,
a.quarter,
f.accountName firstAccountName,
f.owner firstOwner,
f.amount firstAmount,
l.accountName lastAccountName,
l.owner lastOwner,
l.amount lastAmount
from (
select year,
quarter,
min(firstId) firstId,
min(lastId) lastId
from (
select year(closeDate) year,
quarter(closeDate) quarter,
first_value(id) over (partition by year(closeDate), quarter(closeDate) order by closeDate rows between unbounded preceding and current row) firstId,
last_value(id) over (partition by year(closeDate), quarter(closeDate) order by closeDate rows between current row and unbounded following) lastId
from opportunities where stageName='ClosedWon'
) t
group by year, quarter order by year,quarter
) a
join opportunities f on a.firstId = f.id
join opportunities l on a.lastId = l.id
order by year, quarter;
with example results:
year | quarter | firstAccountName | firstOwner | firstAmount | lastAccountName | lastOwner | lastAmount
2016 | 3 | Skidoo | Bill | 523295.07 | Skipstorm | Bill | 151420.86
2016 | 4 | Skimia | Chris | 961513.59 | Avamm | Maria | 112493.65
2017 | 1 | Yombu | Bob | 536875.51 | Skaboo | Chris | 270273.08
Sometimes it is useful to understand the previous and next values in the context of a given row. The lag and lead window functions provide this capability. By default the offset is one, providing the prior or next value, but a larger offset can also be specified. The example query is a report of opportunities by account name showing the opportunity amount, and the prior and next opportunity amounts for that account by close date.
select accountName,
closeDate,
amount currentOppAmount,
lag(amount) over (partition by accountName order by closeDate) priorAmount, lead(amount) over (partition by accountName order by closeDate) nextAmount
from opportunities
order by accountName, closeDate
limit 9;
with example results:
accountName | closeDate | currentOppAmount | priorAmount | nextAmount
Abata | 2016-09-10 | 645098.45 | NULL | 161086.82
Abata | 2016-10-14 | 161086.82 | 645098.45 | 350235.75
Abata | 2016-12-18 | 350235.75 | 161086.82 | 878595.89
Abata | 2016-12-31 | 878595.89 | 350235.75 | 922322.39
Abata | 2017-01-21 | 922322.39 | 878595.89 | NULL
Abatz | 2016-10-19 | 795424.15 | NULL | NULL
Agimba | 2016-07-09 | 288974.84 | NULL | 914461.49
Agimba | 2016-09-07 | 914461.49 | 288974.84 | 176645.52
Agimba | 2016-09-20 | 176645.52 | 914461.49 | NULL
The NTile window function allows for breaking up a data set into portions, assigning a numeric value to each portion of the range. NTile(4) breaks the data up into quartiles (4 sets). The example query produces a report of all opportunities summarizing the quartile boundaries of the amount values.
select t.quartile,
min(t.amount) min,
max(t.amount) max
from (
select amount,
ntile(4) over (order by amount asc) quartile
from opportunities
where closeDate >= '2016-10-01' and closeDate <= '2016-12-31'
) t
group by quartile
order by quartile;
With example results:
quartile | min | max
1 | 6337.15 | 287634.01
2 | 288796.14 | 539977.45
3 | 540070.04 | 748727.51
4 | 753670.77 | 998864.47
The percentile functions have a slightly different syntax from other window functions, as can be seen in the example below. These functions can only be applied to numeric values. The argument to the function is the percentile to evaluate. Following 'within group' is the sort expression, which indicates the sort column and, optionally, the order. Finally, after 'over' is an optional partition by clause; for no partition clause use 'over ()'. The example below uses the value 0.5 to calculate the median opportunity amount in the rows. The values sometimes differ because percentile_cont returns the average of the two middle rows for an even-sized data set, while percentile_disc returns the first value encountered in the sort.
select owner,
accountName,
CloseDate,
amount,
percentile_cont(0.5) within group (order by amount) over (partition by owner) pct_cont,
percentile_disc(0.5) within group (order by amount) over (partition by owner) pct_disc
from opportunities
where stageName='ClosedWon'
and closeDate >= '2016-10-02' and closeDate <= '2016-10-09'
order by owner, CloseDate;
With example results:
owner | accountName | CloseDate | amount | pct_cont | pct_disc
Bill | Babbleopia | 2016-10-02 | 437636.47 | 437636.4700000000 | 437636.47
Bill | Thoughtworks | 2016-10-04 | 146086.51 | 437636.4700000000 | 437636.47
Bill | Latz | 2016-10-08 | 857254.87 | 437636.4700000000 | 437636.47
Chris | Linkbridge | 2016-10-07 | 539977.45 | 619772.1550000000 | 539977.45
Chris | Avamm | 2016-10-09 | 699566.86 | 619772.1550000000 | 539977.45
Olivier | Devpulse | 2016-10-05 | 834235.93 | 667519.1100000000 | 500802.29
Olivier | Trupe | 2016-10-07 | 500802.29 | 667519.1100000000 | 500802.29
This page is licensed: CC BY-SA / Gnu FDL
You can use most normal statements from the MariaDB data definition language (DDL) with ColumnStore tables. This section lists DDL that differs for ColumnStore compared to normal MariaDB usage.
The ALTER TABLE statement modifies existing tables. It includes adding, deleting, and renaming columns as well as renaming tables.
ALTER TABLE tbl_name
alter_specification [, alter_specification] ...
alter_specification:
table_option ...
| ADD [COLUMN] col_name column_definition
| ADD [COLUMN] (col_name column_definition,...)
| ALTER [COLUMN] col_name {SET DEFAULT literal | DROP DEFAULT}
| CHANGE [COLUMN] old_col_name new_col_name column_definition
| DROP [COLUMN] col_name
| RENAME [TO] new_tbl_name
column_definition:
data_type
[NOT NULL | NULL]
[DEFAULT default_value]
[COMMENT '[compression=0|1];']
table_options:
table_option [[,] table_option] ... (see CREATE TABLE options)
The ADD clause allows you to add columns to a table. You must specify the data type after the column name. The following statement adds a priority column with an integer datatype to the orders table:
ALTER TABLE orders ADD COLUMN priority INTEGER;
Compression level (0 for no compression, 1 for compression) is set at the system level. If a session default exists, it will override the system default. In turn, it can be overridden by the table-level compression comment and finally, a compression comment at the column level.
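For example, the column-level comment below requests compression for just the new column, overriding any session or system default (a sketch; the column name is illustrative):
ALTER TABLE orders ADD COLUMN notes VARCHAR(100) COMMENT 'compression=1';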
The ColumnStore engine fully supports online DDL (one session can add columns to a table while another session is querying that table). MariaDB ColumnStore provides its own syntax for this in the form of the workaround below, which is intended for adding columns to a table one at a time only. Do not attempt to use it for any other purpose, and follow the example below as closely as possible.
Scenario: Add an INT column named col7 to the existing table foo:
select calonlinealter('alter table foo add column col7 int;');
alter table foo add column col7 int comment 'schema sync only';
The select statement may take several tens of seconds to run, depending on how many rows are currently in the table. Regardless, other sessions can select against the table during this time (but they won’t be able to see the new column yet). The ALTER TABLE statement will take less than 1 second (depending on how busy MariaDB is), and during this brief time interval, other table reads will be held off.
The CHANGE clause allows you to rename a column in a table.
Notes to CHANGE COLUMN:
You cannot currently use CHANGE COLUMN to change the definition of that column.
You can only change a single column at a time. The following example renames the order_qty field to quantity in the orders table:
ALTER TABLE orders CHANGE COLUMN order_qty quantity INTEGER;
The DROP clause allows you to drop columns. All associated data is removed when the column is dropped. You can DROP COLUMN (column_name). The following example alters the orders table to drop the priority column:
ALTER TABLE orders DROP COLUMN priority;
The RENAME clause allows you to rename a table. The following example renames the orders table:
ALTER TABLE orders RENAME TO customer_orders;
This page is licensed: CC BY-SA / Gnu FDL
Alters the definition of a view. CREATE OR REPLACE VIEW may also be used to alter the definition of a view.
CREATE
[OR REPLACE]
VIEW view_name [(column_list)]
AS select_statement
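For example, an existing view can be redefined in place (a sketch based on the v_cust_orders view created later in this section; the added status filter is illustrative):
CREATE OR REPLACE VIEW v_cust_orders (cust_name, order_number, order_status) AS
select c.cust_name, o.ordernum, o.status
from customer c, orders o
where c.custnum = o.custnum
and o.status <> 'Cancelled';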
This page is licensed: CC BY-SA / Gnu FDL
Creates a stored routine in ColumnStore.
CREATE
[DEFINER = { user | CURRENT_USER }]
PROCEDURE sp_name ([proc_parameter[,...]])
[characteristic ...] routine_body
proc_parameter:
[ IN | OUT | INOUT ] param_name type
type:
Any valid MariaDB ColumnStore data type
routine_body:
Valid SQL procedure statement
ColumnStore currently accepts definitions of stored procedures with only input arguments and a single SELECT query while in Operating Mode = 1 (VTABLE mode). However, while in Operating Mode = 0 (TABLE mode), ColumnStore allows more complex definitions of stored procedures (i.e., OUT parameters, DECLARE, cursors, etc.).
See Operating Mode for information on Operating Modes
The following statements create and call the sp_complex_variable stored procedure:
CREATE PROCEDURE sp_complex_variable(in arg_key int, in arg_date date)
begin
Select *
from lineitem, orders
where o_custkey < arg_key
and l_partkey < 10000
and l_shipdate>arg_date
and l_orderkey = o_orderkey
order by l_orderkey, l_linenumber;
end
call sp_complex_variable(1000, '1998-10-10');
This page is licensed: CC BY-SA / Gnu FDL
A database consists of tables that store user data. You can create multiple columns with the CREATE TABLE statement. The data type follows the column name when adding columns.
CREATE TABLE [IF NOT EXISTS] tbl_name
(create_definition,...)
engine=columnstore [ DEFAULT CHARSET=character-set]
[COMMENT '[compression=0|1];']
CREATE TABLE [IF NOT EXISTS] tbl_name
{ LIKE old_table_name | (LIKE old_table_name) }
create_definition:
{ col_name column_definition }
column_definition:
data_type
[NOT NULL | NULL]
[DEFAULT default_value]
[COMMENT '[compression=0|1];']
[COMMENT='schema sync only']
[COMMENT 'autoincrement column_name'];
ColumnStore tables should not be created in the mysql, information_schema, calpontsys, or test databases.
ColumnStore stores all object names in lowercase.
CREATE TABLE AS SELECT is not supported and will instead create the table in the default storage engine.
Compression level (0 for no compression, 1 for compression) is set at the system level. If a session default exists, it will override the system default. In turn, it can be overridden by the table-level compression comment and, finally, a compression comment at the column level.
A table is created in the front end only by using a ‘schema sync only’ comment.
The column DEFAULT value can be a maximum of 64 characters.
For maximum compatibility with external tools, MariaDB ColumnStore will accept the following table attributes; however, these are not implemented within MariaDB ColumnStore:
MIN_ROWS
MAX_ROWS
AUTO_INCREMENT
All of these are ignored by ColumnStore. The following statement creates a table called "orders" with two columns: "orderkey" with datatype integer and "customer" with datatype varchar:
CREATE TABLE orders (
orderkey INTEGER,
customer VARCHAR(45)
) ENGINE=ColumnStore
This page is licensed: CC BY-SA / Gnu FDL
Creates a stored query (a view) in MariaDB ColumnStore.
CREATE
[OR REPLACE]
VIEW view_name [(column_list)]
AS select_statement
Notes to CREATE VIEW:
If you describe a view in MariaDB ColumnStore, the column types reported may not match the actual column types in the underlying tables. This is normal and can be ignored. The following statement creates a customer view of orders with status:
CREATE VIEW v_cust_orders (cust_name, order_number, order_status) as
select c.cust_name, o.ordernum, o.status from customer c, orders o
where c.custnum = o.custnum;
This page is licensed: CC BY-SA / Gnu FDL
The DROP TABLE statement deletes a table from ColumnStore.
DROP TABLE [IF EXISTS]
tbl_name
[RESTRICT ]
The RESTRICT clause limits the table to being dropped in the front end only. This could be useful when the table has been dropped on one user module, and needs to be synced to others.
The following statement drops the orders table on the front end only:
DROP TABLE orders RESTRICT;
This page is licensed: CC BY-SA / Gnu FDL
In most cases, a ColumnStore table works just as any other MariaDB table. There are however a few differences.
The following data definition statements (DDL) differ from normal MariaDB DDL when used on ColumnStore tables:
DROP TABLE: ColumnStore supports DROP TABLE ... RESTRICT, which only drops the table in the front end.
RENAME TABLE: ColumnStore doesn't allow renaming a table between databases.
CREATE TABLE: ColumnStore doesn't need indexes, partitions, and many other table and column options. See here for ColumnStore-specific syntax.
CREATE INDEX: ColumnStore doesn't need indexes, so an index may not be created on a table that is defined with engine=columnstore.
This page is licensed: CC BY-SA / Gnu FDL
Learn data manipulation statements for MariaDB ColumnStore. This section covers INSERT, UPDATE, DELETE, and LOAD DATA operations, optimized for efficient handling of large analytical datasets.
The DELETE statement is used to remove rows from tables.
DELETE
[FROM] tbl_name
[WHERE where_condition]
[ORDER BY ...]
[LIMIT row_count]
No disk space is recovered after a DELETE. TRUNCATE and DROP PARTITION can be used to recover space; alternatively, create a new table with CREATE TABLE, load only the remaining rows into it, then use DROP TABLE on the original table and RENAME TABLE.
LIMIT will limit the number of rows deleted, which will perform the DELETE more quickly. The DELETE ... LIMIT statement can then be performed multiple times to achieve the same effect as DELETE with no LIMIT.
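For example, a large delete can be issued in batches by repeating the same statement until it affects zero rows (a sketch; the batch size of 10000 is illustrative):
DELETE FROM customer
WHERE custkey > 1000 and custkey < 2000
LIMIT 10000;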
The following statement deletes customer records with a customer key identification between 1001 and 1999:
DELETE FROM customer
WHERE custkey > 1000 and custkey < 2000
This page is licensed: CC BY-SA / Gnu FDL
Joins are performed in memory unless disk-based joins are enabled via AllowDiskBasedJoin in Columnstore.xml. When a join operation exceeds the memory allocated for query joins, the query is aborted with error code IDB-2001. Disk-based joins enable such queries to use disk for intermediate join data when the memory needed for the join exceeds the limit. Although slower than a fully in-memory join and bound by the temporary space on disk, this does allow such queries to complete.
The following variables in the HashJoin element of the Columnstore.xml configuration file relate to disk-based joins. Columnstore.xml resides in the etc directory of your installation (/usr/local/mariadb/columnstore/etc).
AllowDiskBasedJoin—Option to use disk-based joins. Valid values are Y (enabled) or N (disabled). The default is disabled.
TempFileCompression—Option to use compression for disk join files. Valid values are Y (use compressed files) or N (use non-compressed files).
TempFilePath—The directory path used for the disk joins. By default, this path is the tmp directory for your installation (i.e., /tmp/columnstore_tmp_files/joins/). Files in this directory will be created and cleaned on an as-needed basis. The entire directory is removed and recreated by ExeMgr at startup.
In addition to the system-wide flags at the SQL global and session levels, the following system variables exist for managing per-user memory limits for joins.
columnstore_um_mem_limit - A value for memory limit in MB per user. When this limit is exceeded by a join, it will switch to a disk-based join. By default, the limit is not set (value of 0).
For modification at the global level: In my.cnf file (example: /etc/my.cnf.d/server.cnf):
[mysqld]
...
columnstore_um_mem_limit = value
where value is the per-user in-memory limit, in MB.
For modification at the session level, before issuing your join query from the SQL client, set the session variable as follows.
set columnstore_um_mem_limit = value
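For example, to limit the current session to roughly 2 GB of join memory before spilling to disk (the value is illustrative; the variable can be read back like any other session variable):
set columnstore_um_mem_limit = 2048;
select @@columnstore_um_mem_limit;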
This page is licensed: CC BY-SA / Gnu FDL
The INSERT statement allows you to add data to tables.
INSERT
INTO tbl_name [(col,...)]
{VALUES | VALUE} ({expr | DEFAULT},...),(...),...
The following statement inserts a row with all column values into the customer table:
INSERT INTO customer (custno, custname, custaddress, phoneno, cardnumber, comments)
VALUES (12, 'JohnSmith', '100 First Street, Dallas', '(214) 555-1212', 100, 'On Time');
The following statement inserts two rows with all column values into the customer table:
INSERT INTO customer (custno, custname, custaddress, phoneno, cardnumber, comments) VALUES
(12, 'JohnSmith', '100 First Street, Dallas', '(214) 555-1212', 100, 'On Time'),
(13, 'John Q Public', '200 Second Street, Dallas', '(972) 555-1234', 200, 'LatePayment');
With INSERT ... SELECT, you can quickly insert many rows into a table from one or more other tables.
ColumnStore ignores the ON DUPLICATE KEY clause.
Non-transactional INSERT ... SELECT is directed to ColumnStore's cpimport tool by default, which significantly increases performance.
Transactional INSERT ... SELECT statements (that is with AUTOCOMMIT off or after a START TRANSACTION) are processed through normal DML processes.
Example of inserting into a table with an autoincrement column:
create table autoinc_test(
id int,
name varchar(10))
engine=columnstore comment 'autoincrement=id';
INSERT INTO autoinc_test (name) VALUES ('John');
INSERT INTO autoinc_test (name) VALUES ('Doe');
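As a further illustration of the INSERT ... SELECT behavior described above, the first statement below would be routed through cpimport (autocommit on), while the second is processed through normal DML because it runs inside an explicit transaction (a sketch; the table names are illustrative):
INSERT INTO orders_analytics SELECT * FROM orders_staging;

START TRANSACTION;
INSERT INTO orders_analytics SELECT * FROM orders_staging;
COMMIT;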
This page is licensed: CC BY-SA / Gnu FDL
The LOAD DATA INFILE statement reads rows from a text file into a table at a very high speed. The file name must be given as a literal string.
LOAD DATA [LOCAL] INFILE 'file_name'
INTO TABLE tbl_name
[CHARACTER SET charset_name]
[{FIELDS | COLUMNS}
[TERMINATED BY 'string']
[[OPTIONALLY] ENCLOSED BY 'char']
[ESCAPED BY 'char']
]
[LINES
[STARTING BY 'string']
[TERMINATED BY 'string']
]
ColumnStore ignores the ON DUPLICATE KEY clause.
Non-transactional LOAD DATA INFILE is directed to ColumnStore's cpimport tool by default, which significantly increases performance.
Transactional LOAD DATA INFILE statements (that is, with AUTOCOMMIT off or after a START TRANSACTION) are processed through normal DML processes.
Use cpimport for importing UTF-8 data that contains multi-byte values.
The following example loads data into a simple 5-column table. A file named /simpletable.tbl has the following data in it:
1|100|1000|10000|Test Number 1|
2|200|2000|20000|Test Number 2|
3|300|3000|30000|Test Number 3|
The data can then be loaded into the simpletable table with the following syntax:
LOAD DATA INFILE 'simpletable.tbl' INTO TABLE simpletable FIELDS TERMINATED BY '|'
If the default mode is set to use cpimport internally, any output error files will be written to the /var/log/mariadb/columnstore/cpimport/ directory, which can be consulted for troubleshooting any reported errors.
This page is licensed: CC BY-SA / Gnu FDL
The SELECT statement is used to query the database and display table data. You can add many clauses to filter the data.
SELECT
[ALL | DISTINCT ]
select_expr [, select_expr ...]
[ FROM table_references
[WHERE where_condition]
[GROUP BY {col_name | expr | position} [ASC | DESC], ... [WITH ROLLUP]]
[HAVING where_condition]
[ORDER BY {col_name | expr | position} [ASC | DESC], ...]
[LIMIT {[offset,] row_count | row_count OFFSET offset}]
[PROCEDURE procedure_name(argument_list)]
[INTO OUTFILE 'file_name' [CHARACTER SET charset_name] [export_options]
| INTO DUMPFILE 'file_name' | INTO var_name [, var_name] ]
export_options:
[{FIELDS | COLUMNS}
[TERMINATED BY 'string']
[[OPTIONALLY] ENCLOSED BY 'char']
[ESCAPED BY 'char']
]
[LINES
[STARTING BY 'string']
[TERMINATED BY 'string']
]
If the same column needs to be referenced more than once in the projection list, a unique name is required for each column using a column alias. The total length of the name of a column, inclusive of the length of functions, in the projection list must be 64 characters or less.
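For example, projecting the same column twice requires a distinct alias for each occurrence (a sketch; the aliases are illustrative):
SELECT quantity AS qty, quantity AS qtyCopy FROM lineitem WHERE orderkey < 1000000;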
The WHERE clause filters data retrieval based on criteria. Note that a column alias cannot be used in the WHERE clause. The following statement returns rows in the region table where the region name = 'ASIA':
SELECT * FROM region WHERE name = 'ASIA';
GROUP BY groups data based on values in one or more specific columns. The following statement returns rows from the lineitem table where orderkey is less than 1,000,000 and groups them by quantity.
SELECT quantity, count(*) FROM lineitem WHERE orderkey < 1000000 GROUP BY quantity;
HAVING is used in combination with the GROUP BY clause. It can be used in a SELECT statement to filter the records that a GROUP BY returns. The following statement returns shipping dates and the respective row counts where the count is 2500 or more:
SELECT shipdate, count(*) FROM lineitem GROUP BY shipdate HAVING count(*) >= 2500;
The ORDER BY clause presents results in a specific order. Note that the ORDER BY clause represents a statement that is post-processed by MariaDB. The following statement returns an ordered quantity column from the lineitem table.
SELECT quantity FROM lineitem WHERE orderkey < 1000000 order by quantity;
The following statement returns an ordered shipmode column from the lineitem table.
Select shipmode from lineitem where orderkey < 1000000 order by 1;
NOTE: When ORDER BY is used in an inner query and LIMIT on an outer query, LIMIT is applied first and then ORDER BY is applied when returning results.
Used to combine the results from multiple SELECT statements into a single result set. The UNION or UNION DISTINCT clause returns query results from multiple queries in one display and discards duplicate results. The UNION ALL clause displays query results from multiple queries and does not discard duplicates. The following statement returns the p_name rows in the part table and the partno table and discards duplicate results:
SELECT p_name FROM part UNION select p_name FROM partno;
The following statement returns all the p_name rows in the part table and the partno table:
SELECT p_name FROM part UNION ALL select p_name FROM partno;
LIMIT is used to constrain the number of rows returned by the SELECT statement. LIMIT can have up to two arguments. LIMIT must contain a row count and may optionally contain an offset of the first row to return (the initial row is 0). The following statement returns 5 customer keys from the customer table:
SELECT custkey from customer limit 5;
The following statement returns 5 customer keys from the customer table beginning at offset 1000:
SELECT custkey from customer limit 1000,5;
NOTE: When LIMIT is applied on a nested query's results, and the inner query contains ORDER BY, LIMIT is applied first, and then ORDER BY is applied.
This page is licensed: CC BY-SA / Gnu FDL
The UPDATE statement changes data stored in rows.
Single-table syntax:
UPDATE table_reference
SET col1={expr1|DEFAULT} [,col2={expr2|DEFAULT}] ...
[WHERE where_condition]
[ORDER BY ...]
[LIMIT row_count]
Multiple-table syntax:
UPDATE table_references
SET col1={expr1|DEFAULT} [, col2={expr2|DEFAULT}] ...
[WHERE where_condition]
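For example, the following single-table statement changes the phone number for one customer row (a sketch based on the customer columns used in the INSERT examples; the values are illustrative):
UPDATE customer
SET phoneno = '(214) 555-9999'
WHERE custno = 12;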
This page is licensed: CC BY-SA / Gnu FDL