1 of 5

Storage Engine Development

Learn about developing custom storage engines for MariaDB Server. This section guides you through the process, providing insights into the Storage Engine API for extending database functionality.

Engine-defined New Table/Field/Index Attributes

In MariaDB, a storage engine can allow the user to specify additional attributes per index, field, or table. The engine needs to declare what attributes it introduces.

API

There are three new members in the handlerton structure, they can be set in the engine's initialization function as follows:

example_hton->table_options= example_table_option_array;
example_hton->field_options= example_field_option_array;
example_hton->index_options= example_index_option_array;

The arrays are declared statically, as in the following example:

static MYSQL_THDVAR_ULONG(varopt_default, PLUGIN_VAR_RQCMDARG,
  "default value of the VAROPT table option", NULL, NULL, 5, 0, 100, 0);

struct ha_table_option_struct
{
  char *strparam;
  ulonglong ullparam;
  uint enumparam;
  bool boolparam;
  ulonglong varparam;
};

ha_create_table_option example_table_option_list[]=
{
  HA_TOPTION_NUMBER("NUMBER", ullparam, UINT_MAX32, 0, UINT_MAX32, 10),
  HA_TOPTION_STRING("STR", strparam),
  HA_TOPTION_ENUM("ONE_OR_TWO", enumparam, "one,two", 0),
  HA_TOPTION_BOOL("YESNO", boolparam, 1),
  HA_TOPTION_SYSVAR("VAROPT", varopt, varparam),
  HA_TOPTION_END
};

The engine declares a structureha_table_option_struct that will hold values of these new attributes.

And it describes these attributes to MySQL by creating an array ofHA_TOPTION_* macros. Note a detail: these macros expect a structure calledha_table_option_struct, if the structure is called differently, a#define are needed.

There are five supported kinds of attributes:

macro name

attribure value type

corresponding C type

additional parameters of a macro

macro name

attribure value type

corresponding C type

additional parameters of a macro

HA_TOPTION_NUMBER

an integer number

unsigned long long

a default value, minimal allowed value, maximal allowed value, a factor, that any allowed should be a multiple of.

HA_TOPTION_STRING

a string

char *

none. The default value is a null pointer.

HA_TOPTION_ENUM

one value from a list of allowed values

unsigned int

a string with a comma-separated list of allowed values, and a default value as a number, starting from 0.

HA_TOPTION_BOOL

a boolean

bool

a default value

HA_TOPTION_SYSVAR

defined by the system variable

system variable name

Do not use enum for your HA_TOPTION_ENUM C structure members, the size of the enum depends on the compiler, and even on the compilation options, and the plugin API uses only types with known storage sizes.

In all macros the first two parameters are name of the attribute as should be used in SQL in the CREATE TABLE statement, and the name of the corresponding member of the ha_table_option_struct structure.

The HA_TOPTION_SYSVAR stands aside a bit. It does not specify the attribute type or the default value, instead it binds the attribute to a system variable. The attribute type and the range of allowed values are the same as of the corresponding system variable. The attribute default value are the current value of its system variable. And unlike other attribute types that are only stored in the .frm file if explicitly set in the CREATE TABLE statement, the HA_TOPTION_SYSVAR attributes are always stored. If the system variable value is changed, it will not affect existing tables. Note that for this very reason, if a table was created in the old version of a storage engine, and a new version has introduced a HA_TOPTION_SYSVAR attribute, the attribute value in the old tables are the default value of the system variable, not its current value.

The array ends with a HA_TOPTION_END macro.

Field and index (key) attributes are declared similarly using HA_FOPTION_* and HA_IOPTION_* macros.

When in a CREATE TABLE statement, the ::create() handler method is called, the table attributes are available in the table_arg->s->option_struct, field attributes - in the option_struct member of the individual fields (objects of the Field class), index attributes - in the option_struct member of the individual keys (objects of the KEY class).

Additionally, they are available in most other handler methods: the attributes are stored in the .frm file and on every open MySQL makes them available to the engine by filling the corresponding option_struct members of the table, fields, and keys.

The ALTER TABLE needs a special support from the engine. MySQL compares old and new table definitions to decide whether it needs to rebuild the table or not. As the semantics of the engine declared attributes is unknown, MySQL cannot make this decision by analyzing attribute values - this is delegated to the engine. The HA_CREATE_INFO structure has three new members:

ha_table_option_struct *option_struct;           ///< structure with parsed table options
ha_field_option_struct **fields_option_struct;   ///< array of field option structures
ha_index_option_struct **indexes_option_struct;  ///< array of index option structures

The engine (in the ::check_if_incompatible_data() method) is responsible for comparing new values of the attributes from the HA_CREATE_INFO structure with the old values from the table and returning COMPATIBLE_DATA_NO if they were changed in such a way that requires the table to be rebuild.

The example of declaring the attributes and comparing the values for the ALTER TABLE can be found in the EXAMPLE engine.

SQL

The engine declared attributes can be specified per field, index, or table in the CREATE TABLE or ALTER TABLE. The syntax is the conventional:

CREATE TABLE ... (
  field ... [attribute=value [attribute=value ...]],
  ...
  index ... [attribute=value [attribute=value ...]],
  ...
) ...  [attribute=value [attribute=value ...]]

All values must be specified as literals, not expressions. The value of a boolean option may be specified as one of YES, NO, ON, OFF, 1, or 0. A string value may be specified either quoted or not, as an identifier (if it is a valid identifier, of course). Compare with the old behavior:

CREATE TABLE ... ENGINE=FEDERATED CONNECTION='mysql://root@127.0.0.1';

where the value of the ENGINE attribute is specified not quoted, while the value of the CONNECTION is quoted.

When an attribute is set, it are stored with the table definition and shown in the SHOW CREATE TABLE;. To remove an attribute from a table definition use ALTER TABLE to set its value to a DEFAULT.

The values of unknown attributes or attributes with the illegal values cause an error by default. But with ALTER TABLE one can change the storage engine and some previously valid attributes may become unknown — to the new engine. They are not removed automatically, though, because the table might be altered back to the first engine, and these attributes are valid again. Still SHOW CREATE TABLE will comment these unknown attributes out in the output, otherwise they would make a generated CREATE TABLE statement invalid.

With the IGNORE_BAD_TABLE_OPTIONS sql mode this behavior changes. Unknown attributes do not cause an error, they only result in a warning. And SHOW CREATE TABLE will not comment them out. This mode is implicitly enabled in the replication slave thread.

Storage Engine FAQ

Are storage engines designed for MySQL compatible with MariaDB?

In most cases, yes. MariaDB tries to keep API compatibility with MySQL, even across major versions.

Will storage engines created for MariaDB work in MySQL?

It will mostly work. It would need #ifdef's to adjust to MySQL-5.6 API, for example, for multi-read-range API, for table discovery API, etc. But most of the code will work as is, without any changes.

Do storage engine binaries need to be recompiled for MariaDB?

Yes. You will need to recompile the storage engine against the exact version of MySQL or MariaDB you intend to run it on. This is due to the version of the server being stored in the storage engine binary, and the server will refuse to load it if it was compiled for a different version.

_{This page is licensed: CC BY-SA / Gnu FDL}

Table Discovery (before 10.0.2)

This page describes the old discovery API, created in MySQL for NDB Cluster. It no longer works in MariaDB. New table discovery API is documented here.

There are four parts of it.

First, when a server finds that a table (for example, mentioned in the SELECT query) does not exist, it asks every engine whether it knows anything about this table. For this it uses discover() method of the handlerton. The method is defined as

int discover(handlerton *hton, THD* thd, const char *db, const char *name,
             unsigned char **frmblob, size_t *frmlen);

It takes the database and a table name as arguments and is returns 0 if the table exists in the engine and 1 otherwise. If it returned 0, it is supposed to allocate (with my_malloc()) a buffer and store the complete binary image of the .frm file of that table. The server will write it down to disk, creating table's .frm. The output parameters frmblob and frmlen are used to return the information about the buffer to the caller. The caller is responsible for freeing the buffer with my_free().

Second, in some cases the server only wants to know if the table exists, but it does not really need to open it and will not use its .frm file image. Then using the discover() method would be an overkill, and the server uses a lightweight table_exists_in_engine() method. This method is defined as

int table_exists_in_engine(handlerton *hton, THD* thd,
                           const char *db, const char *name);

and it returns one of the HA_ERR_ codes, typically HA_ERR_NO_SUCH_TABLE or HA_ERR_TABLE_EXIST.

Third, there can be a situation when the server thinks that the table exists (it found and successfully read the .frm file), but from the engine point of view the .frm file is incorrect. For example, the table was already deleted from the engine, or its definition was modified (again, modified only in the engine). In this case the .frm file is outdated, and the server needs to re-discover the table. The engine conveys this to the server by returning HA_ERR_TABLE_DEF_CHANGED error code from the handler's open() method. On receiving this error the server will use the discover() method to get the new .frm image. This also means that after the table is opened, the server does not expect its metadata to change. The engine thus should ensure (with some kind of locking, perhaps) that a table metadata cannot be modified, as long as the table stays opened.

And fourth, a user might want to retrieve a list of tables in a specific database. With SHOW TABLES or by quering INFORMATION_SCHEMA tables. The user expects to see all tables, but the server cannot discover them one by one, because it doesnt know table names. In this case, the server uses a special discovery technique. It is find_files() method in the handlerton, defines as

int find_files(handlerton *hton, THD *thd,
               const char *db, const char *path,
               const char *wild, bool dir, List<LEX_STRING> *files);

and it, typically for Storage Engine API, returns 0 on success and 1 on failure. The arguments mean db - the name of the database, path - the path to it, wild an SQL wildcard pattern (for example, from SHOW TABLES LIKE '...', and dir, if set, means to discover databases instead of tables.

_{This page is licensed: CC BY-SA / Gnu FDL}

Table Discovery

In MariaDB it is not always necessary to run an explicit CREATE TABLE statement for a table to appear. Sometimes a table may already exist in the storage engine, but the server does not know about it, because there is no .frm file for this table. This can happen for various reasons; for example, for a cluster engine the table might have been created in the cluster by another MariaDB server node. Or for the engine that supports table shipping a table file might have been simply copied into the MariaDB data directory. But no matter what the reason is, there is a mechanism for an engine to tell the server that the table exists. This mechanism is called table discovery and if an engine wants the server to discover its tables, the engine should support the table discovery API.

There are two different kinds of table discovery — a fully automatic discovery and a user-assisted one. In the former, the engine can automatically discover the table whenever an SQL statement needs it. In MariaDB, the Archive and Sequence engines support this kind of discovery. For example, one can copy a t1.ARZ file into the database directory and immediately start using it — the corresponding .frm file are created automatically. Or one can select from say, the seq_1_to_10 table without any explicit CREATE TABLE statement.

In the latter, user-assisted, discovery the engine does not have enough information to discover the table all on its own. But it can discover the table structure if the user provides certain additional information. In this case, an explicit CREATE TABLE statement is still necessary, but it should contain no table structure — only the table name and the table attributes. In MariaDB, the FederatedX storage engine supports this. When creating a table, one only needs to specify the CONNECTION attribute and the table structure — fields and indexes — are provided automatically by the engine.

Automatic Discovery

As far as automatic table discovery is concerned, the tables, from the server point of view, may appear, disappear, or change structure anytime. Thus the server needs to be able to ask whether a given table exists and what its structure is. It needs to be notified when a table structure changes outside of the server. And it needs to be able to get a list of all (unknown to the server) tables, for statements like SHOW TABLES. The server does all that by invoking specific methods of the handlerton:

const char **tablefile_extensions;
int (*discover_table_names)(handlerton *hton, LEX_STRING *db, MY_DIR *dir,
                            discovered_list *result);
int (*discover_table_existence)(handlerton *hton, const char *db,
                                const char *table_name);
int (*discover_table)(handlerton *hton, THD* thd, TABLE_SHARE *share);

handlerton::tablefile_extensions

Engines that store tables in separate files (one table might occupy many files with different extensions, but having the same base file name) should store the list of possible extensions in the tablefile_extensions member of the handlerton (earlier this list was returned by the handler::bas_ext() method). This will significantly simplify the discovery implementation for these engines, as you will see below.

handlerton::discover_table_names()

When a user asks for a list of tables in a specific database — for example, by using SHOW TABLES or by selecting from INFORMATION_SCHEMA.TABLES — the server invokes discover_table_names() method of the handlerton. For convenience this method, besides the database name in question, gets the list of all files in this database directory, so that the engine can look for table files without doing any filesystem i/o. All discovered tables should be added to the result collector object. It is defined as

class discovered_list
{
  public:
  bool add_table(const char *tname, size_t tlen);
  bool add_file(const char *fname);
};

and the engine should call result->add_table() or result->add_file() for every discovered table (use add_file() if the name to add is in the MariaDB file name encoding, and add_table() if it's a true table name, as shown in SHOW TABLES).

If the engine is file-based, that is, it has non-empty list in the tablefile_extensions, this method is optional. For any file-based engine that does not implement discover_table_names(), MariaDB will automatically discover the list of all tables of this engine, by looking for files with the extension tablefile_extensions[0].

handlerton::discover_table_existence()

In some rare cases MariaDB needs to know whether a given table exists, but does not particularly care about this table structure (for example, when executing a DROP TABLE statement). In these cases, the server uses the discover_table_existence() method to find out whether a table with the given name exists in the engine.

This method is optional. For the engine that does not implement it, MariaDB will look for files with the tablefile_extensions[0], if possible. But if the engine is not file-based, MariaDB will use the discover_table() method to perform a full table discovery. While this will allow determining correctly whether a table exists, a full discovery is usually slower than the simple existence check. In other words, engines that are not file-based might want to support discover_table_existence() method as a useful optimization.

handlerton::discover_table()

This is the main method of table discovery, the heart of it. The server invokes it when it wants to use the table. The discover_table() method gets the TABLE_SHARE structure, which is not completely initialized — only the table and the database name (and a path to the table file) are filled in. It should initialize this TABLE_SHARE with the desired table structure.

MariaDB provides convenient and easy to use helpers that allow the engine to initialize the TABLE_SHARE with minimal efforts. They are the TABLE_SHARE methods init_from_binary_frm_image() and init_from_sql_statement_string().

TABLE_SHARE::init_from_binary_frm_image()

This method is used by engines that use "frm shipping" — such as Archive or NDB Cluster in MySQL. An frm shipping engine reads the frm file for a given table, exactly as it was generated by the server, and stores it internally. Later it can discover the table structure by using this very frm image. In this sense, a separate frm file in the database directory becomes redundant, because a copy of it is stored in the engine.

TABLE_SHARE::init_from_sql_statement_string()

This method allows initializing the TABLE_SHARE using a conventional SQL CREATE TABLE syntax.

TABLE_SHARE::read_frm_image()

Engines that use frm shipping need to get the frm image corresponding to a particular table (typically in the handler::create() method). They do it via the read_frm_image() method. It returns an allocated buffer with the binary frm image, that the engine can use the way it needs.

TABLE_SHARE::free_frm_image()

The frm image that was returned by read_frm_image() must be freed with the free_frm_image().

HA_ERR_TABLE_DEF_CHANGED

One of the consequences of automatic discovery is that the table definition might change when the server doesn't expect it to. Between two SELECT queries, for example. If this happens, if the engine detects that the server is using an outdated version of the table definition, it should return a HA_ERR_TABLE_DEF_CHANGED handler error. Depending on when in the query processing this error has happened, MariaDB will either re-discover the table and execute the query with the correct table structure, or abort the query and return an error message to the user.

TABLE_SHARE::tabledef_version

The previous paragraph doesn't cover one important question — how can the engine know that the server uses an outdated table definition? The answer is — by checking the tabledef_version, the table definition version. Every table gets a unique tabledef_version value. Normally it is generated automatically when a table is created. When a table is discovered the engine can force it to have a specific tabledef_version value (simply by setting it in the TABLE_SHARE before calling the init_from_binary_frm_image() or init_from_sql_statement_string() methods).

Now the engine can compare the table definition version that the server is using (from any handler method it can be accessed as this->table->s->tabledef_version) with the version of the actual table definition. If they differ — it is HA_ERR_TABLE_DEF_CHANGED.

Assisted discovery

Assisted discovery is a lot simpler from the server point of view, a lot more controlled. The table cannot appear or disappear at will, one still needs explicit DDL statements to manipulate it. There is only one new handlerton method that the server uses to discover the table structure when a user has issued an explicit CREATE TABLE statement without declaring any columns or indexes.

int (*discover_table_structure)(handlerton *hton, THD* thd,
                               TABLE_SHARE *share, HA_CREATE_INFO *info);

The assisted discovery API is pretty much independent from the automatic discovery API. An engine can implement either of them or both (or none); there is no requirement to support automatic discovery if only assisted discovery is needed.

handlerton::discover_table_structure()

Much like the discover_table() method, the discover_table_structure() handlerton method gets a partially initialized TABLE_SHARE with the table name, database name, and a path to table files filled in, but without a table structure. Unlike discover_table(), here the TABLE_SHARE has all the engine-defined table attributes in the TABLE_SHARE::option_struct structure. Based on the values of these attributes the discover_table_structure() method should initialize the TABLE_SHARE with the desired set of fields and keys. It can use TABLE_SHARE helper methods init_from_binary_frm_image() and init_from_sql_statement_string() for that.

The role of `.frm` files

Before table discovery was introduced, MariaDB used .frm files to store the table definition. But now the engine can store the table definition (if the engine supports automatic discovery, of course), and .frm files become redundant. Still, the server can use .frm files for such an engine — but they are no longer the only source of the table definition. Now .frm files are merely a cache of the table definition, while the original authoritative table definition is stored in the engine. Like any cache, its purpose is to reduce discovery attempts for a table. The engine decides whether it makes sense to cache table definition in the .frm file or not (see the second argument for the TABLE_SHARE::init_from_binary_frm_image()). For example, the Archive engine uses .frm cache, while the Sequence engine does not. In other words, MariaDB creates .frm files for Archive tables, but not for Sequence tables.

The cache is completely transparent for a user; MariaDB makes sure that it always stores the actual table definition and invalidates the .frm file automatically when it becomes out of date. This can happen, for example, if a user copies a new Archive table into the datadir and forgets to delete the .frm file of the old table with the same name.

_{This page is licensed: CC BY-SA / Gnu FDL}

Storage Engine Development

Engine-defined New Table/Field/Index Attributes

API

SQL

See Also

Storage Engine FAQ

Are storage engines designed for MySQL compatible with MariaDB?

Will storage engines created for MariaDB work in MySQL?

Do storage engine binaries need to be recompiled for MariaDB?

Table Discovery (before 10.0.2)

Table Discovery

Automatic Discovery

handlerton::tablefile_extensions

handlerton::discover_table_names()

handlerton::discover_table_existence()

handlerton::discover_table()

TABLE_SHARE::init_from_binary_frm_image()

TABLE_SHARE::init_from_sql_statement_string()

TABLE_SHARE::read_frm_image()

TABLE_SHARE::free_frm_image()

HA_ERR_TABLE_DEF_CHANGED

TABLE_SHARE::tabledef_version

Assisted discovery

handlerton::discover_table_structure()

The role of .frm files

The role of `.frm` files