MariaDB has a dedicated VECTOR(N) data type with a built-in data validation. N
is the number of dimensions that all vector values in the column have.
Consider the following table:
CREATE TABLE embeddings (
doc_id BIGINT UNSIGNED PRIMARY KEY,
embedding VECTOR(1536)
);
To have a fast vector search, you have to index the vector column, creating a VECTOR
index:
CREATE TABLE embeddings (
doc_id BIGINT UNSIGNED PRIMARY KEY,
embedding VECTOR(1536) NOT NULL,
VECTOR INDEX (embedding)
);
Note that there can be only one vector index in the table, and the indexed vector column must be NOT NULL
.
There are two options that can be used to configure the vector index:
M
— Larger values mean slower SELECT
and INSERT
statements, larger index size and higher memory consumption, but more accurate results. The valid range is from 3
to 200
.
DISTANCE
— Distance function to build the vector index for. Searches using a different distance function will not be able to use a vector index. Valid values are cosine
and euclidean
(the default).
CREATE TABLE embeddings (
doc_id BIGINT UNSIGNED PRIMARY KEY,
embedding VECTOR(1536) NOT NULL,
VECTOR INDEX (embedding) M=8 DISTANCE=cosine
);
This page is licensed: CC BY-SA / Gnu FDL