Mroonga Overview
Once Mroonga has been installed (see About Mroonga), its basic usage is similar to that of a regular fulltext index.
For example:
CREATE TABLE ft_mroonga(copy TEXT,FULLTEXT(copy)) ENGINE=Mroonga;
INSERT INTO ft_mroonga(copy) VALUES ('Once upon a time'),
('There was a wicked witch'), ('Who ate everybody up');
SELECT * FROM ft_mroonga WHERE MATCH(copy) AGAINST('wicked');
+--------------------------+
| copy |
+--------------------------+
| There was a wicked witch |
+--------------------------+
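If the CREATE TABLE statement above fails with an unknown storage engine error, the engine is probably not loaded. As a minimal sketch, assuming a server that exposes the standard information_schema.ENGINES table, the following query checks whether Mroonga is available:

```sql
-- Sketch: confirm that the Mroonga storage engine is loaded on this server.
-- A row with SUPPORT = 'YES' (or 'DEFAULT') indicates the engine can be used.
SELECT ENGINE, SUPPORT
FROM information_schema.ENGINES
WHERE ENGINE = 'Mroonga';
```

An empty result suggests the engine still needs to be installed as described in About Mroonga.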
Score
Mroonga can also order by weighting. For example, first add another record:
INSERT INTO ft_mroonga(copy) VALUES ('She met a wicked, wicked witch');
Records can be returned in order of weighting. The newly added record contains two occurrences of the word 'wicked', so it receives a higher weighting:
SELECT *, MATCH(copy) AGAINST('wicked') AS score FROM ft_mroonga
WHERE MATCH(copy) AGAINST('wicked') ORDER BY score DESC;
+--------------------------------+--------+
| copy | score |
+--------------------------------+--------+
| She met a wicked, wicked witch | 299594 |
| There was a wicked witch | 149797 |
+--------------------------------+--------+
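Because the score is an ordinary column alias, it can be combined with the usual SQL clauses. As a sketch that reuses the ft_mroonga table and data from the examples above, the following query keeps only the single best-scoring match:

```sql
-- Sketch: return only the highest-scoring record for the term 'wicked'.
SELECT copy, MATCH(copy) AGAINST('wicked') AS score
FROM ft_mroonga
WHERE MATCH(copy) AGAINST('wicked')
ORDER BY score DESC
LIMIT 1;
```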
Parser
Mroonga lets you choose a different parser for searching, either by specifying the parser in a comment on the FULLTEXT index in the CREATE TABLE
statement or, in older versions, by changing the value of the mroonga_default_parser system variable.
For example:
CREATE TABLE ft_mroonga(copy TEXT,FULLTEXT(copy) COMMENT 'parser "TokenDelimitNull"')
ENGINE=Mroonga;
or
SET GLOBAL mroonga_default_parser = 'TokenBigramSplitSymbol';
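To see which parser is in effect, both places can be inspected. This is a minimal sketch that assumes the ft_mroonga table from above exists and that the server version still provides the mroonga_default_parser variable:

```sql
-- Show the server-wide default parser (the older, variable-based approach).
SHOW GLOBAL VARIABLES LIKE 'mroonga_default_parser';

-- Show the table definition, including any parser named in the FULLTEXT index comment.
SHOW CREATE TABLE ft_mroonga;
```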
The following parser settings are available:
| Setting | Description |
|---|---|
| off | No tokenising is performed. |
| TokenBigram | Default value. Tokenises in bigrams, but continuous alphabetical characters, numbers or symbols are treated as a token. |
| TokenBigramIgnoreBlank | Same as TokenBigram except that white spaces are ignored. |
| TokenBigramIgnoreBlankSplitSymbol | Same as TokenBigramSplitSymbol except that white spaces are ignored. |
| TokenBigramIgnoreBlankSplitSymbolAlpha | Same as TokenBigramSplitSymbolAlpha except that white spaces are ignored. |
| TokenBigramIgnoreBlankSplitSymbolAlphaDigit | Same as TokenBigramSplitSymbolAlphaDigit except that white spaces are ignored. |
| TokenBigramSplitSymbol | Same as TokenBigram except that continuous symbols are not treated as a token, but tokenised in bigrams. |
| TokenBigramSplitSymbolAlpha | Same as TokenBigram except that continuous alphabetical characters are not treated as a token, but tokenised in bigrams. |
| TokenDelimit | Tokenises by splitting on white spaces. |
| TokenDelimitNull | Tokenises by splitting on null characters (\0). |
| TokenMecab | Tokenises using MeCab. Requires Groonga to be built with MeCab support. |
| TokenTrigram | Tokenises in trigrams, but continuous alphabetical characters, numbers or symbols are treated as a token. |
| TokenUnigram | Tokenises in unigrams, but continuous alphabetical characters, numbers or symbols are treated as a token. |
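The delimiter-based parsers are declared in the same way as the bigram ones. The following is a sketch only: the ft_tags table, its tags column and the sample rows are hypothetical and are not part of the original examples.

```sql
-- Hypothetical table: each row stores a list of space-separated tags.
CREATE TABLE ft_tags(tags TEXT, FULLTEXT(tags) COMMENT 'parser "TokenDelimit"')
ENGINE=Mroonga;

INSERT INTO ft_tags(tags) VALUES ('mariadb fulltext'),
('groonga mroonga'), ('mariadb mroonga');

-- With TokenDelimit each whitespace-separated tag is indexed as one token,
-- so this query looks up rows carrying the tag 'mroonga'.
SELECT * FROM ft_tags WHERE MATCH(tags) AGAINST('mroonga');
```

A delimiter-based parser tends to suit columns holding discrete keywords, where partial matches inside a word are not wanted.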
Examples
TokenBigram vs TokenBigramSplitSymbol
TokenBigram fails to match partial runs of symbols that TokenBigramSplitSymbol matches, because TokenBigramSplitSymbol does not treat continuous symbols as a single token.
DROP TABLE ft_mroonga;
CREATE TABLE ft_mroonga(copy TEXT,FULLTEXT(copy) COMMENT 'parser "TokenBigram"')
ENGINE=Mroonga;
INSERT INTO ft_mroonga(copy) VALUES ('Once upon a time'),
('There was a wicked witch'),
('Who ate everybody up'),
('She met a wicked, wicked witch'),
('A really wicked, wicked witch!!?!');
SELECT * FROM ft_mroonga WHERE MATCH(copy) AGAINST('!?');
Empty set (0.00 sec)
DROP TABLE ft_mroonga;
CREATE TABLE ft_mroonga(copy TEXT,FULLTEXT(copy) COMMENT 'parser "TokenBigramSplitSymbol"')
ENGINE=Mroonga;
INSERT INTO ft_mroonga(copy) VALUES ('Once upon a time'),
('There was a wicked witch'),
('Who ate everybody up'),
('She met a wicked, wicked witch'),
('A really wicked, wicked witch!!?!');
SELECT * FROM ft_mroonga WHERE MATCH(copy) AGAINST('!?');
+-----------------------------------+
| copy |
+-----------------------------------+
| A really wicked, wicked witch!!?! |
+-----------------------------------+
TokenBigram vs TokenBigramSplitSymbolAlpha
TokenBigram fails to match part of a word that TokenBigramSplitSymbolAlpha matches, because TokenBigramSplitSymbolAlpha does not treat continuous alphabetical characters as a single token.
DROP TABLE ft_mroonga;
CREATE TABLE ft_mroonga(copy TEXT,FULLTEXT(copy) COMMENT 'parser "TokenBigram"')
ENGINE=Mroonga;
INSERT INTO ft_mroonga(copy) VALUES ('Once upon a time'),
('There was a wicked witch'),
('Who ate everybody up'),
('She met a wicked, wicked witch'),
('A really wicked, wicked witch!!?!');
SELECT * FROM ft_mroonga WHERE MATCH(copy) AGAINST('ick');
Empty set (0.00 sec)
DROP TABLE ft_mroonga;
CREATE TABLE ft_mroonga(copy TEXT,FULLTEXT(copy) COMMENT 'parser "TokenBigramSplitSymbolAlpha"')
ENGINE=Mroonga;
INSERT INTO ft_mroonga(copy) VALUES ('Once upon a time'),
('There was a wicked witch'),
('Who ate everybody up'),
('She met a wicked, wicked witch'),
('A really wicked, wicked witch!!?!');
SELECT * FROM ft_mroonga WHERE MATCH(copy) AGAINST('ick');
+-----------------------------------+
| copy |
+-----------------------------------+
| There was a wicked witch |
| She met a wicked, wicked witch |
| A really wicked, wicked witch!!?! |
+-----------------------------------+
This page is licensed: CC BY-SA / Gnu FDL