Mroonga Overview

Once Mroonga has been installed (see About Mroonga), its basic usage is similar to that of a regular fulltext index.

For example:

CREATE TABLE ft_mroonga(copy TEXT,FULLTEXT(copy)) ENGINE=Mroonga;

INSERT INTO ft_mroonga(copy) VALUES ('Once upon a time'),
    ('There was a wicked witch'), ('Who ate everybody up');

SELECT * FROM ft_mroonga WHERE MATCH(copy) AGAINST('wicked');
+--------------------------+
| copy                     |
+--------------------------+
| There was a wicked witch |
+--------------------------+

Score

Mroonga can also order by weighting. For example, first add another record:

INSERT INTO ft_mroonga(copy) VALUES ('She met a wicked, wicked witch');

Records can be returned by weighting, for example, the newly added record has two occurences of the word 'wicked' and a higher weighting:

SELECT *, MATCH(copy) AGAINST('wicked') AS score FROM ft_mroonga 
   WHERE MATCH(copy) AGAINST('wicked') ORDER BY score DESC;
+--------------------------------+--------+
| copy                           | score  |
+--------------------------------+--------+
| She met a wicked, wicked witch | 299594 |
| There was a wicked witch       | 149797 |
+--------------------------------+--------+

Parser

Mroonga permits you to set a different parser for searching by specifying the parser in the CREATE TABLE statement as a comment or, in older versions, changing the value of the mroonga_default_parser system variable.

For example:

CREATE TABLE ft_mroonga(copy TEXT,FULLTEXT(copy) COMMENT 'parser "TokenDelimitNull"') 
  ENGINE=Mroonga;,

or

SET GLOBAL mroonga_default_parser = 'TokenBigramSplitSymbol';

The following parser settings are available:

Setting
Description

Setting

Description

off

No tokenizing is performed.

TokenBigram

Default value. Continuous alphabetical characters, numbers or symbols are treated as a token.

TokenBigramIgnoreBlank

Same as TokenBigram except that white spaces are ignored.

TokenBigramIgnoreBlankSplitSymbol

Same as TokenBigramSplitSymbol. except that white spaces are ignore.

TokenBigramIgnoreBlankSplitSymbolAlpha

Same as TokenBigramSplitSymbolAlpha except that white spaces are ignored.

TokenBigramIgnoreBlankSplitSymbolAlphaDigit

Same as TokenBigramSplitSymbolAlphaDigit except that white spaces are ignored.

TokenBigramSplitSymbol

Same as TokenBigram except that continuous symbols are not treated as a token, but tokenised in bigram.

TokenBigramSplitSymbolAlpha

Same as TokenBigram except that continuous alphabetical characters are not treated as a token, but tokenised in bigram.

TokenDelimit

Tokenises by splitting on white spaces.

TokenDelimitNull

Tokenises by splitting on null characters (\0).

TokenMecab

Tokenise using MeCab. Required Groonga to be buillt with MeCab support.

TokenTrigram

Tokenises in trigrams but continuous alphabetical characters, numbers or symbols are treated as a token.

TokenUnigram

Tokenises in unigrams but continuous alphabetical characters, numbers or symbols are treated as a token.

Examples

TokenBigram vs TokenBigramSplitSymbol

TokenBigram failing to match partial symbols which TokenBigramSplitSymbol matches, since TokenBigramSplitSymbol does not treat continuous symbols as a token.

DROP TABLE ft_mroonga;
CREATE TABLE ft_mroonga(copy TEXT,FULLTEXT(copy) COMMENT 'parser "TokenBigram"') 
  ENGINE=Mroonga;
INSERT INTO ft_mroonga(copy) VALUES ('Once upon a time'),   
  ('There was a wicked witch'), 
  ('Who ate everybody up'), 
  ('She met a wicked, wicked witch'), 
  ('A really wicked, wicked witch!!?!');
SELECT * FROM ft_mroonga WHERE MATCH(copy) AGAINST('!?');
Empty set (0.00 sec)

DROP TABLE ft_mroonga;
CREATE TABLE ft_mroonga(copy TEXT,FULLTEXT(copy) COMMENT 'parser "TokenBigramSplitSymbol"') 
  ENGINE=Mroonga;
INSERT INTO ft_mroonga(copy) VALUES ('Once upon a time'),   
  ('There was a wicked witch'), 
  ('Who ate everybody up'), 
  ('She met a wicked, wicked witch'), 
  ('A really wicked, wicked witch!!?!');
SELECT * FROM ft_mroonga WHERE MATCH(copy) AGAINST('!?');
+-----------------------------------+
| copy                              |
+-----------------------------------+
| A really wicked, wicked witch!!?! |
+-----------------------------------+

TokenBigram vs TokenBigramSplitSymbolAlpha

DROP TABLE ft_mroonga;
CREATE TABLE ft_mroonga(copy TEXT,FULLTEXT(copy) COMMENT 'parser "TokenBigram"') 
  ENGINE=Mroonga;
INSERT INTO ft_mroonga(copy) VALUES ('Once upon a time'),   
  ('There was a wicked witch'), 
  ('Who ate everybody up'), 
  ('She met a wicked, wicked witch'), 
  ('A really wicked, wicked witch!!?!');
SELECT * FROM ft_mroonga WHERE MATCH(copy) AGAINST('ick');
Empty set (0.00 sec)

DROP TABLE ft_mroonga;
CREATE TABLE ft_mroonga(copy TEXT,FULLTEXT(copy) COMMENT 'parser "TokenBigramSplitSymbolAlpha"') 
  ENGINE=Mroonga;
INSERT INTO ft_mroonga(copy) VALUES ('Once upon a time'),   
  ('There was a wicked witch'), 
  ('Who ate everybody up'), 
  ('She met a wicked, wicked witch'), 
  ('A really wicked, wicked witch!!?!');
SELECT * FROM ft_mroonga WHERE MATCH(copy) AGAINST('ick');
+-----------------------------------+
| copy                              |
+-----------------------------------+
| There was a wicked witch          |
| She met a wicked, wicked witch    |
| A really wicked, wicked witch!!?! |
+-----------------------------------+

This page is licensed: CC BY-SA / Gnu FDL

Last updated

Was this helpful?