10bet网址
MySQL 5.7 Reference Manual
Related Documentation Download this Manual Excerpts from this Manual

12.10.6 Fine-Tuning MySQL Full-Text Search

MySQL's full-text search capability has few user-tunable parameters. You can exert more control over full-text searching behavior if you have a MySQL source distribution because some changes require source code modifications. SeeSection 2.9, “Installing MySQL from Source”.

Full-text search is carefully tuned for effectiveness. Modifying the default behavior in most cases can actually decrease effectiveness.Do not alter the MySQL sources unless you know what you are doing.

Most full-text variables described in this section must be set at server startup time. A server restart is required to change them; they cannot be modified while the server is running.

Some variable changes require that you rebuild theFULLTEXTindexes in your tables. Instructions for doing so are given later in this section.

Configuring Minimum and Maximum Word Length

The minimum and maximum lengths of words to be indexed are defined by theinnodb_ft_min_token_sizeandinnodb_ft_max_token_sizeforInnoDBsearch indexes, andft_min_word_lenandft_max_word_lenforMyISAMones.

Note

Minimum and maximum word length full-text parameters do not apply toFULLTEXTindexes created using the ngram parser. ngram token size is defined by thengram_token_sizeoption.

After changing any of these options, rebuild yourFULLTEXTindexes for the change to take effect. For example, to make two-character words searchable, you could put the following lines in an option file:

[mysqld] innodb_ft_min_token_size=2 ft_min_word_len=2

Then restart the server and rebuild yourFULLTEXT索引。ForMyISAMtables, note the remarks regardingmyisamchkin the instructions that follow for rebuildingMyISAMfull-text indexes.

Configuring the Natural Language Search Threshold

ForMyISAMsearch indexes, the 50% threshold for natural language searches is determined by the particular weighting scheme chosen. To disable it, look for the following line instorage/myisam/ftdefs.h:

#define GWS_IN_USE GWS_PROB

改变这一行:

#define GWS_IN_USE GWS_FREQ

Then recompile MySQL. There is no need to rebuild the indexes in this case.

Note

By making this change, youseverelydecrease MySQL's ability to provide adequate relevance values for theMATCH()function. If you really need to search for such common words, it would be better to search usingIN BOOLEAN MODEinstead, which does not observe the 50% threshold.

Modifying Boolean Full-Text Search Operators

To change the operators used for boolean full-text searches onMyISAMtables, set theft_boolean_syntaxsystem variable. (InnoDBdoes not have an equivalent setting.) This variable can be changed while the server is running, but you must have privileges sufficient to set global system variables (seeSection 5.1.8.1, “System Variable Privileges”). No rebuilding of indexes is necessary in this case.

Character Set Modifications

For the built-in full-text parser, you can change the set of characters that are considered word characters in several ways, as described in the following list. After making the modification, rebuild the indexes for each table that contains anyFULLTEXT索引。假设你想把连字符character ('-') as a word character. Use one of these methods:

  • Modify the MySQL source: Instorage/innobase/handler/ha_innodb.cc(forInnoDB), or instorage/myisam/ftdefs.h(forMyISAM), see thetrue_word_char()andmisc_word_char()macros. Add“- - -”to one of those macros and recompile MySQL.

  • Modify a character set file: This requires no recompilation. Thetrue_word_char()macro uses a字符类型table to distinguish letters and numbers from other characters. . You can edit the contents of thearray in one of the character set XML files to specify that“- - -”is aletter.Then use the given character set for yourFULLTEXT索引。For information about thearray format, seeSection 10.13.1, “Character Definition Arrays”.

  • Add a new collation for the character set used by the indexed columns, and alter the columns to use that collation. For general information about adding collations, seeSection 10.14, “Adding a Collation to a Character Set”. For an example specific to full-text indexing, seeSection 12.10.7, “Adding a User-Defined Collation for Full-Text Indexing”.

Rebuilding InnoDB Full-Text Indexes

For the changes to take effect,FULLTEXTindexes must be rebuilt after modifying any of the following full-text index variables:innodb_ft_min_token_size;innodb_ft_max_token_size;innodb_ft_server_stopword_table;innodb_ft_user_stopword_table;innodb_ft_enable_stopword;ngram_token_size. Modifyinginnodb_ft_min_token_size,innodb_ft_max_token_size, orngram_token_sizerequires restarting the server.

To rebuildFULLTEXTindexes for anInnoDBtable, useALTER TABLEwith theDROP INDEXandADD INDEXoptions to drop and re-create each index.

Optimizing InnoDB Full-Text Indexes

RunningOPTIMIZE TABLEon a table with a full-text index rebuilds the full-text index, removing deleted Document IDs and consolidating multiple entries for the same word, where possible.

To optimize a full-text index, enableinnodb_optimize_fulltext_onlyand runOPTIMIZE TABLE.

mysql> set GLOBAL innodb_optimize_fulltext_only=ON; Query OK, 0 rows affected (0.01 sec) mysql> OPTIMIZE TABLE opening_lines; +--------------------+----------+----------+----------+ | Table | Op | Msg_type | Msg_text | +--------------------+----------+----------+----------+ | test.opening_lines | optimize | status | OK | +--------------------+----------+----------+----------+ 1 row in set (0.01 sec)

To avoid lengthy rebuild times for full-text indexes on large tables, you can use theinnodb_ft_num_word_optimizeoption to perform the optimization in stages. Theinnodb_ft_num_word_optimizeoption defines the number of words that are optimized each timeOPTIMIZE TABLEis run. The default setting is 2000, which means that 2000 words are optimized each timeOPTIMIZE TABLEis run. SubsequentOPTIMIZE TABLEoperations continue from where the precedingOPTIMIZE TABLEoperation ended.

Rebuilding MyISAM Full-Text Indexes

If you modify full-text variables that affect indexing (ft_min_word_len,ft_max_word_len, orft_stopword_file), or if you change the stopword file itself, you must rebuild yourFULLTEXTindexes after making the changes and restarting the server.

To rebuild theFULLTEXTindexes for aMyISAMtable, it is sufficient to do aQUICKrepair operation:

mysql> REPAIR TABLEtbl_nameQUICK;

Alternatively, useALTER TABLEas just described. In some cases, this may be faster than a repair operation.

Each table that contains anyFULLTEXTindex must be repaired as just shown. Otherwise, queries for the table may yield incorrect results, and modifications to the table cause the server to see the table as corrupt and in need of repair.

If you usemyisamchkto perform an operation that modifiesMyISAMtable indexes (such as repair or analyze), theFULLTEXTindexes are rebuilt using thedefaultfull-text parameter values for minimum word length, maximum word length, and stopword file unless you specify otherwise. This can result in queries failing.

The problem occurs because these parameters are known only by the server. They are not stored inMyISAMindex files. To avoid the problem if you have modified the minimum or maximum word length or stopword file values used by the server, specify the sameft_min_word_len,ft_max_word_len, andft_stopword_filevalues formyisamchkthat you use formysqld. For example, if you have set the minimum word length to 3, you can repair a table withmyisamchklike this:

myisamchk --recover --ft_min_word_len=3tbl_name.MYI

To ensure thatmyisamchkand the server use the same values for full-text parameters, place each one in both the[mysqld]and[myisamchk]sections of an option file:

[mysqld] ft_min_word_len=3 [myisamchk] ft_min_word_len=3

An alternative to usingmyisamchkforMyISAMtable index modification is to use theREPAIR TABLE,ANALYZE TABLE,OPTIMIZE TABLE, orALTER TABLEstatements. These statements are performed by the server, which knows the proper full-text parameter values to use.