Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: Published by Scroll Versions from space TDDM and version 1

Since Totara 12 it has been possible to add full text search indexes into database schema, and to execute full text searches against them.

Working with full text search solutions

When is it appropriate to use full text searches

Full text searches are possible, but they should not be abused.

...

If you do decide to work with full text searches then this document should provide you with all the information you need to get started.

Understanding language support

Unlike other existing DDL features full text search is language aware.
The language is specified during the creation of the index and is used by your database in the production of the index.
It is used for the likes of breaking words, ignoring noise words (a, it, and etc) and for other linguistic features provided by the full text search index of your database.

...

The actual value is database dependent.

PostgreSQL

PostgreSQL is using 'english' configuration for full text search by default, for list of available options see result of "SELECT cfgname FROM pg_ts_config;".
For example:

...

Code Block
themeEmacs
$CFG->dboptions['fts3bworkaround'] = true;

MySQL

MySQL is using case and accent insensitive collation for full text search by default, you can specify a different collation here, for example:

...

Code Block
themeEmacs
$CFG->dboptions['fts3bworkaround'] = true;

MSSQL

MS SQL Server is using 'English' language by default, list of options in the MSSQL full text languages documentation.

Code Block
themeEmacs
$CFG->dboptions['ftslanguage'] = 'English';
$CFG->dboptions['ftslanguage'] = 'German';
$CFG->dboptions['ftslanguage'] = 'Japanese';
$CFG->dboptions['ftslanguage'] = 1028; // Traditional Chinese
$CFG->dboptions['ftslanguage'] = 2052; // Simplified Chinese

Developing full text search solutions

Defining a full text search solution

Full text search indexes must be defined within the INSTALL.xml file for your plugin/component.
We recommend adding them through the XMLDB editor that can be found within the product to ensure that the definitions are perfect.
The XMLDB editor can also give you the PHP code required to add the search index during upgrade.

...

Code Block
languagexml
themeEmacs
titleExample 1
<TABLE NAME="searchables" COMMENT="Searchable learning items">
  <FIELDS>
    <FIELD NAME="id" TYPE="int" LENGTH="10" NOTNULL="true" SEQUENCE="true"/>
    <FIELD NAME="component" TYPE="char" LENGTH="100" NOTNULL="true" SEQUENCE="false" COMMENT="The component this record belongs to"/>
    <FIELD NAME="type" TYPE="char" LENGTH="100" NOTNULL="true" SEQUENCE="false" COMMENT="The type of this record within the component"/>
    <FIELD NAME="instanceid" TYPE="int" LENGTH="10" NOTNULL="true" SEQUENCE="false" COMMENT="The identifier for this record in the component table"/>
    <FIELD NAME="timemodified" TYPE="int" LENGTH="10" NOTNULL="true" SEQUENCE="false" COMMENT="When this record was last modified"/>
    <FIELD NAME="high" TYPE="text" NOTNULL="false" SEQUENCE="false" COMMENT="Searchable text that is high value"/>
    <FIELD NAME="medium" TYPE="text" NOTNULL="false" SEQUENCE="false" COMMENT="Searchable text that is of medium value"/>
    <FIELD NAME="low" TYPE="text" NOTNULL="false" SEQUENCE="false" COMMENT="Searchable text that is low value"/>
  </FIELDS>
  <KEYS>
    <KEY NAME="primary-id" TYPE="primary" FIELDS="id"/>
  </KEYS>
  <INDEXES>
    <INDEX NAME="component-type-id" UNIQUE="true" FIELDS="component, type, instanceid"/>
    <INDEX NAME="high" UNIQUE="false" FIELDS="high" HINTS="full_text_search"/>
    <INDEX NAME="medium" UNIQUE="false" FIELDS="medium" HINTS="full_text_search"/>
    <INDEX NAME="low" UNIQUE="false" FIELDS="low" HINTS="full_text_search"/>
  </INDEXES>
</TABLE>

Populating the search table

During installation or upgrade your new table will be created, as will any full text search indexes on it.

...

In both cases, the cron routine, and the event observers it is important that all data is formatted specifically for searching prior to inserting it into your new table.
This can be done by calling the $DB→unformat_fts_content() method on each piece of textual content that will be inserted into a field with a full text search index on it.
By always using this method you can ensure that your content is consistently and correctly formatted in the best possible way.

Running a full text search

Once an index exists running a full text search can be easily run by using preparing a query that includes the full text search as a subquery.
This is facilitated by our DML layer in the following fashion:

...

Read on for more information on the get_fts_subquery.

The DML unformat_fts_content() method

This method should be used to ensure that all content written into a field  on which a full text search index exists is correctly and consistently formatted for searching.

...

It returns the now prepared content.

The DML get_fts_subquery() method

This method is the cornerstone of the of the full text search DML functionality.
It produces SQL for a subquery that executes the search and can be easily included in the join in order to filter results using a full text search.

...

  • It only executes the search in the one set language for the site.
  • The search table MUST contain a field called id.
  • At least one search field must be provided.
  • The weight must be a positive number.
  • If an empty search term is provided a debugging notice will be triggered and an empty result set will be returned.
  • The SQL snippet provided exposes two fields, id and score.
    • id: this is the id of the record in the table being searched.
    • score: the score is a float used to rank the results.
      It's actual value will depend upon how well the record matches the search term.
      The range will be database dependent and should not be relied upon.
  • The SQL snippet provided does not order the results. You need to do this yourself when making the search.

Recommendations, notes and FAQs

The following are things that we think will help you when designing and implementing your full text search solution.

Don't add full text search indexes to existing fields

This may seem like a good idea. It is not.

...

It also ensures that you never get into a situation where when adding a full text search index that you have to update every single record in a table by running it through unformat_fts_content() during upgrade.

Don't add full text search indexes to existing tables

We recommend creating a new table when designing full text solutions.
On occasion you may be tempted to add a new field to an existing table, and to add a full text search to that new field.
This avoids the trouble noted above in why you shouldn't add full text searches to existing fields.
However we recommend you add a new table still for a number of reasons:

  1. If the new table contains only searchable records and references the original table then you can essentially truncate it and repopulate it when building the index. The system can essentially treat it like a temp table.
  2. Having fields in an existing table adds complications to observers, particularly if the event is triggered within a transaction, and may lead you to having to abandon live updates via events. These cause lead to hard to replicate regressions.
  3. Requires additional consideration in how the site upgrades to ensure that everything ends up consistent in all upgrade paths.
  4. Promotes good separation of concerns.