Full-Text Search – A Beginner’s Guide

In my experience, many DBA’s and developers seem to shy away from Full-Text Search (http://technet.microsoft.com/en-us/library/ms142571.aspx).  But, it is not as complicated as you may think.  In this post, I am going to attempt to make it as easy as possible to understand.

*** NOTE: Full-Text Search is different in SQL Server 2008 and newer versions from older versions.  So, I am just going to focus on newer versions.

Why might you decide to use SQL Servers’ Full-Text Search?

  • Faster searches – Full-text search can provide faster searches for specific word(s) in a field.  For example, suppose you want to search the text of a column in a table for a specific word like “test”.  A normal nonclustered index would not be ideal.  If you coded it “ColumnName LIKE ‘%test%’”, then SQL Server would most likely perform an index scan and this would NOT scale well.
  • Large data field – SQL Server has a 900-byte limit for the maximum total size of all index key columns (http://technet.microsoft.com/en-us/library/ms191241.aspx).  So, if you wanted to index and search on a column larger than that, you could use full-text search.
  • Complex searching – In using the LIKE keyword, you can do simple pattern matching.  However, with full-text queries they are language aware, and can apply specific transformations at index and query time, such as filtering stopwords and making thesaurus and inflectional expansions.

How do I setup SQL Servers’ Full-Text Search?

First, make sure that the Full-Text Search feature is installed from the SQL Server installation.

Second, take note of the following code that handles these steps:

-- create table
CREATE TABLE dbo.FTSearch(
RowId INT IDENTITY(1,1),
RowText NVARCHAR(4000)
)
ALTER TABLE [dbo].[FTSearch] ADD  CONSTRAINT [PK_FTSearch] PRIMARY KEY CLUSTERED ([RowId] ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

-- populate table
INSERT INTO dbo.FTSearch
SELECT text
FROM master.sys.syscomments
WHERE colid = 1
GO

-- enable full-text search
CREATE FULLTEXT CATALOG [TestFTSearch] WITH ACCENT_SENSITIVITY = ON
AS DEFAULT AUTHORIZATION [dbo]
GO
CREATE FULLTEXT INDEX ON [dbo].[FTSearch]([RowText]) KEY INDEX [PK_FTSearch] ON ([TestFTSearch]) WITH (CHANGE_TRACKING AUTO)
GO
ALTER FULLTEXT INDEX ON [dbo].[FTSearch] ENABLE
GO

How do I query the data using Full-Text Search?

SQL Server full-text queries (http://technet.microsoft.com/en-us/library/ms142583.aspx) use the following syntax:

SELECT *
FROM dbo.FTSearch
WHERE CONTAINS(RowText, 'test')

SELECT *
FROM dbo.FTSearch
WHERE FREETEXT(RowText, 'test')

  • CONTAINSTABLE (http://technet.microsoft.com/en-us/library/ms189760.aspx)– A function that returns a table of zero, one, or more rows for columns matched, similar to the CONTAINS predicate.  However, unlike CONTAINS, queries using CONTAINSTABLE return a relevance ranking value (RANK) and full-text key (KEY) for each row.

SELECT *
FROM CONTAINSTABLE(dbo.FTSearch, RowText, 'test')

  • FREETEXTTABLE (http://technet.microsoft.com/en-us/library/ms177652.aspx) – A function that returns a table of zero, one, or more rows for the columns matched, similar to the FREETEXT predicate.  However, unlike FREETEXT, queries using FREETEXTTABLE return a relevance ranking value (RANK) and full-text key (KEY) for each row.

SELECT *
FROM FREETEXTTABLE(dbo.FTSearch, RowText, 'test')

*** NOTE: These are really basic examples of searching, and those links provide much more depth on how to do more complex searching.

How do I maintain the indexes of SQL Servers’ Full-Text Search?

Like any index, full-text search indexes need to be maintained.  The key to determining when this maintenance is needed, is to look at the catalog view, sys.fulltext_index_fragments (http://technet.microsoft.com/en-us/library/cc280700.aspx).

SELECT *
FROM sys.fulltext_index_fragments

At this point, you will probably not see many fragments, if you followed the sample script above.  If you want to see some fragments, run the following code multiple times:

INSERT INTO dbo.FTSearch
SELECT text
FROM master.sys.syscomments
WHERE colid = 1

When there are many queryable fragments, we could use the command ALTER FULLTEXT CATALOG (http://technet.microsoft.com/en-us/library/ms176095.aspx).  This command has options to REORGANIZE (merging the smaller indexes into one large index) or REBUILD (deleting the existing catalog and creating a new catalog).

ALTER FULLTEXT CATALOG TestFTSearch REORGANIZE

Additional references to explore:

So, this post represented my attempt to present SQL Server Free-Text Search in a simple and easy to understand manner.  Of course, it can get much more complex depending on your circumstances, but I leave that for you to figure out.

Advertisements
This entry was posted in DBA, Developer, TSQL and tagged . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s