In my experience, many DBA’s and developers seem to shy away from Full-Text Search (http://technet.microsoft.com/en-us/library/ms142571.aspx). But, it is not as complicated as you may think. In this post, I am going to attempt to make it as easy as possible to understand.
*** NOTE: Full-Text Search is different in SQL Server 2008 and newer versions from older versions. So, I am just going to focus on newer versions.
Why might you decide to use SQL Servers’ Full-Text Search?
- Faster searches – Full-text search can provide faster searches for specific word(s) in a field. For example, suppose you want to search the text of a column in a table for a specific word like “test”. A normal nonclustered index would not be ideal. If you coded it “ColumnName LIKE ‘%test%’”, then SQL Server would most likely perform an index scan and this would NOT scale well.
- Large data field – SQL Server has a 900-byte limit for the maximum total size of all index key columns (http://technet.microsoft.com/en-us/library/ms191241.aspx). So, if you wanted to index and search on a column larger than that, you could use full-text search.
- Complex searching – In using the LIKE keyword, you can do simple pattern matching. However, with full-text queries they are language aware, and can apply specific transformations at index and query time, such as filtering stopwords and making thesaurus and inflectional expansions.
How do I setup SQL Servers’ Full-Text Search?
First, make sure that the Full-Text Search feature is installed from the SQL Server installation.
Second, take note of the following code that handles these steps:
- Create a table with a primary key
- Create a full-text catalog (http://technet.microsoft.com/en-us/library/ms189520.aspx)
- Create a full-text index (http://technet.microsoft.com/en-us/library/ms187317.aspx)
-- create table
CREATE TABLE dbo.FTSearch(
RowId INT IDENTITY(1,1),
ALTER TABLE [dbo].[FTSearch] ADD CONSTRAINT [PK_FTSearch] PRIMARY KEY CLUSTERED ([RowId] ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
-- populate table
INSERT INTO dbo.FTSearch
WHERE colid = 1
-- enable full-text search
CREATE FULLTEXT CATALOG [TestFTSearch] WITH ACCENT_SENSITIVITY = ON
AS DEFAULT AUTHORIZATION [dbo]
CREATE FULLTEXT INDEX ON [dbo].[FTSearch]([RowText]) KEY INDEX [PK_FTSearch] ON ([TestFTSearch]) WITH (CHANGE_TRACKING AUTO)
ALTER FULLTEXT INDEX ON [dbo].[FTSearch] ENABLE
How do I query the data using Full-Text Search?
SQL Server full-text queries (http://technet.microsoft.com/en-us/library/ms142583.aspx) use the following syntax:
- CONTAINS (http://technet.microsoft.com/en-us/library/ms187787.aspx) – A predicate which searches for precise or fuzzy (less precise) matches to single words and phrases, words within a certain distance of one another, or weighted matches.
WHERE CONTAINS(RowText, 'test')
- FREETEXT (http://technet.microsoft.com/en-us/library/ms176078.aspx) – A predicate which searches for values that match the meaning and not just the exact wording of the words in the search condition.
WHERE FREETEXT(RowText, 'test')
- CONTAINSTABLE (http://technet.microsoft.com/en-us/library/ms189760.aspx)– A function that returns a table of zero, one, or more rows for columns matched, similar to the CONTAINS predicate. However, unlike CONTAINS, queries using CONTAINSTABLE return a relevance ranking value (RANK) and full-text key (KEY) for each row.
FROM CONTAINSTABLE(dbo.FTSearch, RowText, 'test')
- FREETEXTTABLE (http://technet.microsoft.com/en-us/library/ms177652.aspx) – A function that returns a table of zero, one, or more rows for the columns matched, similar to the FREETEXT predicate. However, unlike FREETEXT, queries using FREETEXTTABLE return a relevance ranking value (RANK) and full-text key (KEY) for each row.
FROM FREETEXTTABLE(dbo.FTSearch, RowText, 'test')
*** NOTE: These are really basic examples of searching, and those links provide much more depth on how to do more complex searching.
How do I maintain the indexes of SQL Servers’ Full-Text Search?
Like any index, full-text search indexes need to be maintained. The key to determining when this maintenance is needed, is to look at the catalog view, sys.fulltext_index_fragments (http://technet.microsoft.com/en-us/library/cc280700.aspx).
At this point, you will probably not see many fragments, if you followed the sample script above. If you want to see some fragments, run the following code multiple times:
INSERT INTO dbo.FTSearch
WHERE colid = 1
When there are many queryable fragments, we could use the command ALTER FULLTEXT CATALOG (http://technet.microsoft.com/en-us/library/ms176095.aspx). This command has options to REORGANIZE (merging the smaller indexes into one large index) or REBUILD (deleting the existing catalog and creating a new catalog).
ALTER FULLTEXT CATALOG TestFTSearch REORGANIZE
Additional references to explore:
- Full-Text Search and Semantic Search Catalog Views (http://technet.microsoft.com/en-us/library/cc280702.aspx)
- Full-Text Search Dynamic Management Views and Functions (http://technet.microsoft.com/en-us/library/ms174971.aspx)
So, this post represented my attempt to present SQL Server Free-Text Search in a simple and easy to understand manner. Of course, it can get much more complex depending on your circumstances, but I leave that for you to figure out.