Create a Custom Tokenizer
Create a custom tokenizer with the Couchbase Server Web Console to change how the Search Service creates tokens for matching Search index content to a Search query.
Prerequisites
- You have the Search Service enabled on a node in your database. For more information about how to deploy a new node and Services on your database, see Manage Nodes and Clusters.
- You have created an index. For more information, see Create a Basic Search Index with the Web Console.
- You have logged in to the Couchbase Server Web Console.
Procedure
You can create two types of custom tokenizers:
| Tokenizer Type | Description |
|---|---|
| Regular expression | The tokenizer uses any input that matches the regular expression to create new tokens. |
| Exception | The tokenizer removes any input that matches the regular expression, and creates tokens from the remaining input. You can choose another tokenizer to apply to the remaining input. |
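The following Python sketch illustrates the two behaviors as this page describes them. It uses Python's `re` module rather than the Search Service, so the regular expression syntax can differ from what the service accepts. The function names and the whitespace stand-in tokenizer are hypothetical and exist only for this illustration.

```python
import re
from typing import Callable, List


def regexp_tokenize(pattern: str, text: str) -> List[str]:
    """Create one token for every substring of text that matches pattern."""
    return [match.group(0) for match in re.finditer(pattern, text)]


def whitespace_tokenize(text: str) -> List[str]:
    """Hypothetical stand-in for the tokenizer applied to the remaining input."""
    return text.split()


def exception_tokenize(
    patterns: List[str],
    text: str,
    remaining: Callable[[str], List[str]] = whitespace_tokenize,
) -> List[str]:
    """Remove every match of the exception patterns, then tokenize what is left."""
    combined = "|".join(f"(?:{p})" for p in patterns)
    # Replace matches with a space so that neighboring words do not merge.
    stripped = re.sub(combined, " ", text)
    return remaining(stripped)


sample = "order 66 shipped in 3 days"
print(regexp_tokenize(r"[0-9]+", sample))       # ['66', '3']
print(exception_tokenize([r"[0-9]+"], sample))  # ['order', 'shipped', 'in', 'days']
```

With the same pattern, the regular expression tokenizer keeps only the matching input, while the exception tokenizer discards it and tokenizes the rest.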
Create a Regular Expression Tokenizer
To create a regular expression tokenizer with the Couchbase Server Web Console:
- Go to Search.
- Click the Search index where you want to create a custom tokenizer.
- Click Edit.
- Expand .
- Click Add Tokenizer.
- In the Name field, enter a name for the custom tokenizer.
- In the Type field, select regexp.
- In the Regular Expression field, enter the regular expression to use to split input into tokens.
- Click Save.
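For reference, the values entered in the steps above could be represented as a JSON fragment like the one the following Python sketch builds. The `type` value comes from the Type field above. The `regexp` key name, and the assumption that such a fragment sits under the tokenizers part of the index definition's analysis settings, are not confirmed by this page. The tokenizer name `word_chars` and the pattern are hypothetical examples.

```python
import json
import re


def regexp_tokenizer_definition(name: str, pattern: str) -> dict:
    """Build a JSON-style definition for a regular expression tokenizer.

    The "regexp" key name is an assumption, not taken from this page.
    """
    if not name:
        raise ValueError("the tokenizer needs a name")
    # Rough sanity check only: Python's regex syntax is not guaranteed to
    # match the syntax that the Search Service accepts.
    re.compile(pattern)
    return {name: {"type": "regexp", "regexp": pattern}}


# Hypothetical name and pattern, chosen only for illustration.
definition = regexp_tokenizer_definition("word_chars", r"[A-Za-z0-9]+")
print(json.dumps(definition, indent=2))
```

Compiling the pattern before saving catches obvious typos, such as an unclosed bracket, but it does not guarantee that the service interprets the pattern the same way.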
Create an Exception Custom Tokenizer
To create an exception custom tokenizer with the Couchbase Server Web Console:
- Go to Search.
- Do one of the following:
- Click the Search index where you want to create a custom tokenizer.
- Click Edit.
- Expand .
- Click Add Tokenizer.
- In the Name field, enter a name for the custom tokenizer.
- In the Type field, select exception.
- In the Exception Patterns field, enter a regular expression to use to remove content from input.
- To add the regular expression to the list of exception patterns, click Add.
- (Optional) To add more regular expressions to the list of exception patterns, repeat the previous two steps.
- In the Tokenizer for Remaining Input field, select a tokenizer to apply to input after removing any content that matches the exception patterns. For more information about the available tokenizers, see Default Tokenizers.
- Click Save.
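The values entered in the steps above could likewise be represented as a JSON fragment, as in the following Python sketch. The `type` value comes from the Type field above. The `exceptions` and `tokenizer` key names are assumptions, as is the use of `unicode` as the tokenizer for the remaining input. The tokenizer name `strip_digits` and the pattern are hypothetical examples.

```python
import json
import re
from typing import List


def exception_tokenizer_definition(
    name: str, patterns: List[str], remaining_tokenizer: str
) -> dict:
    """Build a JSON-style definition for an exception tokenizer.

    The "exceptions" and "tokenizer" key names are assumptions,
    not taken from this page.
    """
    if not patterns:
        raise ValueError("add at least one exception pattern")
    for pattern in patterns:
        # Rough sanity check only: Python's regex syntax is not guaranteed
        # to match the syntax that the Search Service accepts.
        re.compile(pattern)
    return {
        name: {
            "type": "exception",
            "exceptions": list(patterns),
            "tokenizer": remaining_tokenizer,
        }
    }


# Hypothetical name and pattern; "unicode" is assumed to be an available tokenizer.
definition = exception_tokenizer_definition("strip_digits", [r"[0-9]+"], "unicode")
print(json.dumps(definition, indent=2))
```

Each entry in the pattern list corresponds to one regular expression added with Add in the steps above.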
Next Steps
After you create a custom tokenizer, you can use it with a custom analyzer.
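As a rough illustration, a custom analyzer that refers to a custom tokenizer by name might look like the fragment the following Python sketch builds. The key names (`type`, `tokenizer`, `token_filters`), the `custom` type value, and the `to_lower` token filter are assumptions that this page does not confirm. The names `my_analyzer` and `word_chars` are hypothetical.

```python
import json
from typing import List, Optional


def custom_analyzer_definition(
    name: str, tokenizer_name: str, token_filters: Optional[List[str]] = None
) -> dict:
    """Build a JSON-style analyzer definition that uses a named tokenizer.

    All key names here are assumptions, not taken from this page.
    """
    return {
        name: {
            "type": "custom",
            "tokenizer": tokenizer_name,  # the name given to the custom tokenizer
            "token_filters": list(token_filters or []),
        }
    }


# Hypothetical analyzer and tokenizer names, chosen only for illustration.
analyzer = custom_analyzer_definition("my_analyzer", "word_chars", ["to_lower"])
print(json.dumps(analyzer, indent=2))
```

The analyzer refers to the tokenizer only by its name, so the name entered in the Name field must match exactly.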
To continue customizing your Search index, you can also:
To run a search and test the contents of your Search index, see Run A Simple Search with the Web Console or Run a Simple Search with the REST API and curl/HTTP.