Overview

A tokenizer is an essential component of an analyzer: it receives a stream of characters as input, breaks that stream into individual tokens (typically single words), and outputs a stream of tokens.
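You can see this behavior directly through Elasticsearch's `_analyze` API, which runs a tokenizer on sample text and returns the resulting tokens. The sketch below is a minimal example, assuming a local Elasticsearch node at `http://localhost:9200` and the Python `requests` library; the sample text is purely illustrative.

```python
# Minimal sketch: ask Elasticsearch to tokenize sample text with the
# built-in "standard" tokenizer via the _analyze API.
# Assumes a local node at http://localhost:9200 (adjust as needed).
import requests

response = requests.post(
    "http://localhost:9200/_analyze",
    json={
        "tokenizer": "standard",          # built-in tokenizer to try out
        "text": "The quick brown fox!",   # illustrative input character stream
    },
)

# Each entry in "tokens" contains the token text plus its offsets and position.
for token in response.json()["tokens"]:
    print(token["token"], token["start_offset"], token["end_offset"])
```

Running this against a local node should print one line per token (for example `The`, `quick`, `brown`, `fox`), showing how the character stream has been split while punctuation is discarded.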

Elasticsearch provides several built-in tokenizers that can be used for different types of text analysis. Here are some of the most commonly used ones:
