Amazon OpenSearch Service
Explore how Amazon OpenSearch helps in creating RESTful analytical engines.
OpenSearch is an open-source search and analytics engine. It provides fast access to large amounts of data for analysis, and it ships with OpenSearch Dashboards for visualization and exploration. It supports multiple analytical capabilities, such as k-nearest neighbors (k-NN) search, SQL, Anomaly Detection, Machine Learning Commons, Trace Analytics, full-text search, and more.
OpenSearch is a
How Amazon OpenSearch works
OpenSearch has three main components: a distributed analytics engine, a data visualization interface, and a data ingestion layer built on Data Prepper.
OpenSearch Domain
The distributed analytics engine is a cluster called an OpenSearch domain, which runs OpenSearch (a fork of Elasticsearch) or a legacy Elasticsearch version. A domain is a collection of nodes and configurations that define an OpenSearch Service environment. EC2 instances act as the nodes in this cluster, which allows us to process large volumes of data, execute complex queries, and perform aggregations.
To create an OpenSearch domain, we specify the configuration details, such as instance types, storage options, network settings, access policies, and more. The domain then exposes the endpoint through which we access and interact with the cluster.
The OpenSearch service charges for the EC2 instances comprising the cluster and the cumulative size of the EBS volumes attached; standard data transfer charges may also apply. To reduce costs further, Amazon OpenSearch offers OpenSearch Serverless, which automatically provisions, manages, and scales the compute capacity based on the application’s needs.
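The configuration details listed above map fairly directly onto the request that the boto3 `opensearch` client's `create_domain` call accepts. The following is a minimal sketch, not a production setup: the domain name, engine version, instance type, volume size, and both ARNs are illustrative assumptions, and network settings are omitted.

```python
import json


def build_domain_request(domain_name, instance_type="t3.small.search",
                         instance_count=1, volume_gib=10,
                         principal_arn=None, domain_arn=None):
    """Assemble keyword arguments for boto3's OpenSearch `create_domain` call.

    All default values here are illustrative assumptions, not recommendations.
    """
    request = {
        "DomainName": domain_name,
        "EngineVersion": "OpenSearch_2.11",  # example version string (assumption)
        "ClusterConfig": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
        },
        "EBSOptions": {
            "EBSEnabled": True,
            "VolumeType": "gp3",
            "VolumeSize": volume_gib,  # GiB per data node
        },
    }
    if principal_arn and domain_arn:
        # Resource-based access policy: allow one principal to call the HTTP API.
        policy = {
            "Version": "2012-10-17",
            "Statement": [{
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": "es:ESHttp*",
                "Resource": f"{domain_arn}/*",
            }],
        }
        request["AccessPolicies"] = json.dumps(policy)
    return request


def create_domain(request):
    """Send the request to AWS. Needs boto3, credentials, and a configured region."""
    import boto3  # imported lazily so the builder above works without boto3
    return boto3.client("opensearch").create_domain(**request)
```

Keeping the request builder separate from the AWS call makes it possible to review or unit test the configuration before any resources are created.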
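The pricing model for a provisioned domain can be illustrated with simple arithmetic: instance hours plus attached storage. This is a rough sketch only. The function name is hypothetical, the rates are passed in as parameters because actual prices vary by region and instance type, and 730 hours per month is a common approximation.

```python
HOURS_PER_MONTH = 730  # common approximation for an average month


def estimate_monthly_cost(instance_count, hourly_rate_usd,
                          ebs_gib_per_node, ebs_rate_usd_per_gib_month):
    """Estimate the monthly cost of a provisioned domain in USD.

    Covers only instance hours and EBS storage; data transfer and
    other charges are left out of this sketch.
    """
    compute = instance_count * hourly_rate_usd * HOURS_PER_MONTH
    storage = instance_count * ebs_gib_per_node * ebs_rate_usd_per_gib_month
    return round(compute + storage, 2)
```

For example, with placeholder rates (not real prices) of 0.10 USD per instance hour and 0.10 USD per GiB-month, `estimate_monthly_cost(3, 0.10, 100, 0.10)` returns 249.0: 219 for compute and 30 for storage.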
OpenSearch Dashboards
OpenSearch Dashboards is an open-source visualization tool provided with each OpenSearch domain. It is a JavaScript application that provides functionality similar to the Kibana dashboards in Elasticsearch clusters.
OpenSearch Ingestion
OpenSearch Ingestion is a fully managed serverless data collector. It collects and delivers real-time logs, metrics, and trace data. With OpenSearch Ingestion, we don't need third-party tools such as Logstash to ingest data into the domain. We can directly configure data sources and have them send data to an OpenSearch Ingestion pipeline.
OpenSearch Ingestion is powered by Data Prepper, which is an open-source data collector. It supports real-time data streaming, batch data processing, and transformation tasks. Data Prepper offers features such as data buffering, schema validation, and error handling to ensure reliable and efficient data ingestion.
OpenSearch Ingestion offers several benefits, such as automatic scaling, automatically applied security and bug patches, cost control by stopping and starting a pipeline, an added layer of security by connecting the pipeline to a VPC, and more.
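To make the pipeline idea concrete, here is a hedged sketch of a request for the boto3 `osis` client's `create_pipeline` call, which takes a Data Prepper configuration as a YAML string. The pipeline name, HTTP path, index name, role ARN, and capacity limits are all assumptions chosen for illustration.

```python
def build_pipeline_request(pipeline_name, domain_endpoint, index,
                           role_arn, region, min_units=1, max_units=4):
    """Assemble keyword arguments for boto3's `osis` `create_pipeline` call."""
    # Data Prepper configuration: an HTTP source feeding an OpenSearch sink.
    # The sub-pipeline name and path are illustrative assumptions.
    config_body = f"""version: "2"
log-pipeline:
  source:
    http:
      path: "/logs"
  sink:
    - opensearch:
        hosts: ["{domain_endpoint}"]
        index: "{index}"
        aws:
          sts_role_arn: "{role_arn}"
          region: "{region}"
"""
    return {
        "PipelineName": pipeline_name,
        "MinUnits": min_units,   # lower bound for automatic scaling
        "MaxUnits": max_units,   # upper bound for automatic scaling
        "PipelineConfigurationBody": config_body,
    }


def create_pipeline(request):
    """Send the request to AWS. Needs boto3, credentials, and a configured region."""
    import boto3  # imported lazily so the builder above works without boto3
    client = boto3.client("osis")
    # The same client also exposes stop_pipeline(PipelineName=...) and
    # start_pipeline(PipelineName=...), which is how costs can be paused.
    return client.create_pipeline(**request)
```

The `MinUnits` and `MaxUnits` values bound the automatic scaling mentioned above.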
Use case: Partial queries for IoT devices
OpenSearch is often used alongside other database services. For example, suppose we have a fleet of IoT devices deployed in various locations, each generating sensor data at regular intervals. We want to monitor this data in real time and perform analytics to identify patterns, anomalies, and trends. We designed an application that gathers information about all the devices that write to a DynamoDB table, which stores the important data about them.
The application also offers a search functionality that fetches information regarding IoT devices through
When an IoT device pushes data, DynamoDB stores it in real time, and DynamoDB Streams captures each change as a stream record describing the device. The stream triggers an event that invokes a Lambda function, and the Lambda function passes this data on to OpenSearch.
Now, suppose an application user searches for a device using a partial query, for example, a partial device ID or location. OpenSearch analyzes the query and returns the matching items to the user.
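The Lambda step in this flow can be sketched as follows. This is a minimal illustration, assuming the stream is configured to include new images, that the table's key attribute is called `device_id`, and that the target index is called `iot-devices` (all hypothetical names). It converts stream records into the newline-delimited body expected by the OpenSearch `_bulk` API; the HTTP call itself is left out because authentication is deployment-specific.

```python
import json


def from_dynamodb(attr):
    """Convert one DynamoDB-typed attribute to plain Python (common types only)."""
    (type_tag, value), = attr.items()
    if type_tag == "S":
        return value
    if type_tag == "N":
        try:
            return int(value)
        except ValueError:
            return float(value)
    if type_tag == "BOOL":
        return value
    if type_tag == "NULL":
        return None
    if type_tag == "M":
        return {k: from_dynamodb(v) for k, v in value.items()}
    if type_tag == "L":
        return [from_dynamodb(v) for v in value]
    raise ValueError(f"unsupported DynamoDB type: {type_tag}")


def records_to_bulk(event, index="iot-devices", key_attr="device_id"):
    """Turn a DynamoDB Streams event into an NDJSON body for the _bulk API."""
    lines = []
    for record in event.get("Records", []):
        data = record["dynamodb"]
        doc_id = str(from_dynamodb(data["Keys"][key_attr]))
        if record["eventName"] == "REMOVE":
            lines.append(json.dumps({"delete": {"_index": index, "_id": doc_id}}))
        else:  # INSERT or MODIFY: (re)index the latest version of the item
            doc = {k: from_dynamodb(v) for k, v in data["NewImage"].items()}
            lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
            lines.append(json.dumps(doc))
    return ("\n".join(lines) + "\n") if lines else ""


def handler(event, context):
    """Lambda entry point: build the bulk body and report how many lines it has."""
    body = records_to_bulk(event)
    # Sending is omitted: POST `body` to <domain endpoint>/_bulk with
    # Content-Type: application/x-ndjson and signed or otherwise authorized.
    return {"bulk_lines": body.count("\n")}
```

Using the device ID as the document `_id` means a MODIFY event overwrites the earlier document instead of creating a duplicate.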
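One way to express such a partial search in the OpenSearch query DSL is to combine a `wildcard` query on the device ID with a `match_phrase_prefix` query on the location. The sketch below only builds the request body. The field names `device_id` (assumed to be a keyword field) and `location` (assumed to be a text field) are hypothetical, and this is one possible approach, not necessarily the one the application uses.

```python
def escape_wildcards(text):
    """Escape characters that have special meaning in a wildcard query."""
    for ch in ("\\", "*", "?"):  # backslash first, so escapes are not re-escaped
        text = text.replace(ch, "\\" + ch)
    return text


def build_partial_query(term, size=10):
    """Build a search body matching a partial device ID or a location prefix."""
    return {
        "size": size,
        "query": {
            "bool": {
                "should": [
                    # Substring match on the device ID.
                    {"wildcard": {"device_id": {"value": f"*{escape_wildcards(term)}*"}}},
                    # Prefix match on the last word of the location.
                    {"match_phrase_prefix": {"location": {"query": term}}},
                ],
                "minimum_should_match": 1,
            }
        },
    }
```

The resulting dictionary can be sent as JSON to the domain's `_search` endpoint for the relevant index. Leading wildcards can be slow on large indexes, so an n-gram analyzer is a common alternative when partial matching is a core feature.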