OpenSearch is a community-driven, open-source search and analytics suite. Originally a fork of Elasticsearch and Kibana, it was created to ensure the community retains an open-source alternative for indexing and searching large datasets.
OpenSearch makes it possible to store, search, and visualize data in real time, scaling up to handle extensive datasets without performance degradation. It covers various applications, including log analytics, business intelligence, and full-text search.
Its features include a RESTful API, integrated security, alerting, and machine learning capabilities, making it suitable for operational and business use cases. OpenSearch supports customization, along with community-contributed plugins and extensions that can further enhance its functionalities.
There are several clients that can be used to interface with OpenSearch in Python.
The low-level Python client, opensearch-py, serves as an interface to the OpenSearch REST API, enabling more natural interactions with an OpenSearch cluster within a Python environment. Instead of sending raw HTTP requests, users can create an OpenSearch client instance and utilize built-in functions to perform various operations. This approach streamlines the process of managing OpenSearch clusters and executing API requests.
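opensearch-py-ml is a specialized Python client designed to enhance data analytics and natural language processing (NLP) capabilities within OpenSearch. This client allows data analysts to work with OpenSearch indexes through pandas-like DataFrame APIs and to upload, train, and tune sentence-transformer models for use in OpenSearch.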
The high-level Python client for OpenSearch, known as opensearch-dsl-py, offers convenient wrapper classes for handling OpenSearch entities, such as documents, as Python objects. This client simplifies the process of writing queries and provides accessible Python methods for frequent OpenSearch tasks, such as creating, indexing, and updating documents, as well as conducting searches with and without filters.
However, it is important to note that opensearch-dsl-py will be deprecated after version 2.1.0. Users are encouraged to transition to the low-level Python client, opensearch-py, which has incorporated the functionalities of the high-level client.
In this tutorial, we’ll walk through the steps to use OpenSearch from Python: connecting to an OpenSearch cluster, creating an index, indexing a document, performing bulk operations, searching for documents, and deleting documents and indexes. The tutorial steps are adapted from the official documentation.
First, install the OpenSearch Python client using pip:
pip3 install opensearch-py
Once installed, import the client in your Python script:
from opensearchpy import OpenSearch
To connect the client to the OpenSearch host, create a client object. If using the Security plugin, enable SSL and provide authentication details. Here’s an example:
host = 'localhost'
port = 9200
auth = ('admin', 'admin')
ca_certs_path = '/full/path/to/root-ca.pem'
client = OpenSearch(
    hosts=[{'host': host, 'port': port}],
    http_compress=True,
    http_auth=auth,
    use_ssl=True,
    verify_certs=True,
    ssl_assert_hostname=False,
    ssl_show_warn=False,
    ca_certs=ca_certs_path
)
# Print cluster info to confirm that the connection works
print(client.info())
If your organization uses client certificates, include them as follows:
client_cert_path = '/full/path/to/client.pem'
client_key_path = '/full/path/to/client-key.pem'
client = OpenSearch(
    hosts=[{'host': host, 'port': port}],
    http_compress=True,
    http_auth=auth,
    client_cert=client_cert_path,
    client_key=client_key_path,
    use_ssl=True,
    verify_certs=True,
    ssl_assert_hostname=False,
    ssl_show_warn=False,
    ca_certs=ca_certs_path
)
# Print cluster info to confirm that the connection works
print(client.info())
For connections without the Security plugin, disable SSL:
client = OpenSearch(
    hosts=[{'host': host, 'port': port}],
    http_compress=True,
    use_ssl=False,
    verify_certs=False,
    ssl_assert_hostname=False,
    ssl_show_warn=False
)
# Print cluster info to confirm that the connection works
print(client.info())
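The connection variants above differ only in a handful of keyword arguments. As a minimal sketch, the hypothetical helper below (build_client_kwargs is an illustrative name, not part of opensearch-py) assembles those arguments in one place, assuming that passing auth means the Security plugin is enabled. The keyword names are the ones used in the examples above.

```python
def build_client_kwargs(host, port, auth=None, ca_certs=None,
                        client_cert=None, client_key=None):
    """Assemble keyword arguments for OpenSearch(...).

    Hypothetical helper: SSL is switched on only when `auth` is given,
    mirroring the secured and unsecured examples above.
    """
    secured = auth is not None
    kwargs = {
        'hosts': [{'host': host, 'port': port}],
        'http_compress': True,
        'use_ssl': secured,
        'verify_certs': secured,
        'ssl_assert_hostname': False,
        'ssl_show_warn': False,
    }
    if secured:
        kwargs['http_auth'] = auth
        if ca_certs:
            kwargs['ca_certs'] = ca_certs
        # Client certificates are only added when both halves are present.
        if client_cert and client_key:
            kwargs['client_cert'] = client_cert
            kwargs['client_key'] = client_key
    return kwargs


# Usage (requires opensearch-py and a reachable cluster):
# from opensearchpy import OpenSearch
# client = OpenSearch(**build_client_kwargs(
#     'localhost', 9200,
#     auth=('admin', 'admin'),
#     ca_certs='/full/path/to/root-ca.pem'))
```

Keeping the arguments in a plain dictionary makes it easy to log or inspect the configuration before a client is created.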
Before proceeding, please make sure:
To connect to Amazon OpenSearch Service, use AWS credentials and specify the connection details:
from opensearchpy import OpenSearch, RequestsHttpConnection, AWSV4SignerAuth
import boto3
host = 'my-example-domain.us-east-1.es.amazonaws.com'
region = 'us-east-1'
service = 'es'
credentials = boto3.Session().get_credentials()
auth = AWSV4SignerAuth(credentials, region, service)
client = OpenSearch(
    hosts=[{'host': host, 'port': 443}],
    http_auth=auth,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection,
    pool_maxsize=30
)
# Print cluster info to confirm that the connection works
print(client.info())
The domain's access policy denies access by default. To allow access for your IAM user, use a policy like the following JSON:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::XXXXXXXX:user/YOUR_USER"
      },
      "Action": "es:*",
      "Resource": "arn:aws:es:us-east-1:XXXXXXXXX:domain/myostestagile/*"
    }
  ]
}
To create an index, use a script similar to this:
index_name = 'python-example-index'
index_body = {
    'settings': {
        'index': {
            'number_of_shards': 5
        }
    }
}
response = client.indices.create(index_name, body=index_body)
print('Creating index:', response)
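Creating an index that already exists raises an error, so rerunning the script above fails on the second run. A minimal sketch of one way around this, assuming a hypothetical helper named ensure_index that checks client.indices.exists() before calling client.indices.create():

```python
def ensure_index(client, index_name, index_body=None):
    """Create the index only if it is missing (hypothetical helper).

    Returns True when the index was created, False when it already existed.
    `client` is expected to be an opensearchpy.OpenSearch instance.
    """
    if client.indices.exists(index=index_name):
        return False
    client.indices.create(index=index_name, body=index_body or {})
    return True


# Usage, with the client and index_body defined above:
# created = ensure_index(client, 'python-example-index', index_body)
# print('Index created:', created)
```

The check and the create call are two separate requests, so this is not safe against another process creating the same index in between; for a single tutorial script that is usually acceptable.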
To index a document, use the client.index() method. For example:
document = {
    'title': 'To Kill a Mockingbird',
    'author': 'Harper Lee',
    'year': '1960'
}
response = client.index(
    index='python-example-index',
    body=document,
    id='1',
    refresh=True
)
print('Adding document:', response)
For bulk operations, use the client.bulk() method. A single request can carry multiple operations, which can be of the same or different types. Each action line and each document line must end with \n. For example:
books = (
    '{ "index" : { "_index" : "example-dsl-index", "_id" : "1" } }\n'
    '{ "title" : "To Kill a Mockingbird", "author" : "Harper Lee", "year" : "1960" }\n'
    '{ "create" : { "_index" : "example-dsl-index", "_id" : "2" } }\n'
    '{ "title" : "1984", "author" : "George Orwell", "year" : "1949" }\n'
    '{ "update" : { "_id" : "2", "_index" : "example-dsl-index" } }\n'
    '{ "doc" : { "year" : "1950" } }\n'
)
client.bulk(body=books)
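Writing the newline-delimited body by hand is error-prone, since one missing quote or newline invalidates the request. As a sketch, the body can be generated from Python dictionaries with the standard json module. The helper name build_bulk_body and its tuple format are assumptions made for illustration, not part of opensearch-py.

```python
import json


def build_bulk_body(operations):
    """Serialize (action, metadata, payload) tuples into a bulk request body.

    Hypothetical helper. `payload` may be None for actions that carry no
    document line, such as "delete".
    """
    lines = []
    for action, metadata, payload in operations:
        lines.append(json.dumps({action: metadata}))
        if payload is not None:
            lines.append(json.dumps(payload))
    # The bulk API expects every line, including the last, to end with \n.
    return '\n'.join(lines) + '\n'


books = build_bulk_body([
    ('index', {'_index': 'example-dsl-index', '_id': '1'},
     {'title': 'To Kill a Mockingbird', 'author': 'Harper Lee', 'year': '1960'}),
    ('create', {'_index': 'example-dsl-index', '_id': '2'},
     {'title': '1984', 'author': 'George Orwell', 'year': '1949'}),
    ('update', {'_index': 'example-dsl-index', '_id': '2'},
     {'doc': {'year': '1950'}}),
])

# With a connected client:
# client.bulk(body=books)
```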
To search for documents, create a query using the client.search() method:
q = 'Harper Lee'
query = {
    'size': 5,
    'query': {
        'multi_match': {
            'query': q,
            'fields': ['title^2', 'author']
        }
    }
}
response = client.search(
    body=query,
    index='python-example-index'
)
print('Search results:', response)
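The response returned by client.search() is a nested dictionary in which the matching documents sit under hits.hits, each with an _id, a _score, and the original document in _source. A minimal sketch of pulling those fields out follows; extract_hits is a hypothetical helper name, and sample_response is made-up data that only illustrates the shape of a response, not real cluster output.

```python
def extract_hits(response):
    """Return a list of {'id', 'score', 'source'} dicts from a search response."""
    hits = response.get('hits', {}).get('hits', [])
    return [
        {
            'id': hit.get('_id'),
            'score': hit.get('_score'),
            'source': hit.get('_source', {}),
        }
        for hit in hits
    ]


# Illustrative data only; scores and metadata are not from a real cluster.
sample_response = {
    'hits': {
        'hits': [
            {
                '_index': 'python-example-index',
                '_id': '1',
                '_score': 1.0,
                '_source': {
                    'title': 'To Kill a Mockingbird',
                    'author': 'Harper Lee',
                    'year': '1960',
                },
            }
        ]
    }
}

for hit in extract_hits(sample_response):
    print(hit['id'], hit['score'], hit['source']['title'])
```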
The client.delete() method can be used to delete documents:
response = client.delete(
    index='python-example-index',
    id='1'
)
print('Deleting document:', response)
To delete an index, use the client.indices.delete() method:
response = client.indices.delete(
    index='python-example-index'
)
print('Deleting index:', response)