Skip to content

Added Basic Client side Metrics #1574

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 16 commits into
base: main
Choose a base branch
from

Conversation

001mertiya
Copy link
Contributor

@001mertiya 001mertiya commented May 20, 2025

Description

This PR introduces client-side metrics to track request rate, latency, and error rate. These metrics will help improve user experience by providing insights into how the request is used and identifying performance bottlenecks or unexpected behaviors.

Issues Resolved

The metrics PR includes integration with Micrometer and support for Prometheus, along with custom client-side metrics capabilities.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

psingh3 added 2 commits May 20, 2025 09:21
Signed-off-by: psingh3 <pushpendra.singh@walmart.com>
Signed-off-by: psingh3 <pushpendra.singh@walmart.com>
psingh3 added 4 commits May 20, 2025 10:23
Signed-off-by: psingh3 <pushpendra.singh@walmart.com>
Signed-off-by: psingh3 <pushpendra.singh@walmart.com>
Signed-off-by: psingh3 <pushpendra.singh@walmart.com>
@001mertiya
Copy link
Contributor Author

Hi @dblock @Xtansia , when you get a chance, could you please review this PR that adds client-side metrics support?

Signed-off-by: psingh3 <pushpendra.singh@walmart.com>
@rursprung
Copy link
Contributor

i haven't reviewed the code, but: i think this is great, thanks! would it be possible to add documentation for it to the README? otherwise people will not notice that these metrics now exist.

psingh3 and others added 8 commits May 27, 2025 13:33
Signed-off-by: psingh3 <pushpendra.singh@walmart.com>
Signed-off-by: psingh3 <pushpendra.singh@walmart.com>
Signed-off-by: psingh3 <pushpendra.singh@walmart.com>
Signed-off-by: psingh3 <pushpendra.singh@walmart.com>
Signed-off-by: Pushpendra Singh - WM <mertiya.pushpendra@gmail.com>
Copy link
Contributor

@rursprung rursprung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks a lot for adding the documentation, this looks great!

i won't review the java code (no time for this right now and anyway one of the maintainers - which i'm not - needs to review this), but i did look through the documentation and added some remarks there.


## How to enable metrics

We should create a MetricOptions instance and set it in the ApacheHttpClient5TransportBuilder when creating the client. Take a look at below code snippet for an example of how to create a client with metrics enabled:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
We should create a MetricOptions instance and set it in the ApacheHttpClient5TransportBuilder when creating the client. Take a look at below code snippet for an example of how to create a client with metrics enabled:
We should create a `MetricOptions` instance and set it in the `ApacheHttpClient5TransportBuilder` when creating the client. Take a look at below code snippet for an example of how to create a client with metrics enabled:

var port = Integer.parseInt(env.getOrDefault("PORT", "9200"));
var user = env.getOrDefault("USERNAME", "admin");
var pass = env.getOrDefault("PASSWORD", "admin");
var metricEnabled = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
var metricEnabled = true;
var metricsEnabled = true;

(also for the usage(s) below)


final var hosts = new HttpHost[]{new HttpHost(https ? "https" : "http", hostname, port)};

final var sslContext = SSLContextBuilder.create().loadTrustMaterial(null, (chains, authType) -> true).build();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can use org.apache.hc.client5.http.ssl.TrustAllStrategy here.

Suggested change
final var sslContext = SSLContextBuilder.create().loadTrustMaterial(null, (chains, authType) -> true).build();
final var sslContext = SSLContextBuilder.create().loadTrustMaterial(null, new TrustAllStrategy()).build();

Copy link
Collaborator

@reta reta Jun 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rursprung the TrustAllStrategy should not be used in production code, we should probably have a placehoder here or use the default one SSLContexts.createDefault()


| Metric | Description | Important Dimensions(tags) |
|---------------------------------------|-----------------------------------------------------------------|--------------------------------|
| os.client.request.seconds | End-to-end request execution latency | StatusCodeOrException, Request |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit (also for the other rows): i'd mark the metric name & tags as code for better readability):

Suggested change
| os.client.request.seconds | End-to-end request execution latency | StatusCodeOrException, Request |
| `os.client.request.seconds` | End-to-end request execution latency | `StatusCodeOrException`, `Request` |

var metricEnabled = true;
double PERCENTILE_99 = 0.99;
double PERCENTILE_95 = 0.95;
var meterRegistry = new PrometheusMeterRegistry(PrometheusConfig.DEFAULT);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe use SimpleMeterRegistry for the example since it's not specific to prometheus? it's the simplest example which they also use in their intro docs.

# SDK Metrics

We've integrated a robust metrics solution into the OpenSearch Java client to provide comprehensive insights into its API usage and performance. This includes the collection of vital operational metrics, such as overall throughput, request latency, and error rate. Furthermore, we're capturing more granular details like request and response payload sizes, distinct success and failure rates for operations, and real-time OpenSearch node status to provide a holistic view of client behavior and cluster health.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe add an explicit mention here that this uses micrometer (with a link to micrometer.io) and that it offers various integrations with monitoring systems (with a link to their overview)?

CHANGELOG.md Outdated
@@ -3,10 +3,11 @@ Inspired from [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)

## [Unreleased 3.x]
### Added

- Metrics support includes micrometer integration and Prometheus support with a custom client-side metrics. [Metrics](./guides/metrics.md)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't think that prometheus warrants explicit mention here since the integration is with micrometer and not prometheus. it's just that micrometer offers an integration with prometheus - amongst many others

Signed-off-by: psingh3 <pushpendra.singh@walmart.com>
@001mertiya
Copy link
Contributor Author

@reta Could you please review this PR ?

@reta
Copy link
Collaborator

reta commented Jun 17, 2025

@reta Could you please review this PR ?

Thank for the pull request @001mertiya , I will try to find time this week

@reta
Copy link
Collaborator

reta commented Jun 22, 2025

@reta Could you please review this PR ?

@001mertiya the Micrometer has support of Apache HttpClient 5 metrics out of the box, it could be plugged in the setHttpClientConfigCallback and requires no changes from OpenSearch Java client:

 final var transport = ApacheHttpClient5TransportBuilder.builder(hosts)
            .setMapper(new JacksonJsonpMapper())
            .setHttpClientConfigCallback(httpClientBuilder -> {
                return httpClientBuilder.addExecInterceptorAfter(
                    ChainElement.RETRY.name(), "micrometer", 
                          new ObservationExecChainHandler(observationRegistry));
            })
            .build();

Could you please clarify why the provided out of the box instrumentation is not sufficient? Thank you.

[1] https://docs.micrometer.io/micrometer/reference/reference/httpcomponents.html

@rursprung
Copy link
Contributor

@reta Could you please review this PR ?

@001mertiya the Micrometer has support of Apache HttpClient 5 metrics out of the box, it could be plugged in the setHttpClientConfigCallback and requires no changes from OpenSearch Java client:

 final var transport = ApacheHttpClient5TransportBuilder.builder(hosts)
            .setMapper(new JacksonJsonpMapper())
            .setHttpClientConfigCallback(httpClientBuilder -> {
                return httpClientBuilder.addExecInterceptorAfter(
                    ChainElement.RETRY.name(), "micrometer", 
                          new ObservationExecChainHandler(observationRegistry));
            })
            .build();

Could you please clarify why the provided out of the box instrumentation is not sufficient? Thank you.

[1] https://docs.micrometer.io/micrometer/reference/reference/httpcomponents.html

oh, that's actually very interesting! is this something which could be enabled by default if micrometer is present? (maybe just in spring-data-opensearch if it needs some class/bean-presence checks?)

@reta
Copy link
Collaborator

reta commented Jun 23, 2025

oh, that's actually very interesting! is this something which could be enabled by default if micrometer is present? (maybe just in spring-data-opensearch if it needs some class/bean-presence checks?)

In general - yes, it could be part of auto configuration

@rursprung
Copy link
Contributor

oh, that's actually very interesting! is this something which could be enabled by default if micrometer is present? (maybe just in spring-data-opensearch if it needs some class/bean-presence checks?)

In general - yes, it could be part of auto configuration

in spring-data-opensearch i guess and not here (since you mention auto configuration)? should we wait for the outcome of this issue? or should we do it anyway now?

@001mertiya
Copy link
Contributor Author

001mertiya commented Jun 30, 2025

@reta Could you please review this PR ?

@001mertiya the Micrometer has support of Apache HttpClient 5 metrics out of the box, it could be plugged in the setHttpClientConfigCallback and requires no changes from OpenSearch Java client:

 final var transport = ApacheHttpClient5TransportBuilder.builder(hosts)
            .setMapper(new JacksonJsonpMapper())
            .setHttpClientConfigCallback(httpClientBuilder -> {
                return httpClientBuilder.addExecInterceptorAfter(
                    ChainElement.RETRY.name(), "micrometer", 
                          new ObservationExecChainHandler(observationRegistry));
            })
            .build();

Could you please clarify why the provided out of the box instrumentation is not sufficient? Thank you.

[1] https://docs.micrometer.io/micrometer/reference/reference/httpcomponents.html

Hi @reta , The out of the box instrumentation provides only high-level metrics and lacks detailed insights. Additionally, we are not capturing all possible metrics. However, if we implement custom metric integration, we can improve this significantly.

  • The solution provided above provides high-level metrics only. For example, in the sample below, we get method level metrics:
# HELP http_client_requests_seconds Time spent for HTTP client exchanges
# TYPE http_client_requests_seconds gauge
http_client_requests_seconds_count{exception="none",method="GET",outcome="SUCCESS",status="200",target_host="example.com",target_port="443",uri="example.com"} 1.0

By integrating metrics within the opensearch-java client, we gain more granular control over requests and can collect detailed tags such as request type, StatusCodeOrException, and if a status code is unavailable extract the exception class name.

os_client_request_seconds_count{Request="SearchRequest", StatusCodeOrException="200", cloud_zone="", app="opensearch-example", env="dev", target_host="os-dev-1-2577", target_port="9200"} 2.0

  • Another benefit of adding metrics implementation in the opensearch-java client is that we can track additional metrics, like request and response bytes, and the count of active and inactive nodes.

The this PR, we collects a variety of metrics to provide insights into its operations. Below is a summary of key metrics, including descriptions and important dimensions (tags) for filtering and analysis:

Metric Description Important Dimensions (tags)
os.client.request.seconds End-to-end request execution latency StatusCodeOrException, Request
os.client.request.seconds.count Request throughput and error rate based on status tags StatusCodeOrException, Request
os.client.request.payload.size.bytes Request payload size in bytes Request
os.client.response.payload.size.bytes Response payload size in bytes Request
os.client.active.nodes Number of active OpenSearch nodes from the client’s perspective
os.client.inactive.nodes Number of inactive OpenSearch nodes from the client’s perspective

Metrics on dashboard:

Screenshot 2025-06-26 at 10 07 53 AM

What are your thoughts on having all the above metrics and tags? Thank you!

@reta
Copy link
Collaborator

reta commented Jul 1, 2025

Thanks @001mertiya

Hi @reta , The out of the box instrumentation provides only high-level metrics and lacks detailed insights. Additionally, we are not capturing all possible metrics. However, if we implement custom metric integration, we can improve this significantly.

We get these metrics for free, this is huge benefit.
Another huge benefit, is that every standard dashboard out there for Micrometer will work for us out of the box.
Yet another benefit is that ObservationExecChainHandler supports tracing out of the box.

Now, to your point of adding more / custom metrics or/and labels. There is nothing that prevents as to add additional metrics (if needed to) by composing multiple ExecChainHandler, however it has to be done following Micrometer conventions that are there. Regarding additional tags / labels, we have all control there by providing own ApacheHttpClientObservationConvention implementation (or extending the default one).

The maintenance cost of duplicating similar efforts and maintaining custom one off dashboards (for the standard components) is extremely high. I think we should be leveraging what has been built already at maximum.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants