Extensible metrics for monitoring and alerts
You can use the HCL Commerce Version 9.1 Metrics Monitoring framework with built-in performance dashboards, or build your own. The monitoring data are collected using Micrometer and are provided in the industry-standard Prometheus format. This means you can use them with many different tools. HCL provides a set of Grafana dashboards to get you started.
You can also use this Metrics Monitoring framework to visualize the cache requests sent and received by NiFi. The new API at http://NIFIHOST:30690/monitor/metrics exposes the monitoring data in the industry-standard Prometheus format, which you can use in Grafana or other tools to visualize these cache requests.
There are three parts to the monitoring framework. First, a fully customizable presentation layer enables you to use your preferred tools to report on and analyze your systems' performance. The flexibility of this layer comes from its use of a vendor-neutral, industry-standard data-representation format. Second, that format is provided by the open-source Prometheus toolkit. Finally, Prometheus gets its data from the fully customizable Micrometer library, which instruments your containers and exposes their data so that it can be "scraped."
Reporting and dashboarding
The top of the framework is the reporting layer. Because your data is represented in the Prometheus format, you can use many different tools to display and analyze it. One popular dashboarding tool is Grafana (https://grafana.com/). Grafana is often used with Prometheus to provide graphical analysis of monitoring data.
You can download the HCL Commerce Grafana Dashboards from the HCL License and Delivery portal. For more information on the available HCL Commerce Grafana Dashboard package, see HCL Commerce eAssemblies.
The Prometheus toolkit
HCL Commerce metrics use the Prometheus text-based exposition format. Although this format is native to the Prometheus monitoring framework (https://prometheus.io/), the popularity of the format has led to widespread adoption and support. For example, Prometheus joined the Cloud Native Computing Foundation in 2016 as the second hosted project, after Kubernetes.
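To make the text-based exposition format more concrete, the following is a minimal sketch of a reader for it, written in plain Java with no Prometheus or Micrometer dependency. The class name `ExpositionSample` and the metric names in the sample text are illustrative assumptions, not names taken from HCL Commerce. The parser is deliberately simplified: it skips `# HELP` and `# TYPE` comment lines, keeps label text unparsed, and does not handle escaped characters inside label values.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified reader for the Prometheus text exposition format (illustrative only). */
public class ExpositionSample {

    /** One parsed sample line: metric name, raw label text, and numeric value. */
    public static final class Sample {
        public final String name;
        public final String labels; // raw text between '{' and '}', or "" if none
        public final double value;

        Sample(String name, String labels, double value) {
            this.name = name;
            this.labels = labels;
            this.value = value;
        }
    }

    /** Converts a value token; the format writes infinities as +Inf and -Inf. */
    private static double parseValue(String token) {
        switch (token) {
            case "+Inf":
            case "Inf":
                return Double.POSITIVE_INFINITY;
            case "-Inf":
                return Double.NEGATIVE_INFINITY;
            default:
                return Double.parseDouble(token); // also accepts "NaN"
        }
    }

    /** Parses sample lines, skipping blank lines and '#' comment lines. */
    public static List<Sample> parse(String text) {
        List<Sample> samples = new ArrayList<>();
        for (String rawLine : text.split("\n")) {
            String line = rawLine.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            String name;
            String labels = "";
            String rest;
            int open = line.indexOf('{');
            int close = line.lastIndexOf('}');
            if (open >= 0 && close > open) {
                name = line.substring(0, open);
                labels = line.substring(open + 1, close);
                rest = line.substring(close + 1).trim();
            } else {
                int space = line.indexOf(' ');
                if (space < 0) {
                    continue; // no value on this line; ignore it in this sketch
                }
                name = line.substring(0, space);
                rest = line.substring(space + 1).trim();
            }
            // The value is the first token; an optional timestamp may follow it.
            String valueToken = rest.split("\\s+")[0];
            samples.add(new Sample(name, labels, parseValue(valueToken)));
        }
        return samples;
    }

    public static void main(String[] args) {
        // Hypothetical response body, shaped like a metrics endpoint response.
        String body = String.join("\n",
            "# HELP backend_calls_total Hypothetical counter used for illustration",
            "# TYPE backend_calls_total counter",
            "backend_calls_total{result=\"ok\"} 42.0",
            "example_gauge 17");
        for (Sample s : parse(body)) {
            System.out.println(s.name + " {" + s.labels + "} = " + s.value);
        }
    }
}
```

In practice you would rarely parse this format yourself, because Prometheus and compatible tools already do so. The sketch is only meant to show how little structure the format requires.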
Micrometer application monitoring
Monitoring and performance data is collected using the JVM-based Micrometer instrumentation library. The key concept for Micrometer is the meter. A rich set of predefined meter primitives exists that defines timers, counters, gauges, and other data collection types. You can use the default meters to aggregate performance and monitoring data from your containers, or customize your own.
Metrics for the performance of each container are exposed at its /monitor/metrics endpoint. They are collected by a process known as “scraping.” A scraper such as Prometheus reads the metrics endpoint on all containers at a configurable interval. The metrics are stored in a database where other services can access them. In Kubernetes environments, scrapers also add contextual metadata to the metrics obtained from the endpoints, such as the service, namespace, and pod that identify the origin of the data.
Configuring meters
Metrics are enabled with the following setting:
EXPOSE_METRICS=true
Metrics are exposed on each pod on the following paths and ports:

Deployment | Path | Metrics port (http) |
---|---|---|
demoqaauthcrs-app | /monitor/metrics | 8280 |
demoqaauthsearch-* | /monitor/metrics | 3280 |
demoqaliveingest-app | /monitor/metrics | 30880 |
demoqalivequery-app | /monitor/metrics | 30280 |
demoqaauthts-app | /monitor/metrics | 5280 |
demoqaauthxc-app | /monitor/metrics | 9280 |
demoqaingest-app | /monitor/metrics | 30880 |
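As a quick way to check that a pod is exposing metrics, the following sketch builds the endpoint URL from the table above and optionally fetches it. It assumes Java 11 or later (for `java.net.http`), plain HTTP as indicated by the table header, and a host value that you supply, such as a pod IP or a port-forwarded `localhost`. The class name `MetricsEndpoints` and the `example-host` placeholder are hypothetical. The wildcard `demoqaauthsearch-*` row is left out because it does not name a single deployment.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.LinkedHashMap;
import java.util.Map;

/** Builds and reads metrics endpoint URLs for the deployments listed in the table. */
public class MetricsEndpoints {

    public static final String METRICS_PATH = "/monitor/metrics";

    /** Deployment name to metrics port, copied from the table. */
    public static final Map<String, Integer> PORTS = new LinkedHashMap<>();
    static {
        PORTS.put("demoqaauthcrs-app", 8280);
        PORTS.put("demoqaliveingest-app", 30880);
        PORTS.put("demoqalivequery-app", 30280);
        PORTS.put("demoqaauthts-app", 5280);
        PORTS.put("demoqaauthxc-app", 9280);
        PORTS.put("demoqaingest-app", 30880);
    }

    /** Returns http://host:port/monitor/metrics for a known deployment. */
    public static URI metricsUri(String host, String deployment) {
        Integer port = PORTS.get(deployment);
        if (port == null) {
            throw new IllegalArgumentException("Unknown deployment: " + deployment);
        }
        return URI.create("http://" + host + ":" + port + METRICS_PATH);
    }

    /** Fetches the endpoint body, which is expected to be Prometheus text format. */
    public static String fetch(URI uri) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        HttpResponse<String> response =
            client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected status " + response.statusCode() + " from " + uri);
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 2) {
            // Usage: java MetricsEndpoints <host> <deployment>
            System.out.println(fetch(metricsUri(args[0], args[1])));
        } else {
            // Without arguments, only print the URLs; no network call is made.
            for (String deployment : PORTS.keySet()) {
                System.out.println(metricsUri("example-host", deployment));
            }
        }
    }
}
```

Whether the ports are reachable from where you run this depends on your cluster networking, so treat it as a diagnostic aid, not as a replacement for a Prometheus scrape configuration.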
In addition to enabling metrics, the Helm chart exposes the metrics port through the services and offers the option to define a ServiceMonitor (metrics.servicemonitor.enabled, metrics.servicemonitor.namespace) for use with Prometheus Operator.
Implementing custom meters
In addition to the default set of meters, you can add your own. When meters are enabled, the Metrics class makes the global registry available. Meters added to the global registry are automatically published to the metrics endpoint.
New meters can be added to the registry by using the Micrometer APIs. See the Micrometer Javadoc for API details: https://javadoc.io/doc/io.micrometer/micrometer-core/1.3.5/index.html.
Samples
The following examples show how metrics can be used from custom code.
- Counters
- A positive count that can be increased by a fixed amount. For example, “number of requests.” Prometheus includes functions such as rate() and increase() that account for counter resets.
- Timers
- Timers are used to track the duration and frequency of events. Besides calculating average durations, the API lets you configure a set of Service Level Objectives (SLOs), which are translated to histogram buckets. SLOs can also be used to calculate quantiles. For more information, see Histograms and Summaries on the Prometheus website.
The Metrics class defines SLOs for common usages. For example, Metrics.DEFAULT_SLO_REST_DURATIONS_NAME defines buckets that are appropriate for typical REST execution times. If your timer doesn’t match these durations, you can specify new values as a long array. For more information, see .sla() in the Timer.Builder class definition on the Micrometer website.
Example: Adding a new Timer with known label values
    // Create the timer only when metrics are enabled; otherwise keep it null.
    private static final Timer BACKEND_TIMER = Metrics.isEnabled()
        ? Timer.builder("backend.calls.duration")
            .tags("result", "ok")
            .sla(Metrics.getSLOsByName(Metrics.DEFAULT_SLO_REST_DURATIONS_NAME))
            .description("Duration of successful backend requests")
            .register(Metrics.getRegistry())
        : null;
    …
    long startTime = 0L;
    if (BACKEND_TIMER != null) {
        startTime = System.nanoTime();
    }
    doWork();
    if (BACKEND_TIMER != null) {
        // Record the elapsed time of doWork() in nanoseconds.
        final long deltaTime = System.nanoTime() - startTime;
        BACKEND_TIMER.record(deltaTime, TimeUnit.NANOSECONDS);
    }
When using a Timer with label values that are not known in advance, the Micrometer API doesn’t allow an SLO (.sla(..)) to be specified. To achieve this, define a meter filter to merge the configuration. The Metrics.applySLO(final String metricName, final long[] slos) or Metrics.applySLO(final String metricName, final String name) utility methods can be used for this purpose.
- Gauges
- A gauge holds a value that can increase and decrease over time. The meter is mapped to a function to obtain the value. Examples include number of active sessions and current cache sizes.
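The Timers entry above notes that SLOs are translated to histogram buckets. The following plain-Java sketch illustrates what that translation means: each SLO boundary becomes a cumulative bucket that counts every observation at or below it, and a final `+Inf` bucket counts all observations. Micrometer and Prometheus do this bookkeeping for you. The class name `SloBuckets`, the boundaries, and the sample durations are all made up for illustration and are not the values behind Metrics.DEFAULT_SLO_REST_DURATIONS_NAME.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrates how SLO boundaries map to cumulative histogram buckets. */
public class SloBuckets {

    /**
     * Counts how many durations fall at or below each SLO boundary.
     * Buckets are cumulative, as in Prometheus histograms, and a final
     * "+Inf" bucket counts every observation.
     */
    public static Map<String, Long> cumulativeBuckets(long[] sloMillis, long[] durationsMillis) {
        Map<String, Long> buckets = new LinkedHashMap<>();
        for (long boundary : sloMillis) {
            long count = 0;
            for (long duration : durationsMillis) {
                if (duration <= boundary) {
                    count++;
                }
            }
            buckets.put(String.valueOf(boundary), count);
        }
        buckets.put("+Inf", (long) durationsMillis.length);
        return buckets;
    }

    public static void main(String[] args) {
        long[] sloMillis = {100, 500, 1000};             // hypothetical SLO boundaries
        long[] durations = {40, 120, 480, 950, 2300};    // hypothetical request durations
        Map<String, Long> buckets = cumulativeBuckets(sloMillis, durations);
        for (Map.Entry<String, Long> e : buckets.entrySet()) {
            System.out.println("le=" + e.getKey() + " count=" + e.getValue());
        }
        // Share of requests that met the 500 ms objective.
        double within = (double) buckets.get("500") / durations.length;
        System.out.println("within 500 ms: " + within);
    }
}
```

With these sample values, one request falls within 100 ms, three within 500 ms, four within 1000 ms, and all five land in the `+Inf` bucket, so 60% of requests met the 500 ms objective. Reading an SLO as "the fraction of observations at or below a boundary" is what makes bucket boundaries that match your objectives useful.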