Skip to content

SOLR-17797: add prometheus formatter support for segment merge metrics #3407

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -16,32 +16,92 @@
*/
package org.apache.solr.metrics.prometheus.core;

import com.codahale.metrics.Gauge;
import com.codahale.metrics.Metric;
import com.codahale.metrics.*;
import org.apache.solr.metrics.prometheus.SolrPrometheusFormatter;

/** Dropwizard metrics of name INDEX.* */
public class SolrCoreIndexMetric extends SolrCoreMetric {
public static final String CORE_INDEX_METRICS = "solr_metrics_core_index_size_bytes";

public static final String CORE_INDEX_FLUSH_METRICS = "solr_metrics_core_index_flush";
public static final String CORE_INDEX_MERGE_ERRORS_METRICS = "solr_metrics_core_index_merge_errors";

public static final String CORE_INDEX_MERGE_METRICS = "solr_metrics_core_index_merges";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This metric name seems wrong. It's a timer but nothing in this name explains its a timer. Name should be something like solr_core_index_merge_time_ms.

I actually think the times are ns but this PR fixes up some of the timer implementation to get the quantiles out from this metric and gives a better understanding of the unit being used by converting to ms. I'd like to get that merged in as well so this can pick up the quantiles.


// these are only for major merges with more details enabled
public static final String CORE_INDEX_MERGE_DOCS_METRICS = "solr_metrics_core_index_merged_docs";
public static final String CORE_INDEX_MERGE_DELETED_DOCS_METRICS = "solr_metrics_core_index_merged_deleted_docs";

public static final String CORE_INDEX_MERGING_METRICS = "solr_metrics_core_index_merges_running";
public static final String CORE_INDEX_MERGING_DOCS_METRICS = "solr_metrics_core_index_merging_docs";
public static final String CORE_INDEX_MERGING_SEGMENTS_METRICS = "solr_metrics_core_index_merging_segments";

public SolrCoreIndexMetric(Metric dropwizardMetric, String metricName) {
super(dropwizardMetric, metricName);
}

/*
* Metric examples being exported
* INDEX.sizeInBytes
*
* INDEX.merge.errors
* INDEX.merge.minor
* INDEX.merge.minor.running
* INDEX.merge.minor.running.docs
* INDEX.merge.minor.running.segments
* INDEX.merge.major
* INDEX.merge.major.docs
* INDEX.merge.major.deletedDocs
* INDEX.merge.major.running
* INDEX.merge.major.running.docs
* INDEX.merge.major.running.segments
Comment on lines +47 to +57
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You shouldn't need to comment every metric here you export. Just a few should be fine. Might also be worth mentioning in the comment that these must be enabled

*/

@Override
public SolrCoreMetric parseLabels() {
String[] parsedMetric = metricName.split("\\.");
if (parsedMetric.length < 3) {
return this;
}
if (parsedMetric[1].equals("merge")) {
labels.put("type", parsedMetric[2]); // minor / major
if (parsedMetric.length > 3) {
labels.put("running", "true");
} else {
labels.put("running", "false");
}
}
return this;
}

@Override
public void toPrometheus(SolrPrometheusFormatter formatter) {
if (metricName.endsWith("sizeInBytes")) {
formatter.exportGauge(CORE_INDEX_METRICS, (Gauge<?>) dropwizardMetric, getLabels());
if (dropwizardMetric instanceof Gauge) {
if (metricName.endsWith("sizeInBytes")) {
formatter.exportGauge(CORE_INDEX_METRICS, (Gauge<?>) dropwizardMetric, getLabels());
} else if (metricName.endsWith("running")) {
formatter.exportGauge(CORE_INDEX_MERGING_METRICS, (Gauge<?>) dropwizardMetric, getLabels());
} else if (metricName.endsWith("segments")) {
formatter.exportGauge(CORE_INDEX_MERGING_SEGMENTS_METRICS, (Gauge<?>) dropwizardMetric, getLabels());
} else if (metricName.endsWith("docs")) {
formatter.exportGauge(CORE_INDEX_MERGING_DOCS_METRICS, (Gauge<?>) dropwizardMetric, getLabels());
Comment on lines +80 to +87
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at these metrics myself, these are very confusing:

# TYPE solr_metrics_core_index_merged_deleted_docs_total counter
solr_metrics_core_index_merged_deleted_docs_total{collection="foobar",core="core_foobar_shard1_replica_n1",replica="replica_n1",running="true",shard="shard1",type="major"} 0.0
# TYPE solr_metrics_core_index_merged_docs_total counter
solr_metrics_core_index_merged_docs_total{collection="foobar",core="core_foobar_shard1_replica_n1",replica="replica_n1",running="true",shard="shard1",type="major"} 0.0
# TYPE solr_metrics_core_index_merges gauge
solr_metrics_core_index_merges{collection="demo",core="core_demo_shard1_replica_n1",replica="replica_n1",running="false",shard="shard1",type="major"} 0.0
solr_metrics_core_index_merges{collection="demo",core="core_demo_shard1_replica_n1",replica="replica_n1",running="false",shard="shard1",type="minor"} 0.0
solr_metrics_core_index_merges{collection="foobar",core="core_foobar_shard1_replica_n1",replica="replica_n1",running="false",shard="shard1",type="major"} 0.0
solr_metrics_core_index_merges{collection="foobar",core="core_foobar_shard1_replica_n1",replica="replica_n1",running="false",shard="shard1",type="minor"} 0.0
# TYPE solr_metrics_core_index_merges_running gauge
solr_metrics_core_index_merges_running{collection="demo",core="core_demo_shard1_replica_n1",replica="replica_n1",running="true",shard="shard1",type="major"} 0.0
solr_metrics_core_index_merges_running{collection="demo",core="core_demo_shard1_replica_n1",replica="replica_n1",running="true",shard="shard1",type="minor"} 0.0
solr_metrics_core_index_merges_running{collection="foobar",core="core_foobar_shard1_replica_n1",replica="replica_n1",running="true",shard="shard1",type="major"} 0.0
solr_metrics_core_index_merges_running{collection="foobar",core="core_foobar_shard1_replica_n1",replica="replica_n1",running="true",shard="shard1",type="minor"} 0.0
# TYPE solr_metrics_core_index_merging_docs gauge
solr_metrics_core_index_merging_docs{collection="demo",core="core_demo_shard1_replica_n1",replica="replica_n1",running="true",shard="shard1",type="major"} 0.0
solr_metrics_core_index_merging_docs{collection="demo",core="core_demo_shard1_replica_n1",replica="replica_n1",running="true",shard="shard1",type="minor"} 0.0
solr_metrics_core_index_merging_docs{collection="foobar",core="core_foobar_shard1_replica_n1",replica="replica_n1",running="true",shard="shard1",type="major"} 0.0
solr_metrics_core_index_merging_docs{collection="foobar",core="core_foobar_shard1_replica_n1",replica="replica_n1",running="true",shard="shard1",type="minor"} 0.0
# TYPE solr_metrics_core_index_merging_segments gauge
solr_metrics_core_index_merging_segments{collection="demo",core="core_demo_shard1_replica_n1",replica="replica_n1",running="true",shard="shard1",type="major"} 0.0
solr_metrics_core_index_merging_segments{collection="demo",core="core_demo_shard1_replica_n1",replica="replica_n1",running="true",shard="shard1",type="minor"} 0.0
solr_metrics_core_index_merging_segments{collection="foobar",core="core_foobar_shard1_replica_n1",replica="replica_n1",running="true",shard="shard1",type="major"} 0.0
solr_metrics_core_index_merging_segments{collection="foobar",core="core_foobar_shard1_replica_n1",replica="replica_n1",running="true",shard="shard1",type="minor"} 0.0
solr_metrics_core_index_merging_segments{collection="demo",core="core_demo_shard1_replica_n1",replica="replica_n1",shard="shard1"} 1.0
solr_metrics_core_index_merging_segments{collection="foobar",core="core_foobar_shard1_replica_n1",replica="replica_n1",shard="shard1"} 0.0

They have running=true or false but at the same time one of the metrics has the name running in it with a label running=true. Is it possible that running=false but the metric name says running? Can we aggregate these better? Like one metric name saying these are the running merges then the labels to differentiate by are type=minor/major and item=docs/segments. I'm also fine with keeping them separate metric names of docs vs segments but the running label is confusing as all of them is running=true and it doesn't seem to ever be running=false But then also what are these last 2 metrics?

solr_metrics_core_index_merging_segments{collection="foobar",core="core_foobar_shard1_replica_n1",replica="replica_n1",running="true",shard="shard1",type="major"} 0.0
solr_metrics_core_index_merging_segments{collection="foobar",core="core_foobar_shard1_replica_n1",replica="replica_n1",running="true",shard="shard1",type="minor"} 0.0
solr_metrics_core_index_merging_segments{collection="demo",core="core_demo_shard1_replica_n1",replica="replica_n1",shard="shard1"} 1.0
solr_metrics_core_index_merging_segments{collection="foobar",core="core_foobar_shard1_replica_n1",replica="replica_n1",shard="shard1"} 0.0

It has no label for minor/major

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the last two are INDEX.segments metric (number of segments) that got inadvertently pulled in by the .endsWith convention

}
} else if (dropwizardMetric instanceof Timer) {
if (metricName.endsWith("minor") || metricName.endsWith("major")) {
formatter.exportTimer(CORE_INDEX_MERGE_METRICS, (Timer) dropwizardMetric, getLabels());
}
} else if (dropwizardMetric instanceof Counter) {
if (metricName.endsWith("errors")) {
formatter.exportCounter(CORE_INDEX_MERGE_ERRORS_METRICS, (Counter) dropwizardMetric, getLabels());
}
} else if (dropwizardMetric instanceof Meter) {
if (metricName.endsWith("flush")) {
formatter.exportMeter(CORE_INDEX_FLUSH_METRICS, (Meter) dropwizardMetric, getLabels());
} else if (metricName.endsWith("docs")) {
formatter.exportMeter(CORE_INDEX_MERGE_DOCS_METRICS, (Meter) dropwizardMetric, getLabels());
} else if (metricName.endsWith("deletedDocs")) {
formatter.exportMeter(CORE_INDEX_MERGE_DELETED_DOCS_METRICS, (Meter) dropwizardMetric, getLabels());
Comment on lines +98 to +103
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can these be merged together? One single metric name with labels operation minor or major which you are doing then the type like type=merged or type=errors. Even though they are Counter and Meter I believe you can still group them because both Dropwizard meter and counters are just exported to prometheus counters.

}
}
}
}
Loading