SweetOps #prometheus for September, 2022

Archive: https://archive.sweetops.com/prometheus/

2022-09-22

mikesew

Q: Has anybody set up AWS CloudWatch metrics in Prometheus using CloudWatch Metric Streams? Newbie DBA, trying to get RDS metrics into Prometheus/Grafana. I’m told that a standard exporter would start to cost $$ because of constant polling. Would the following architecture work?

cloudwatch metric streams => Kinesis Data Firehose => Prometheus ??

2022-09-29

Alex Box

10:09:40 AM

Hi all, wondering if anyone has a solution they could share for running analytics over historical alerts from alertmanager? For example, “alert X fired 10 times in July and 30 times in August”. This would allow the monitoring team in company Y to investigate the biggest drain for the on-call shift over time. I understand it’s a design decision of AM to not persist any state but in medium/large environments I think it’s an important area that often gets overlooked. Thanks.

Alertmanager doesn’t persist state, so it is not possible using only this tool. To preserve history you need to send alerts to some other system, for example by using webhooks.

zadkiel

07:47:27 PM

We’re currently investigating around this area. Here are the options we’re seeing as of today:

• Send alertmanager webhooks to https://github.com/tomtom-international/alertmanager-webhook-logger so we gather alert history in logs. Our logs are in Loki, which can then be queried easily and graphed in Grafana.

• Send alertmanager webhooks to https://gitlab.com/yakshaving.art/alertsnitch, which poops them into a MySQL database for offline analysis. The database can then be queried with Grafana’s integration.

• It looks like Prometheus emits some meta metrics for alerts; we have not gone down that hole yet. See https://karma-dashboard.io/#:~:text=10%20minutes%20ago.-,Alert%20history,-Alertmanager%20doesn%E2%80%99t%20currently
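
For the webhook-based options above, the per-month counting Alex asked about could be sketched roughly as follows. This is only an illustration, not how alertmanager-webhook-logger or alertsnitch work internally: it assumes you have already collected the raw Alertmanager webhook JSON bodies (each with an `alerts` list whose entries carry `status`, `labels` and `startsAt`), and the function name `count_firings_by_month` is made up for this example.

```python
from collections import Counter


def count_firings_by_month(payloads):
    """Tally distinct firing alerts per (YYYY-MM, alertname).

    `payloads` is an iterable of decoded Alertmanager webhook bodies.
    """
    seen = set()
    counts = Counter()
    for payload in payloads:
        for alert in payload.get("alerts", []):
            if alert.get("status") != "firing":
                continue
            labels = alert.get("labels", {})
            # Alertmanager re-sends an alert that is still firing, so treat
            # the same label set with the same start time as one firing.
            key = (tuple(sorted(labels.items())), alert["startsAt"])
            if key in seen:
                continue
            seen.add(key)
            # startsAt is an RFC 3339 timestamp; its first 7 characters
            # are the year and month, e.g. "2022-07".
            month = alert["startsAt"][:7]
            counts[(month, labels.get("alertname", "unknown"))] += 1
    return counts
```

Deduplicating on labels plus `startsAt` is a design choice for this sketch; a real pipeline might prefer the `fingerprint` field or whatever key its storage backend already uses.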

Alex Box

04:02:36 PM

Thank you for sharing. In case you weren’t already aware, there is also the synthetic ALERTS{} data that Prometheus provides: https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/#inspecting-alerts-during-runtime. I think that would be a good source for post processing but I haven’t seen any uses of it yet.
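
As a rough sketch of what post-processing the ALERTS series might look like, the snippet below builds an instant-query URL for the Prometheus HTTP API (`/api/v1/query`). The helper names, the `http://localhost:9090` address in the usage note and the 30d window are assumptions for illustration. Note that `count_over_time` counts samples in which the alert was firing, so the result approximates time spent firing rather than the number of separate firings, and it can only look back as far as Prometheus retention allows.

```python
from urllib.parse import urlencode


def firing_samples_promql(window="30d"):
    """PromQL counting ALERTS samples in the firing state, per alertname."""
    return (
        "sum by (alertname) "
        f'(count_over_time(ALERTS{{alertstate="firing"}}[{window}]))'
    )


def alerts_query_url(base_url, window="30d"):
    """Build an instant-query URL for the Prometheus HTTP API."""
    query = urlencode({"query": firing_samples_promql(window)})
    return f"{base_url.rstrip('/')}/api/v1/query?{query}"
```

For example, `alerts_query_url("http://localhost:9090")` yields a URL that can be fetched with any HTTP client, or the PromQL string can be pasted into a Grafana panel.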

Alex Box

04:03:03 PM

I like the idea of using SQL with alertsnitch

Pawel Rein

02:29:26 PM

I’m wondering if you found out what works best for you since the last reply. Are there any new ways to persist and visualize alert history?

Pawel Rein

02:06:28 PM

For the record, if anyone finds this thread: there’s a new player, an Alerta alternative: https://github.com/keephq/keep

keephq/keep

The open-source alert management and AIOps platform