#sre (2022-03)

Prometheus, Prometheus Operator, Grafana, Kubernetes

Archive: https://archive.sweetops.com/monitoring/


Andy

Hi all. Do any teams monitor the status of their infrastructure independently of their website/service?

I’m asking in terms of separating the SLA responsibility:

  1. The uptime of infrastructure is the DevOps team’s responsibility
  2. The uptime of the website/service is the Development team’s responsibility

Or do teams tend to say the uptime of the website/service is a shared responsibility between DevOps and Developers?

Andy

Expecting some “well, it depends…”

Chris Picht

If your monitoring isn’t independent of your infrastructure, how can you trust it to work when your infrastructure has a failure?
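
One way to picture the point above is a small probe that runs from somewhere outside the monitored environment and keeps separate pass/fail histories for the infrastructure and for the website/service, so each uptime figure can be reported on its own. This is only a rough sketch under assumptions: the function names (`http_probe`, `run_probes`, `uptime_percent`) and the two example.com URLs are hypothetical and not from the thread, and a real setup would more likely use an external checker or a separately hosted Prometheus instance.

```python
import urllib.request
from typing import Callable, Dict, List


def http_probe(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with a non-error HTTP status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except OSError:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return False


def run_probes(probes: Dict[str, Callable[[], bool]]) -> Dict[str, bool]:
    """Run every probe once and return one pass/fail result per target."""
    return {name: probe() for name, probe in probes.items()}


def uptime_percent(history: List[bool]) -> float:
    """Percentage of successful checks in a list of pass/fail results."""
    if not history:
        raise ValueError("no probe results recorded")
    return 100.0 * sum(history) / len(history)


if __name__ == "__main__":
    # Hypothetical endpoints: one owned by the infrastructure team,
    # one owned by the development team.
    probes = {
        "infrastructure": lambda: http_probe("https://infra.example.com/healthz"),
        "service": lambda: http_probe("https://www.example.com/"),
    }
    # Keep a separate history per target so each uptime is tracked on its own.
    history: Dict[str, List[bool]] = {name: [] for name in probes}
    for name, ok in run_probes(probes).items():
        history[name].append(ok)
    for name, results in history.items():
        print(f"{name}: {uptime_percent(results):.1f}% up")
```

Running this from a machine that does not share the monitored infrastructure is what keeps the check meaningful during an outage. If the service figure drops while the infrastructure figure stays at 100%, that suggests where to look first.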