#sre (2020-12)

Prometheus, Prometheus Operator, Grafana, Kubernetes

Archive: https://archive.sweetops.com/monitoring/

2020-12-06

sheldonh avatar
sheldonh

Allow free rein in Datadog, or is it worth using Terraform to manage most key resources? Seems like it might slow down adoption. Maybe just integrations, and maybe monitors, go through Terraform?

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

We do monitors through Terraform and tag them as managed by Terraform. Then it’s easy to see what is done by hand vs. what is automated.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)
cloudposse/terraform-datadog-monitor

Terraform module to configure and provision Datadog monitors from a YAML configuration, complete with automated tests. - cloudposse/terraform-datadog-monitor
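A minimal sketch of the tagging idea in Python, assuming the official `datadog` (datadogpy) client and a `managed-by:terraform` tag convention — the tag name is an assumption here, not necessarily what the module applies:

```python
# Sketch: list all monitors and split them by a "managed-by:terraform" tag,
# so hand-made vs. automated monitors are easy to tell apart.
# The tag name is an assumed convention; the Cloud Posse module's exact
# tagging may differ.
from datadog import initialize, api

initialize(api_key="<DD_API_KEY>", app_key="<DD_APP_KEY>")

managed, by_hand = [], []
for monitor in api.Monitor.get_all():
    tags = monitor.get("tags") or []
    (managed if "managed-by:terraform" in tags else by_hand).append(monitor)

print(f"terraform-managed: {len(managed)}, created by hand: {len(by_hand)}")
```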

2020-12-08

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)
linkedin/school-of-sre

At Linkedin, we are using this curriculum for onboarding our non-traditional hires and new college grads into the SRE role. - linkedin/school-of-sre


2020-12-15

uselessuseofcat avatar
uselessuseofcat

Hi, I would like to monitor my infrastructure with New Relic, and I have a few questions. For example, ECS clusters: usually I have more than one EC2 instance per cluster, so I set the hostname to nameofthecluster-$(pwgen 4), which appends a 4-character random string to every hostname. Is there a more elegant way to group them in New Relic, or something like that? What would be your approach to this?
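For reference, the hostname scheme described above amounts to something like this sketch, with Python’s `secrets` module standing in for `pwgen 4`:

```python
# Sketch of the naming scheme from the question: cluster name plus a
# 4-character random suffix, roughly what `$(pwgen 4)` produces.
import secrets
import string

def instance_hostname(cluster: str, suffix_len: int = 4) -> str:
    alphabet = string.ascii_lowercase + string.digits
    suffix = "".join(secrets.choice(alphabet) for _ in range(suffix_len))
    return f"{cluster}-{suffix}"

print(instance_hostname("nameofthecluster"))  # e.g. nameofthecluster-x7k2
```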

2020-12-23

tim.j.birkett avatar
tim.j.birkett
Hi :wave: - Not sure that this belongs in here, but it’s loosely related to observability (logging). We’re currently running an EFK stack, and I configured fluentd/fluent-bit to partition logs by namespace, i.e. an index per namespace like: fluentd-&lt;namespace&gt;-YYYY.MM.DD. This is great: it means no field type collisions between teams, teams can search and visualise (I’m from the UK :stuck_out_tongue:) based on their own indexes (making queries kinder to Elasticsearch), and we can create finer-grained curator configurations (keep fewer noisy namespace logs around).

The things that have been a bit annoying:

  • Having to add another index pattern to Kibana every time a new namespace pops up
  • Having to regularly refresh field lists on index patterns as log fields evolve over time

It was okay with 4 or 5 index patterns, but it’s now a bit tedious with 40+, and developers forget to do it and then have issues searching and visualising new logs. Today I’ve spent a day having a bit of a hack and have a script that:

  1. Keeps Kibana index patterns in sync with the indexes in Elasticsearch based on a prefix
  2. Updates index pattern field lists based on the presence of an environment variable

It’s over at: https://github.com/devopsmakers/kibana-index-pattern-creator - the script works well and has a DRY_RUN mode. My next step is to get an image up on Docker Hub and get a Helm chart up and running to deploy it to Kubernetes as a CronJob (or two) - oh, and rewrite the README. Hopefully it comes in handy for others.

devopsmakers/kibana-index-pattern-creator

A Docker image that creates and updates wildcard kibana index patterns for all indices. - devopsmakers/kibana-index-pattern-creator
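A rough sketch of what such a sync loop can look like, assuming the stock Elasticsearch `_cat` API and the Kibana saved-objects API; the URLs, the `fluentd-` prefix, and the DRY_RUN handling here are illustrative, not the actual devopsmakers script:

```python
# Sketch: keep Kibana index patterns in sync with Elasticsearch indices
# that share a prefix, creating one wildcard pattern per namespace.
# Endpoints are the standard ES _cat API and Kibana saved-objects API;
# hostnames and the prefix are illustrative assumptions.
import os
import requests

ES = "http://elasticsearch:9200"
KIBANA = "http://kibana:5601"
PREFIX = "fluentd-"
DRY_RUN = os.environ.get("DRY_RUN", "false").lower() == "true"

# Derive wanted patterns from daily indices like fluentd-<ns>-YYYY.MM.DD,
# e.g. "fluentd-payments-2020.12.23" -> "fluentd-payments-*".
indices = requests.get(f"{ES}/_cat/indices/{PREFIX}*?format=json").json()
wanted = {"-".join(i["index"].split("-")[:-1]) + "-*" for i in indices}

# Index patterns Kibana already knows about.
resp = requests.get(
    f"{KIBANA}/api/saved_objects/_find",
    params={"type": "index-pattern", "per_page": 1000},
).json()
existing = {o["attributes"]["title"] for o in resp["saved_objects"]}

for title in sorted(wanted - existing):
    if DRY_RUN:
        print(f"would create index pattern: {title}")
        continue
    requests.post(
        f"{KIBANA}/api/saved_objects/index-pattern",
        headers={"kbn-xsrf": "true"},
        json={"attributes": {"title": title}},
    )
    print(f"created index pattern: {title}")
```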

Andrew Nazarov avatar
Andrew Nazarov

Cool. We’ll take a look


Steven avatar
Steven

Will look too
