SweetOps #sre for November, 2018

Archive: https://archive.sweetops.com/monitoring/

2018-11-28

pecigonzalo

10:22:11 AM

What are you all using for Prometheus service discovery (SD) on ECS?

pecigonzalo

10:22:38 AM

We had our own SD in Python, but I'm always afraid of hitting the API limits as we scale, as we did before

pecigonzalo

10:22:54 AM

(we had a DNS SD based on lambda and ECS events)
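A minimal sketch of the event-driven idea, for anyone following along: a Lambda handler reacts to ECS task state-change events instead of polling the API, which is what keeps the request count low. This is not the tool described above, only an illustration of the shape. The function names (`plan_dns_change`, `handler`) are hypothetical, the event fields read here (`detail-type`, `detail.lastStatus`, `detail.group`, `detail.taskArn`) are assumed to be present in the event, and the actual DNS update is deliberately left out.

```python
def plan_dns_change(event):
    """Map an ECS task state-change event to (action, group, task_arn), or None.

    Assumption: the event carries "detail-type" and a "detail" object with
    "lastStatus", "group" and "taskArn". Anything else is ignored.
    """
    if event.get("detail-type") != "ECS Task State Change":
        return None
    detail = event.get("detail", {})
    status = detail.get("lastStatus")
    if status == "RUNNING":
        action = "register"
    elif status == "STOPPED":
        action = "deregister"
    else:
        # Intermediate states (e.g. PENDING): nothing to register yet.
        return None
    return (action, detail.get("group"), detail.get("taskArn"))


def handler(event, context):
    """Hypothetical Lambda entry point; the DNS record update is not shown."""
    change = plan_dns_change(event)
    return {"change": change}
```

Because the handler only runs when a task changes state, the number of API calls grows with deployments rather than with scrape or refresh intervals.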

joshmyers

11:41:12 AM

At a previous client, which was so big it ended up having to pay for AWS API requests, a tool was written to do a single lookup a la ec2_sd and write the result to a file. That file gets mounted inside the k8s prom, which was multi-team; if each team did their own lookups, they would bust the limit
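A rough sketch of that "one lookup, one file" pattern, assuming the file is consumed through Prometheus's file-based discovery (`file_sd_configs`), which reads JSON target groups of the form `{"targets": [...], "labels": {...}}`. The input records, the function names, and the default port are all hypothetical; in a real tool the records would come from the single shared AWS lookup, which is not shown here.

```python
import json
import os
import tempfile


def to_file_sd(instances, port=9100):
    """Convert instance records into file_sd target groups.

    `instances` is a hypothetical list of dicts with "private_ip" and an
    optional "tags" dict. Tag keys are used as label names unchanged, so
    they are assumed to already be valid Prometheus label names.
    """
    groups = []
    for inst in instances:
        tags = inst.get("tags", {})
        groups.append({
            "targets": [f"{inst['private_ip']}:{port}"],
            "labels": {key: str(value) for key, value in sorted(tags.items())},
        })
    return groups


def write_atomically(path, groups):
    """Write via a temp file plus rename so readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump(groups, handle, indent=2)
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Each team's Prometheus would then point a `file_sd_configs` entry at the mounted file instead of calling the AWS API itself.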

pecigonzalo

01:55:52 PM

Yeah, we built a tool for that in Lambda; tbh it's not that complicated

pecigonzalo

01:55:59 PM

but I was looking for a “simpler” solution

pecigonzalo

01:57:08 PM

I have a similar situation with uploading new Prometheus configs without doing a Docker deployment. Deploying via Docker was, albeit incorrectly, the “easy” start for us, but it sort of sucks
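One possible way to do this without a redeploy, sketched under assumptions: Prometheus exposes a `/-/reload` HTTP endpoint when started with `--web.enable-lifecycle`, so a script can replace the config file and then ask the running server to re-read it. The function names, the default URL, and the assumption that the config directory is mounted into the container from the host are all hypothetical and not something described in this thread.

```python
import os
import tempfile
import urllib.request


def write_config(path, text):
    """Replace the config file via temp file plus rename (no partial writes)."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    os.replace(tmp_path, path)


def build_reload_request(base_url):
    """Build the POST request for the /-/reload endpoint."""
    url = base_url.rstrip("/") + "/-/reload"
    return urllib.request.Request(url, data=b"", method="POST")


def push_config(path, text, base_url="http://localhost:9090"):
    """Write the new config, then trigger a reload; returns the HTTP status."""
    write_config(path, text)
    request = build_reload_request(base_url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

If the lifecycle endpoint is not enabled, sending SIGHUP to the Prometheus process is the other reload trigger. Mounting the whole config directory rather than the single file is likely the safer choice here, since a rename swaps the file out from under a single-file bind mount.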

mrwacky

06:57:27 PM

https://github.com/gliderlabs/registrator

mrwacky

06:57:33 PM

@pecigonzalo

tamsky

06:58:12 PM

what’s wrong with Consul for SD?

joshmyers

07:02:13 PM

You need consul?

joshmyers

07:02:26 PM

Maybe not what you want if that is all you are going to use Consul for

tamsky

07:20:23 PM

@joshmyers so what are your reasons for not using Consul if it's used strictly for SD?

mrwacky

07:20:45 PM

ease of use, setup, deficiencies in AWS SD options, yeah, Consul is great

joshmyers

07:21:21 PM

I don’t have any. I’m just saying folks may not want to run a 3 node etcd cluster when they have been using AWS API as cheap service discovery

mrwacky

07:22:00 PM

Good news, Consul is not etcd

joshmyers

07:22:39 PM

hah, oops, same thing. It is a thing you need to manage?

tamsky

07:23:59 PM

of all the services I’ve operated/managed since 2014, consul is the least needy service I’ve met

joshmyers

07:24:38 PM

Nice

tamsky

07:25:37 PM

self-bootstrapping EC2 ASG cluster FTW

joshmyers

07:25:41 PM

Have used with Nomad before and not had any issues with it, but it isn’t a managed type service, was my only point

tamsky

07:27:11 PM

managed type services are good for getting started – one should have a plan for when your org’s needs or skills outgrow a managed service offering from anyone

Erik Osterman (Cloud Posse)

07:30:02 PM

…such as multi-cloud

joshmyers

07:36:38 PM

Aye, multi cloud is hard though

Erik Osterman (Cloud Posse)

07:43:12 PM

yea, the all elusive multi-cloud strategy

pecigonzalo

07:57:32 AM

@mrwacky yeah, we know about Consul and registrator, but as @joshmyers explained, that is ofc the option if you have Consul; we don't

pecigonzalo

07:58:18 AM

and while it is a good, easy discovery option once you have that, the question is what you do if you don't have Consul

pecigonzalo

07:58:55 AM

at the moment our services don't use a mesh, as we don't need/want that yet

pecigonzalo

07:59:11 AM

so Consul would be there ONLY to support Prometheus discovery, and that seemed like overkill to me, but maybe it's the only option

tamsky

06:13:52 PM

there are a lot of options. all of them that end in *_sd_config are candidates. let us know what you pick and why:

https://prometheus.io/docs/prometheus/latest/configuration/configuration/

2018-11-29

mrwacky

04:16:38 PM

I’m sure there are other options...

2018-11-30

pecigonzalo

08:27:14 AM

Tamsky, I know the options; as I mentioned, we are already using this, and we had our own Lambda for SD

pecigonzalo

08:27:25 AM

I was trying to ping-pong ideas on how others were doing it

tamsky

08:47:26 PM

[quoting pecigonzalo] “(we had a DNS SD based on lambda and ECS events)”
[quoting pecigonzalo] “but I was looking for a ‘simpler’ solution”

I guess I was trying to help out re: “simpler” solutions.

tamsky

08:48:21 PM

[quoting pecigonzalo] “I have a similar situation for uploading new prometheus configs, without doing a docker deployment, since albeit incorrectly that was the ‘easy’ start for us but it sort of sucks”

How do you handle persistent storage for Prometheus in your Docker setup? The answer might guide us toward an easy process for updating your Prometheus configs.
