#sre (2018-11)
Prometheus, Prometheus Operator, Grafana, Kubernetes
Archive: https://archive.sweetops.com/monitoring/
2018-11-28
![pecigonzalo avatar](https://avatars.slack-edge.com/2020-02-24/954674862595_11f6ff71106151c32655_72.png)
What are you guys using with ECS and prometheus for SD?
![pecigonzalo avatar](https://avatars.slack-edge.com/2020-02-24/954674862595_11f6ff71106151c32655_72.png)
We had our own SD in Python, but I'm always afraid of hitting the API limits as we scale, as we did before
![pecigonzalo avatar](https://avatars.slack-edge.com/2020-02-24/954674862595_11f6ff71106151c32655_72.png)
(we had a DNS SD based on lambda and ECS events)
![joshmyers avatar](https://avatars.slack-edge.com/2018-11-20/483958217281_8117d6f6c62807ce9912_72.jpg)
At a previous client, who were so big they ended up having to pay for AWS API requests, a tool was written to do a single lookup a la ec2_sd and write the result to a file, and that file gets mounted inside the k8s prom. The k8s prom was multi-team, and if each team did their own lookups it would bust the limit
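A minimal sketch of the "single lookup, write it to a file" idea described above, assuming the output is consumed through Prometheus `file_sd_configs`. The function names, the `private_ip` and `team` fields, and the default port are hypothetical; the actual AWS lookup is left out, so the instance records are passed in as plain dicts.

```python
import json
import os
import tempfile


def to_target_groups(instances, port=9100):
    """Group instance records into file_sd target groups, one group per team.

    `instances` is an assumed shape: dicts with a `private_ip` key and an
    optional `team` key, as a single shared lookup might produce.
    """
    groups = {}
    for inst in instances:
        team = inst.get("team", "unknown")
        groups.setdefault(team, []).append(f"{inst['private_ip']}:{port}")
    return [
        {"targets": sorted(targets), "labels": {"team": team}}
        for team, targets in sorted(groups.items())
    ]


def write_file_sd(path, target_groups):
    """Write the target file via a temp file and rename.

    The rename keeps Prometheus from reading a half-written file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump(target_groups, handle, indent=2)
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise
```

One process runs the lookup on a schedule and calls `write_file_sd`, and every Prometheus instance that mounts the file picks up changes without making its own AWS API calls.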
![pecigonzalo avatar](https://avatars.slack-edge.com/2020-02-24/954674862595_11f6ff71106151c32655_72.png)
Yeah, we did a tool for that in Lambda, tbh it's not that complicated
![pecigonzalo avatar](https://avatars.slack-edge.com/2020-02-24/954674862595_11f6ff71106151c32655_72.png)
but I was looking for a “simpler” solution
![pecigonzalo avatar](https://avatars.slack-edge.com/2020-02-24/954674862595_11f6ff71106151c32655_72.png)
I have a similar situation for uploading new Prometheus configs without doing a Docker deployment. Doing a Docker deployment was, albeit incorrectly, the “easy” start for us, but it sort of sucks
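One way to pick up a new config without redeploying the container is Prometheus's reload endpoint. A minimal sketch, assuming Prometheus was started with `--web.enable-lifecycle` and that the config file lives on a volume that can be updated from outside the image. The function names and the default base URL are hypothetical.

```python
import urllib.request


def build_reload_request(base_url="http://localhost:9090"):
    """Build the POST request for Prometheus's /-/reload lifecycle endpoint."""
    url = base_url.rstrip("/") + "/-/reload"
    return urllib.request.Request(url, data=b"", method="POST")


def reload_prometheus(base_url="http://localhost:9090", timeout=10):
    """Ask a running Prometheus to re-read its config file.

    urlopen raises urllib.error.HTTPError if Prometheus rejects the reload,
    for example when the new config does not parse.
    """
    request = build_reload_request(base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Sending SIGHUP to the Prometheus process triggers the same reload if the lifecycle endpoint is not enabled.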
![mrwacky avatar](https://avatars.slack-edge.com/2018-08-22/423003208646_5ad1b1ba6be6b00306b3_72.jpg)
Service registry bridge for Docker with pluggable adapters - gliderlabs/registrator
![mrwacky avatar](https://avatars.slack-edge.com/2018-08-22/423003208646_5ad1b1ba6be6b00306b3_72.jpg)
@pecigonzalo
![tamsky avatar](https://avatars.slack-edge.com/2019-10-31/817094217669_6e765cea39b456597957_72.jpg)
what’s wrong with Consul for SD ?
![joshmyers avatar](https://avatars.slack-edge.com/2018-11-20/483958217281_8117d6f6c62807ce9912_72.jpg)
Maybe not what you want if that is all you are going to use Consul for
![tamsky avatar](https://avatars.slack-edge.com/2019-10-31/817094217669_6e765cea39b456597957_72.jpg)
@joshmyers so what are your reasons for not using Consul if used strictly for SD ?
![mrwacky avatar](https://avatars.slack-edge.com/2018-08-22/423003208646_5ad1b1ba6be6b00306b3_72.jpg)
ease of use, setup, deficiencies in AWS SD options, yeah, Consul is great
![joshmyers avatar](https://avatars.slack-edge.com/2018-11-20/483958217281_8117d6f6c62807ce9912_72.jpg)
I don’t have any. I’m just saying folks may not want to run a 3 node etcd cluster when they have been using AWS API as cheap service discovery
![mrwacky avatar](https://avatars.slack-edge.com/2018-08-22/423003208646_5ad1b1ba6be6b00306b3_72.jpg)
Good news, Consul is not etcd
![joshmyers avatar](https://avatars.slack-edge.com/2018-11-20/483958217281_8117d6f6c62807ce9912_72.jpg)
hah, oops, same thing. It is a thing you need to manage?
![tamsky avatar](https://avatars.slack-edge.com/2019-10-31/817094217669_6e765cea39b456597957_72.jpg)
of all the services I’ve operated/managed since 2014, consul is the least needy service I’ve met
![joshmyers avatar](https://avatars.slack-edge.com/2018-11-20/483958217281_8117d6f6c62807ce9912_72.jpg)
Nice
![tamsky avatar](https://avatars.slack-edge.com/2019-10-31/817094217669_6e765cea39b456597957_72.jpg)
self-bootstrapping EC2 ASG cluster FTW
![joshmyers avatar](https://avatars.slack-edge.com/2018-11-20/483958217281_8117d6f6c62807ce9912_72.jpg)
Have used with Nomad before and not had any issues with it, but it isn’t a managed type service, was my only point
![tamsky avatar](https://avatars.slack-edge.com/2019-10-31/817094217669_6e765cea39b456597957_72.jpg)
managed type services are good for getting started – one should have a plan for when your org’s needs or skills outgrow a managed service offering from anyone
![Erik Osterman (Cloud Posse) avatar](https://secure.gravatar.com/avatar/88c480d4f73b813904e00a5695a454cb.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0023-72.png)
…such as multi-cloud
![Erik Osterman (Cloud Posse) avatar](https://secure.gravatar.com/avatar/88c480d4f73b813904e00a5695a454cb.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0023-72.png)
yea, the all elusive multi-cloud strategy
![pecigonzalo avatar](https://avatars.slack-edge.com/2020-02-24/954674862595_11f6ff71106151c32655_72.png)
@mrwacky yeah we know about Consul and registrator, but as explained by @joshmyers that is ofc the option if you have Consul, and we don't
![pecigonzalo avatar](https://avatars.slack-edge.com/2020-02-24/954674862595_11f6ff71106151c32655_72.png)
and while it is a good, easy discovery once you have that, the question would be what you do if you don't have Consul
![pecigonzalo avatar](https://avatars.slack-edge.com/2020-02-24/954674862595_11f6ff71106151c32655_72.png)
at the moment our services don't use a mesh, as we don't need/want that yet
![pecigonzalo avatar](https://avatars.slack-edge.com/2020-02-24/954674862595_11f6ff71106151c32655_72.png)
so Consul would be there ONLY to support Prometheus discovery, and that seemed overkill to me, but maybe it's the only option
![tamsky avatar](https://avatars.slack-edge.com/2019-10-31/817094217669_6e765cea39b456597957_72.jpg)
there are a lot of options. all of the ones that end in `*_sd_config` are candidates. let us know what you pick and why.
2018-11-29
![mrwacky avatar](https://avatars.slack-edge.com/2018-08-22/423003208646_5ad1b1ba6be6b00306b3_72.jpg)
I’m sure there’s other options..
2018-11-30
![pecigonzalo avatar](https://avatars.slack-edge.com/2020-02-24/954674862595_11f6ff71106151c32655_72.png)
Tamsky, I know the options, as I mentioned we are already using this, and we had our own lambda for SD
![pecigonzalo avatar](https://avatars.slack-edge.com/2020-02-24/954674862595_11f6ff71106151c32655_72.png)
I was trying to ping/pong how others were doing it
![tamsky avatar](https://avatars.slack-edge.com/2019-10-31/817094217669_6e765cea39b456597957_72.jpg)
> (we had a DNS SD based on lambda and ECS events)
> but I was looking for a “simpler” solution
I guess I was trying to help out re: “simpler” solutions.
![tamsky avatar](https://avatars.slack-edge.com/2019-10-31/817094217669_6e765cea39b456597957_72.jpg)
> I have a similar situation for uploading new Prometheus configs without doing a Docker deployment. Doing a Docker deployment was, albeit incorrectly, the “easy” start for us, but it sort of sucks
how do you handle persistent storage for prometheus in your docker setup – that answer might guide us toward an easy process that can update your prometheus configs.