SweetOps #kubernetes for January, 2019

Archive: https://archive.sweetops.com/kubernetes/

2019-01-08

warrenvw

Hello. I’m curious if anyone has had performance issues running kubectl against an EKS cluster? kubectl get po takes 5 seconds to complete. FWIW, when I used kops to create the cluster, kubectl get po would return quickly.

Erik Osterman (Cloud Posse)

12:53:41 AM

hrmmmmm

Erik Osterman (Cloud Posse)

12:53:55 AM

same size nodes, and same number of pods?

Erik Osterman (Cloud Posse)

12:53:59 AM

(roughly)

Erik Osterman (Cloud Posse)

12:54:13 AM

…are you using IAM authenticator with both?

warrenvw

12:54:15 AM

actually, worker nodes are bigger.

warrenvw

12:54:29 AM

let me confirm IAM authenticator

warrenvw

12:54:53 AM

yep. uses aws-iam-authenticator.

Erik Osterman (Cloud Posse)

12:57:15 AM

so kops uses aws-iam-authenticator as well…

Erik Osterman (Cloud Posse)

12:57:17 AM

hrm…

Erik Osterman (Cloud Posse)

12:57:24 AM

@Andriy Knysh (Cloud Posse) have you noticed this?

Erik Osterman (Cloud Posse)

12:57:37 AM

(btw, are you using our terraform modules for EKS?)

warrenvw

01:09:36 AM

sorry, no, at least not yet.

warrenvw

01:10:19 AM

i wanted to find out if this is an EKS thing in general.

Andriy Knysh (Cloud Posse)

01:10:43 AM

When I was testing EKS, I didn’t notice any delay

warrenvw

01:10:58 AM

okay, that’s a good data point. thanks.

Erik Osterman (Cloud Posse)

01:11:15 AM

(@Andriy Knysh (Cloud Posse) wrote all of our EKS terraform modules)

Erik Osterman (Cloud Posse)

01:11:26 AM

https://github.com/cloudposse/terraform-aws-eks-cluster

cloudposse/terraform-aws-eks-cluster

Terraform module for provisioning an EKS cluster. Contribute to cloudposse/terraform-aws-eks-cluster development by creating an account on GitHub.

Andriy Knysh (Cloud Posse)

01:11:27 AM

Maybe the Authenticator is slow to connect to AWS

warrenvw

01:14:33 AM

i’ll investigate that. thanks.

Andriy Knysh (Cloud Posse)

01:18:27 AM

Also, how do you access the kubeconfig file?

warrenvw

01:19:14 AM

default ~/.kube/config

warrenvw

01:23:30 AM

something must not be configured properly. i’m investigating. i’ll let you know what i discover.

Erik Osterman (Cloud Posse)

01:24:10 AM

sometimes using strace helps me figure out what the process is doing

Erik Osterman (Cloud Posse)

01:24:13 AM

enough to dig deeper
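
One way to narrow down where the 5 seconds go is to time token generation separately from the full kubectl call. The following is a minimal sketch, not a diagnostic tool from this thread: the helper names `time_command` and `report` are made up for illustration, and `my-cluster` is a placeholder cluster name.

```python
import shutil
import subprocess
import time


def time_command(argv, timeout=60):
    """Run argv once, discard its output, and return (elapsed_seconds, exit_code)."""
    start = time.perf_counter()
    proc = subprocess.run(
        argv,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        timeout=timeout,
    )
    return time.perf_counter() - start, proc.returncode


def report(commands):
    """Time each command whose binary is installed and print one line per command."""
    for argv in commands:
        if shutil.which(argv[0]) is None:
            print(f"skip {argv[0]}: not found on PATH")
            continue
        try:
            elapsed, code = time_command(argv)
        except subprocess.TimeoutExpired:
            print(f"{' '.join(argv)}: gave up after timeout")
            continue
        print(f"{' '.join(argv)}: {elapsed:.2f}s (exit {code})")


if __name__ == "__main__":
    # "my-cluster" is a placeholder (assumption), not a name from this thread.
    report([
        ["aws-iam-authenticator", "token", "-i", "my-cluster"],  # token generation only
        ["kubectl", "get", "po"],                                # full request, including auth
    ])
```

If the first command alone accounts for most of the delay, the authenticator (or the credential lookup behind it) is the likely suspect; if not, running kubectl with a higher verbosity such as `-v=6`, or under strace as suggested above, may show which request is slow.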

2019-01-09

webb

06:41:40 AM

@webb has joined the channel

Erik Osterman (Cloud Posse)

06:46:02 AM

Months ago, I shared a link to this awesome write up: https://medium.com/kubecost/effectively-managing-kubernetes-with-cost-monitoring-96b54464e419

Effectively Managing Kubernetes with Cost Monitoring

This is the first in a series of posts for managing Kubernetes costs. Article shows how to quickly setup monitoring for basic cost metrics.

Erik Osterman (Cloud Posse)

06:46:22 AM

I saw a demo of this yesterday and am super impressed.

Erik Osterman (Cloud Posse)

06:46:52 AM

I’ve invited @webb to #kubecost, so if you have any questions ping him.

webb

06:52:56 AM

Thanks for the kind words, @Erik Osterman (Cloud Posse)! We’re ready & available to help with tuning kube infrastructure!

2019-01-10

Igor Rodionov

09:20:55 AM

@Erik Osterman (Cloud Posse) Check this out. New Year is the time to imagine it. https://blog.giantswarm.io/the-state-of-kubernetes-2019/

The State of Kubernetes 2019

Last year I wrote a post entitled A Trip From the Past to the Future of Kubernetes. In it, I talked about the KVM and AWS versions of our stack and the imminent availability of our Azure release. I also…

sarkis

04:42:31 PM

Just got access to the GKE Serverless Add-on beta: https://cloud.google.com/knative/

Knative | Google Cloud

Knative is a Google-sponsored industry-wide project to establish the best building blocks for creating modern, Kubernetes-native cloud-based software

Erik Osterman (Cloud Posse)

07:34:11 PM

wow thats slick

sarkis

01:30:34 AM

i’m going to give it a spin… looks interesting!

sarkis

01:31:55 AM

feels a little bit like fargate is to ECS

Erik Osterman (Cloud Posse)

01:32:26 AM

yea, that’s how I interpreted it

Erik Osterman (Cloud Posse)

01:35:03 AM

hah, we’ve all heard of dind (docker in docker)

Erik Osterman (Cloud Posse)

01:35:09 AM

first time hearing kind: https://github.com/kubernetes-sigs/kind

kubernetes-sigs/kind

Kubernetes IN Docker - local clusters for testing Kubernetes - kubernetes-sigs/kind

Erik Osterman (Cloud Posse)

01:37:07 AM

I think this is pretty cool. We could leverage this for testing with geodesic.

Erik Osterman (Cloud Posse)

01:37:07 AM

https://github.com/cloudposse/geodesic/issues/204

Add Better Support for Minikube · Issue #204 · cloudposse/geodesic

what Add support for Docker for Mac (DFM) Kubernetes or Minikube why Faster LDE, prototyping Testing Helm Charts, Helmfiles howto I got it working very easily. Here's what I did (manually): Enabl…

2019-01-15

frednotet

10:13:02 AM

hi everyone! I’m struggling to implement a CI/CD with Gitlab… I have several different k8s clusters (one per stage: “test”, “dev”, “stg” and “prd”) on different aws accounts (again one per stage). I cannot find help on 2 things: how to target a specific cluster depending on the branch? and since we’re working with micro-services: how to keep a running version of my deployments on each cluster with a generic name not depending the branches names; but allowing an auto-deploy with unique names in only one stage? Could someone help me or link me to a good read/video about it? right now, I just have my fresh new cluster; I still have to install/configure everything (using helm).

Erik Osterman (Cloud Posse)

05:44:07 PM

hahaha there’s your problem.

Erik Osterman (Cloud Posse)

05:44:21 PM

I’m struggling to implement a CI/CD with Gitlab…

Erik Osterman (Cloud Posse)

05:44:41 PM

We highly recommend #codefresh.

Erik Osterman (Cloud Posse)

05:44:42 PM

https://codefresh.io/continuous-integration/codefresh-versus-gitlabci/

Codefresh vs. GitlabCI - Which one should you use

Gitlab is one of the supported GIT providers in Codefresh. In this article, we will look at the advantages of Codefresh compared to the GitlabCI platform.

Erik Osterman (Cloud Posse)

05:45:09 PM

Codefresh makes it trivial to select the cluster.

Erik Osterman (Cloud Posse)

05:45:22 PM

We’ve used different strategies.

Erik Osterman (Cloud Posse)

05:45:27 PM

e.g. for release tags

Erik Osterman (Cloud Posse)

05:45:37 PM

1.2.3-prod or 1.2.3-staging

Erik Osterman (Cloud Posse)

05:45:55 PM

for branches, I suggest using a convention.

Erik Osterman (Cloud Posse)

05:45:58 PM

e.g.

Erik Osterman (Cloud Posse)

05:46:27 PM

a branch called staging/fix-widgets would go to the staging cluster
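
The tag and branch conventions above can be sketched as a small lookup that a pipeline step could call. This is a minimal sketch under stated assumptions: the function name `stage_for_ref`, the alias table, and the fallback to `dev` are all hypothetical, and the stage names are the ones from the original question.

```python
# Stage names from the question above; one cluster per stage.
STAGES = {"test", "dev", "stg", "prd"}

# Assumed aliases so that the names used in the examples map onto those stages.
ALIASES = {"staging": "stg", "prod": "prd"}


def stage_for_ref(ref, default="dev"):
    """Return the target stage for a branch like 'staging/fix-widgets'
    or a release tag like '1.2.3-prod'. Unknown refs fall back to `default`."""
    if "/" in ref:
        candidate = ref.split("/", 1)[0]    # branch convention: prefix before the slash
    elif "-" in ref:
        candidate = ref.rsplit("-", 1)[1]   # tag convention: suffix after the last dash
    else:
        candidate = ref
    candidate = ALIASES.get(candidate, candidate)
    return candidate if candidate in STAGES else default


print(stage_for_ref("staging/fix-widgets"))  # stg
print(stage_for_ref("1.2.3-prod"))           # prd
print(stage_for_ref("feature-x"))            # dev (fallback)
```

Whether an unrecognised ref should fall back to a default stage or fail the pipeline is a design choice; failing loudly is probably safer for anything that can reach production.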

Erik Osterman (Cloud Posse)

05:47:08 PM

how to keep a running version of my deployments on each cluster with a generic name not depending the branches names;

Erik Osterman (Cloud Posse)

05:47:11 PM

Oops. Missed that.

Erik Osterman (Cloud Posse)

05:47:27 PM

So, the meta data needs to come from somewhere.

Erik Osterman (Cloud Posse)

05:47:34 PM

It can be ENVs in the pipeline configuration.

Erik Osterman (Cloud Posse)

05:47:51 PM

It can be branch or tag names. Note, you can use tags for non-production releases.

Erik Osterman (Cloud Posse)

05:48:13 PM

It can be manual when you trigger the deployments

Erik Osterman (Cloud Posse)

05:48:30 PM

@frednotet

vitaly.markov

05:57:16 PM

because Codefresh is designed for use with Kubernetes, whereas Gitlab is more general purpose

Erik Osterman (Cloud Posse)

05:57:50 PM

yes, exactly..

Erik Osterman (Cloud Posse)

05:58:07 PM

built from the ground up with support for docker, compose, swarm, kubernetes, and helm.

frednotet

08:35:57 PM

Thanks I’m reading

frednotet

08:36:33 PM

(I just finished my gitlab integration, but I still have the multiple-cluster setup, which requires me to take gitlab EE)

2019-01-17

Ajay Tripathy

04:28:14 PM

@Ajay Tripathy has joined the channel

2019-01-18

btai

07:47:43 PM

anyone have authentication problems using metrics-server with a kops cluster? Also wondering if anyone’s run into heapster continuously in a CrashLoopBackOff because of OOMKilled

btai

07:48:09 PM

I’ve tried increasing the mem limit on the heapster pod but it doesn’t seem to increase

Erik Osterman (Cloud Posse)

07:57:59 PM

I have seen that. I recall not being able to figure it out. We don’t have it happening any more. This was also on an older 1.9 kops cluster.

Erik Osterman (Cloud Posse)

07:58:15 PM

it was driving me mad

Erik Osterman (Cloud Posse)

07:58:27 PM

no matter how much memory I gave it, it had no effect

btai

09:08:48 PM

@Erik Osterman (Cloud Posse) how’d you fix it? im on 1.11.6 kops

btai

09:09:06 PM

also have you switched to metrics-server

Erik Osterman (Cloud Posse)

09:09:25 PM

all our configurations are here:

Erik Osterman (Cloud Posse)

09:09:27 PM

https://github.com/cloudposse/helmfiles/tree/master/releases

cloudposse/helmfiles

Comprehensive Distribution of Helmfiles. Works with helmfile.d - cloudposse/helmfiles

Erik Osterman (Cloud Posse)

09:09:44 PM

i never ended up fixing it on that cluster. it was a throw away.

btai

09:09:46 PM

you use prom instead?

Igor Rodionov

06:03:52 PM

not yet. using legacy - heapster

Erik Osterman (Cloud Posse)

09:09:52 PM

we do

btai

09:09:54 PM

oh wait

btai

09:09:57 PM

and heapster

btai

09:11:16 PM

you dont use heapster-nanny?

Erik Osterman (Cloud Posse)

09:11:26 PM

i don’t know the details

Erik Osterman (Cloud Posse)

09:11:34 PM

@Igor Rodionov would probably

btai

09:12:21 PM

the OOMKilled is also driving me mad

Erik Osterman (Cloud Posse)

09:12:50 PM

yea, sorry man!

Erik Osterman (Cloud Posse)

09:12:53 PM

i literally spent days on it

Erik Osterman (Cloud Posse)

09:13:04 PM

and didn’t figure it out

Erik Osterman (Cloud Posse)

09:14:03 PM

@Daren I forgot who is doing your prometheus stuff

Erik Osterman (Cloud Posse)

09:14:25 PM

I was having this problem on one of your clusters.

btai

10:56:45 PM

do you guys know why when I try to edit a deployment with kubectl edit the changes I make don’t stick?

Erik Osterman (Cloud Posse)

10:58:50 PM

usually it will emit an error when you exit kubectl edit

Erik Osterman (Cloud Posse)

10:59:00 PM

if it doesn’t check $?

Erik Osterman (Cloud Posse)

10:59:18 PM

kubectl edit ....; echo $?

btai

11:04:37 PM

weird

btai

11:05:30 PM

even if I increase the heapster deployment resource memory limit, it keeps dropping back down to 284Mi

btai

11:08:45 PM

no error btw @Erik Osterman (Cloud Posse)

$ k edit deployment heapster -n kube-system
deployment.extensions/heapster edited

Daren

11:17:06 PM

@Erik Osterman (Cloud Posse) @btai we did have the heapster issue. I believe it was traced to having too many old pods for it to handle

Daren

11:17:19 PM

It tried to load the state of every pod including dead ones
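
To check whether a cluster is carrying a lot of exited pods, the pod list can be grouped by phase. This is a minimal sketch: the function name `count_by_phase` and the sample data are made up, and it assumes input in the shape produced by `kubectl get pods --all-namespaces -o json`.

```python
from collections import Counter


def count_by_phase(pod_list):
    """Count pods per status.phase in a parsed PodList document."""
    return Counter(
        item.get("status", {}).get("phase", "Unknown")
        for item in pod_list.get("items", [])
    )


# Tiny made-up sample in the shape of a PodList.
sample = {
    "items": [
        {"metadata": {"name": "web-1"}, "status": {"phase": "Running"}},
        {"metadata": {"name": "job-a"}, "status": {"phase": "Succeeded"}},
        {"metadata": {"name": "job-b"}, "status": {"phase": "Failed"}},
        {"metadata": {"name": "job-c"}, "status": {"phase": "Succeeded"}},
    ]
}

counts = count_by_phase(sample)
# Pods in the Succeeded or Failed phase have exited but still exist as objects.
exited = counts["Succeeded"] + counts["Failed"]
print(dict(counts), "exited:", exited)
```

Against a real cluster the same function could be fed with `json.load(sys.stdin)` from the piped kubectl output.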

Erik Osterman (Cloud Posse)

11:18:03 PM

OH!! That makes sense

btai

11:17:21 PM

too many pods?

btai

11:17:42 PM

we do have a lot of pods

btai

11:18:01 PM

were u able to fix it daren via configuration?

Daren

11:20:09 PM

We switched to kube-state-metrics

btai

11:20:39 PM

so heapster just flat out stopped working for you guys

Daren

11:21:30 PM

I believe we increased its memory limit to 4GB for a while then had to ditch it

btai

11:23:16 PM

so I’m unable to increase the mem limit for some reason. ill update the deployment spec resource limit for memory to 1000Mi and it will continue to stay at 284Mi

btai

11:23:23 PM

ever run into that?

btai

11:25:36 PM

I have ~5000 pods currently in this cluster

Erik Osterman (Cloud Posse)

11:27:39 PM

I had that issue. there’s also some pod auto resizer component

Erik Osterman (Cloud Posse)

11:27:45 PM

i think that was fighting with me
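
If the auto resizer mentioned here is the addon-resizer ("pod nanny") sidecar that often ships alongside heapster, it sizes the container from a linear formula and writes the result back to the deployment, which would explain a manual edit being reverted. The sketch below only illustrates that formula. The base and per-node values and the node count are assumptions chosen for illustration, not values read from this cluster.

```python
def nanny_memory_mi(base_mi, extra_per_node_mi, nodes):
    """Linear estimate used by a resizer of this kind: base + extra-per-node * nodes (in Mi)."""
    return base_mi + extra_per_node_mi * nodes


# Assumed resizer arguments (--memory=140Mi, --extra-memory=4Mi) and a hypothetical node count.
target_mi = nanny_memory_mi(base_mi=140, extra_per_node_mi=4, nodes=36)
print(f"resizer target: {target_mi}Mi")

# A limit edited by hand on the heapster container is replaced by the resizer's
# own target on its next pass, so the effective limit is the computed one.
manual_limit_mi = 1000
effective_limit_mi = target_mi
print(f"manual: {manual_limit_mi}Mi, effective: {effective_limit_mi}Mi")
```

If that is what is happening, the limit would have to be raised through the resizer's own `--memory` and `--extra-memory` arguments rather than on the heapster container itself.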

btai

11:28:42 PM

@Erik Osterman (Cloud Posse) you had the issue where you couldnt increase the mem limit?

Erik Osterman (Cloud Posse)

11:18:24 PM

also, i think daren is talking about exited pods

Daren

11:19:10 PM

Yes

2019-01-22

btai

11:57:44 PM

@Daren since youre using kube-state-metrics, are you unable to use k top anymore

Daren

11:59:51 PM

Honestly, Ive never used it, and it appears it does not work

Daren

11:59:54 PM

# kubectl top pod
Error from server (NotFound): the server could not find the requested resource (get services http)

btai

12:03:42 AM

i see

2019-01-26

Max Moon

04:32:01 AM

https://github.com/stakater/IngressMonitorController pretty cool add-on for k8s to automatically provision health checks in 3rd party apps, these folks make a lot of great open source projects, worth checking out

stakater/IngressMonitorController

A Kubernetes controller to watch ingresses and create liveness alerts for your apps/microservices in UptimeRobot, StatusCake, Pingdom, etc. – [✩Star] if you're using it! - stakater/IngressMoni…

Erik Osterman (Cloud Posse)

04:32:33 AM

Yes, stakater is cool

Erik Osterman (Cloud Posse)

04:32:36 AM

I’ve been following them too

Erik Osterman (Cloud Posse)

04:33:03 AM

I want to deploy https://github.com/stakater/Forecastle

stakater/Forecastle

Forecastle is a control panel which dynamically discovers and provides a launchpad to access applications deployed on Kubernetes – [✩Star] if you’re using it! - stakater/Forecastle

Max Moon

04:47:21 AM

Same here, I’ve been working on a “getting started on kubernetes” blog and was looking for fun new projects to include

Max Moon

04:50:11 AM

I’ve been trying new projects out on a Digital Ocean K8s cluster, it’s multi-master + 2 workers, 100gb storage, and a LB for $30 a month

Erik Osterman (Cloud Posse)

04:50:31 AM

that’s cool

Max Moon

04:50:32 AM

not too shabby for development

Erik Osterman (Cloud Posse)

04:50:37 AM

@Igor Rodionov has been doing that too

Max Moon

04:51:25 AM

It’s honestly a very nice experience, as you know, my setup at work is very smooth already

Erik Osterman (Cloud Posse)

04:52:14 AM

haha, that said, always want to make things smoother

Erik Osterman (Cloud Posse)

04:52:31 AM

I think the ease-of-use of GKE/digital ocean k8s is what we aspire to

Erik Osterman (Cloud Posse)

04:52:37 AM

while at the same time getting the IaC control

Max Moon

04:52:50 AM

Yeah! It’s really nice to have the model to work off of

Erik Osterman (Cloud Posse)

04:53:17 AM

Especially for smaller teams that don’t need all the bells and whistles and ultimate control over every little thing

Max Moon

04:53:42 AM

Agreed. My experience with GKE was so nice and smooth, very much so what I base a lot of our tools off of. Their cloud shell is very similar in function to Geodesic, as you’re probably aware

Erik Osterman (Cloud Posse)

04:54:17 AM

Yea, I saw that. I haven’t gone deep on it, but it validates the pattern.

Erik Osterman (Cloud Posse)

04:55:12 AM

Also, #geodesic is always positioned as a superset of other tools, which means the google cloudshell fits well inside

Erik Osterman (Cloud Posse)

04:56:05 AM

But you bring up a good point.

Erik Osterman (Cloud Posse)

04:56:24 AM

I think we can improve our messaging by comparing geodesic to the google cloud shell

Max Moon

04:57:23 AM

yeah, at least as an introduction to the idea

Igor Rodionov

05:32:16 AM

@Max Moon @erik means I also use DO for my pet projects

Max Moon

05:33:54 AM

Right! We should chat

2019-01-27

deftunix

05:54:35 PM

hi everyone, i am creating a deployment example of nginx on kubernetes using manifest file and I want add prometheus monitoring on it

deftunix

05:54:55 PM

do you have some github manifest to share?

Erik Osterman (Cloud Posse)

05:56:05 PM

Are you using prometheus operator?

Erik Osterman (Cloud Posse)

05:56:35 PM

We use helmfile + prometheus operator to deploy monitoring for nginx-ingress here: https://github.com/cloudposse/helmfiles/blob/master/releases/nginx-ingress.yaml#L156

cloudposse/helmfiles

Comprehensive Distribution of Helmfiles. Works with helmfile.d - cloudposse/helmfiles

deftunix

05:58:11 PM

@Erik Osterman (Cloud Posse) yes, I am using prometheus operator pattern

deftunix

06:00:49 PM

@Erik Osterman (Cloud Posse) I see. I will analyse your code. I would just like to add monitoring on top of a simple nginx deployment like https://raw.githubusercontent.com/kubernetes/website/master/content/en/examples/controllers/nginx-deployment.yaml

Erik Osterman (Cloud Posse)

06:01:41 PM

I suggest using helm for repeatable deployments rather than raw resources

Erik Osterman (Cloud Posse)

06:01:55 PM

(unless this is just a learning exercise)

Erik Osterman (Cloud Posse)

06:02:44 PM

We install the official nginx-ingress helm chart here: https://github.com/cloudposse/helmfiles/blob/master/releases/nginx-ingress.yaml

cloudposse/helmfiles

Comprehensive Distribution of Helmfiles. Works with helmfile.d - cloudposse/helmfiles

Erik Osterman (Cloud Posse)

06:02:58 PM

(helmfile is a declarative way of deploying helm charts)

deftunix

06:03:29 PM

@Erik Osterman (Cloud Posse) yes, I am using helm. I was just trying to arrange an example based on manifest

2019-01-28

btai

08:56:17 PM

if I want to ssh into my EKS worker node, the default username is ec2user right?

btai

09:04:14 PM

ah its ec2-user

Erik Osterman (Cloud Posse)

12:40:07 AM

@btai sorry - @Andriy Knysh (Cloud Posse) is heads down today on another project for a deadline on friday

Erik Osterman (Cloud Posse)

12:40:34 AM

have you made some headway?

btai

12:41:02 AM

i figured out the ssh username, i just left my question in case someone else searches for it in the future

Erik Osterman (Cloud Posse)

12:41:10 AM

yep! that’s great.

Erik Osterman (Cloud Posse)

12:41:17 AM

We’re about to release our public slack archives (hopefully EOW)

btai

12:41:31 AM

and once i got into my worker nodes, i was able to debug my issues

btai

12:42:27 AM

i do have a suggestion for https://github.com/cloudposse/terraform-root-modules/blob/master/aws/eks/eks.tf

cloudposse/terraform-root-modules

Example Terraform service catalog of “root module” invocations for provisioning reference architectures - cloudposse/terraform-root-modules

btai

12:43:03 AM

I would change those subnet_ids to be private subnets and add a bastion module

btai

12:43:10 AM

i can help you guys do that if you’d like

Erik Osterman (Cloud Posse)

12:43:16 AM

yea, that’s a good suggestion.

Erik Osterman (Cloud Posse)

12:43:36 AM

We’d accept a PR for that (just saying…)

Erik Osterman (Cloud Posse)

12:43:36 AM

haha

2019-01-29

btai

05:19:13 PM

@Erik Osterman (Cloud Posse) i remember u mentioning an aws iam auth provider that we should use for kubernetes

btai

05:19:17 PM

which one was it? kube2iam?

Erik Osterman (Cloud Posse)

05:26:12 PM

kiam

Erik Osterman (Cloud Posse)

05:26:17 PM

(avoid kube2iam)

Erik Osterman (Cloud Posse)

05:26:40 PM

we have an example of deploying it in our helmfiles distribution

Andriy Knysh (Cloud Posse)

05:36:19 PM

@btai are you unblocked? what was the issue with worker nodes not able to access the cluster?

btai

05:39:15 PM

yes @Andriy Knysh (Cloud Posse) it was a stupid mistake, i had created the eks cluster security group but didn’t attach it to the cluster

btai

05:39:59 PM

@Erik Osterman (Cloud Posse) what was the reasoning to avoid kube2iam?

Erik Osterman (Cloud Posse)

05:41:51 PM

sec

Erik Osterman (Cloud Posse)

05:42:51 PM

thought we had a write up on it.

Erik Osterman (Cloud Posse)

05:43:12 PM

can’t find it

Erik Osterman (Cloud Posse)

05:43:32 PM

so kube2iam has a very primitive model. every node runs a daemon.

Erik Osterman (Cloud Posse)

05:44:06 PM

when a pod needs an IAM session, it queries the metadata api which is intercepted by iptables rules and routed to the kube2iam daemon

Erik Osterman (Cloud Posse)

05:44:14 PM

that part is fine. that’s how kiam works more or less.

Erik Osterman (Cloud Posse)

05:44:26 PM

the problem is if you run a lot of pods, kube2iam will DoS AWS

Erik Osterman (Cloud Posse)

05:45:00 PM

AWS doesn’t like that and blocks you. so the pod gets rescheduled to another node (or re-re-scheduled) until it starts

Erik Osterman (Cloud Posse)

05:45:24 PM

so we have this cascading problem, where one-by-one each node starts triggering rate limits

Erik Osterman (Cloud Posse)

05:45:32 PM

and then it doesn’t back off

Erik Osterman (Cloud Posse)

05:45:58 PM

so now we have 5000 pods requesting IAM credentials in an aggressive manner and basically the whole AWS account is hosed.

Erik Osterman (Cloud Posse)

05:46:07 PM

kiam has a client / server model

Erik Osterman (Cloud Posse)

05:46:21 PM

you run the servers on the masters. they are the only ones that need IAM permissions.

Erik Osterman (Cloud Posse)

05:46:42 PM

the clients request a session from the servers. the servers cache those sessions.

Erik Osterman (Cloud Posse)

05:46:58 PM

this reduces the number of instances hitting the AWS IAM APIs

Erik Osterman (Cloud Posse)

05:47:17 PM

and results in (a) faster assumed roles (b) less risk of tripping rate limits
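
The effect of the server-side cache can be illustrated with a toy model. This is not kiam's implementation: the class `CachingCredentialServer`, the `fake_assume_role` stand-in, and the 900-second TTL are all assumptions made for the sketch, and no AWS call is made.

```python
import time


class CachingCredentialServer:
    """Toy model of a central server that caches one session per role."""

    def __init__(self, fetch, ttl_seconds=900, clock=time.monotonic):
        self._fetch = fetch          # function that performs the (expensive) upstream call
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}             # role -> (expires_at, credentials)
        self.upstream_calls = 0

    def credentials_for(self, role):
        now = self._clock()
        hit = self._cache.get(role)
        if hit is not None and hit[0] > now:
            return hit[1]            # served from cache, no upstream request
        self.upstream_calls += 1
        creds = self._fetch(role)
        self._cache[role] = (now + self._ttl, creds)
        return creds


def fake_assume_role(role):
    """Stand-in for the real AWS request; returns dummy data."""
    return {"role": role, "token": "dummy"}


server = CachingCredentialServer(fake_assume_role)
for _ in range(5000):                # 5000 pods asking for the same role
    server.credentials_for("app-role")
print(server.upstream_calls)         # 1
```

In this model the number of upstream requests grows with the number of distinct roles and cache expiries rather than with the number of pods, which is the property described above.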

Andriy Knysh (Cloud Posse)

05:47:53 PM

https://docs.cloudposse.com/troubleshooting/kube2iam-assuming-role-access-denied/

btai

06:38:55 PM

@Erik Osterman (Cloud Posse) awesome that makes sense. thanks for the detailed answer!

deftunix

07:07:10 PM

hi everyone, I am deploying a prometheus operator to monitor my application. I probably misunderstood how it works

deftunix

07:07:47 PM

basically, will you have a prometheus instance for each application or servicemonitor?

deftunix

07:07:59 PM

or can you share the cluster one with your application? what is the common practice?

Andriy Knysh (Cloud Posse)

08:57:22 PM

@deftunix when you deploy prometheus operator, it will scrape all pods including the app, so you don’t need to do anything special about it

Andriy Knysh (Cloud Posse)

08:57:43 PM

here’s how we deploy it with helmfile https://github.com/cloudposse/helmfiles/blob/master/releases/prometheus-operator.yaml

cloudposse/helmfiles

Comprehensive Distribution of Helmfiles. Works with helmfile.d - cloudposse/helmfiles

Andriy Knysh (Cloud Posse)

08:58:32 PM

it will create these resources

Andriy Knysh (Cloud Posse)

08:58:50 PM

https://github.com/helm/charts/tree/master/stable/prometheus-operator/templates

helm/charts

Curated applications for Kubernetes. Contribute to helm/charts development by creating an account on GitHub.

deftunix

08:59:05 PM

yes, I have the prometheus operator running and monitoring my base infrastructure

deftunix

08:59:33 PM

I deployed it using the coreos helm chart in a monitoring namespace but my application service are not scraped

deftunix

08:59:58 PM

it’s scraping just a set of servicemonitors that seem to be predefined

Andriy Knysh (Cloud Posse)

09:00:19 PM

does the app output any logs into stdout?

deftunix

09:01:04 PM

yes! I deployed an nginx with the exporter. when I used the operator to create a servicemonitor and a prometheus instance

deftunix

09:01:10 PM

dedicated to the app, it works

deftunix

09:01:17 PM

the target appears

deftunix

09:01:54 PM

https://github.com/coreos/prometheus-operator/blob/master/Documentation/user-guides/getting-started.md

coreos/prometheus-operator

Prometheus Operator creates/configures/manages Prometheus clusters atop Kubernetes - coreos/prometheus-operator

deftunix

09:02:00 PM

I am following this

deftunix

09:02:41 PM

but I was expecting that after adding the annotation to the services, the scrape would be automatic

deftunix

09:02:58 PM

and the new target would be shown in my target list

Andriy Knysh (Cloud Posse)

09:06:07 PM

did you also deploy kube-prometheus?

Andriy Knysh (Cloud Posse)

09:06:08 PM

https://github.com/coreos/prometheus-operator#prometheus-operator-vs-kube-prometheus

coreos/prometheus-operator

Prometheus Operator creates/configures/manages Prometheus clusters atop Kubernetes - coreos/prometheus-operator

Andriy Knysh (Cloud Posse)

09:06:20 PM

https://github.com/cloudposse/helmfiles/blob/master/releases/kube-prometheus.yaml

cloudposse/helmfiles

Comprehensive Distribution of Helmfiles. Works with helmfile.d - cloudposse/helmfiles

deftunix

09:07:44 PM

in my “cluster-metrics” prometheus yes

Andriy Knysh (Cloud Posse)

09:11:27 PM

so when you install kube-prometheus, it will install a bunch of resources including https://github.com/prometheus/node_exporter

prometheus/node_exporter

Exporter for machine metrics. Contribute to prometheus/node_exporter development by creating an account on GitHub.

Andriy Knysh (Cloud Posse)

09:11:43 PM

which will scrape metrics

deftunix

09:12:03 PM

yes, from node, apiserver, kubelets, kube-statistics

deftunix

09:12:47 PM

my problem is not the cluster-metrics, because they are fully supported by default by the helm chart, but understanding how the operator pattern works
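
With the operator, targets are normally selected through labels rather than annotations: a Prometheus resource picks up ServiceMonitors via its `serviceMonitorSelector`, and each ServiceMonitor picks Services via its own `selector`. The sketch below mimics that two-step match for `matchLabels` only. All names and labels are hypothetical, and namespace selection is left out.

```python
import json


def matches(match_labels, labels):
    """True if every key/value pair in match_labels is present in labels."""
    return all(labels.get(key) == value for key, value in match_labels.items())


# Labels on the application's Service (hypothetical).
service_labels = {"app": "nginx"}

# ServiceMonitor manifest as a dict; kubectl also accepts manifests in JSON form.
service_monitor = {
    "apiVersion": "monitoring.coreos.com/v1",
    "kind": "ServiceMonitor",
    "metadata": {"name": "nginx", "labels": {"team": "web"}},
    "spec": {
        "selector": {"matchLabels": {"app": "nginx"}},  # which Services to scrape
        "endpoints": [{"port": "metrics"}],             # named port on the Service
    },
}

# Relevant part of a Prometheus resource's spec (hypothetical labels).
prometheus_spec = {"serviceMonitorSelector": {"matchLabels": {"team": "web"}}}

picked_up = matches(
    prometheus_spec["serviceMonitorSelector"]["matchLabels"],
    service_monitor["metadata"]["labels"],
)
targets_service = matches(
    service_monitor["spec"]["selector"]["matchLabels"],
    service_labels,
)
print("scraped:", picked_up and targets_service)
print(json.dumps(service_monitor, indent=2))
```

Under this model a shared cluster Prometheus can scrape an application as long as the application's ServiceMonitor carries the labels that Prometheus selects on, so a dedicated Prometheus instance per application is one option rather than a requirement.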

2019-01-30

Erik Osterman (Cloud Posse)

04:49:13 PM

TIL: https://github.com/kubernetes/kubernetes/issues/57291

Make CrashLoopBackoff timing tuneable, or add mechanism to exempt some exits · Issue #57291 · kubernetes/kubernetes

Is this a BUG REPORT or FEATURE REQUEST?: Feature request /kind feature What happened: As part of a development workflow, I intentionally killed a container in a pod with restartPolicy: Always. The…

Erik Osterman (Cloud Posse)

04:49:34 PM

Would have assumed the threshold of a CrashLoopBackoff would be configurable

Erik Osterman (Cloud Posse)

04:49:41 PM

I am working on a demo where we deliberately kill pods

Erik Osterman (Cloud Posse)

04:49:53 PM

so I want to show resiliency. oh well.
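
For planning such a demo, it helps to know how quickly the restart delay grows. The kubelet restarts a crashed container with an exponential back-off (10s, 20s, 40s, ...) capped at five minutes, and resets it after the container has run cleanly for ten minutes. A minimal sketch of that schedule, with a made-up function name:

```python
def restart_delays(restarts, initial=10, cap=300):
    """Delay in seconds before each of the first `restarts` restarts."""
    return [min(initial * 2 ** attempt, cap) for attempt in range(restarts)]


print(restart_delays(7))  # [10, 20, 40, 80, 160, 300, 300]
```

So after about five deliberate kills in a row the same container waits the full five minutes between restarts, which is worth accounting for in a live demo.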

btai

10:29:26 PM

have you guys checked out https://github.com/windmilleng/tilt