#kubecost

Kubernetes resource and cost management Archive: https://archive.sweetops.com/kubecost/

2019-10-10

How do I encrypt passwords in a Helm values.yaml? Any good documentation is appreciated. Thanks

Erik Osterman

see my answer in #kubernetes

2019-10-08

@AG if the prometheus-node-exporter is already running in the cluster, how do I use it for kubecost?

If it’s running on the default nodePort it should just get picked up by our Prometheus

You can also disable our node exporter

@ will it work if node-exporter is running in a different namespace?

Yep, it should if port is correct. Our front end (settings) will tell you if those metrics are not scraped successfully.
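
A minimal sketch of turning off the bundled node exporter through Helm values, so that the one already running in the cluster is used instead. The `prometheus.nodeExporter.enabled` key is an assumption (it presumes the chart bundles the community Prometheus chart under `prometheus`) and is not confirmed anywhere in this thread; check the chart's own values.yaml for the exact name in your version.

```yaml
# Assumed key names; verify against the chart's values.yaml before use.
prometheus:
  nodeExporter:
    enabled: false   # rely on the node exporter that is already deployed
```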

Thank you @

Of course! Let me know if we can help anywhere else

sure. Recently I started working on it.. will definitely post here if any question. Thanks again.

2019-09-26

A Practical Guide to Setting Kubernetes Requests and Limits

Setting Kubernetes requests and limits effectively has a major impact on application performance, stability, and cost. And yet working with many teams over the past year has shown us that determining the right values for these parameters is hard. For this reason, we have created this short guide and are launching a new product to help teams more accurately set Kubernetes requests and limits for their applications.

very nice @

Thank you, @Jan! Very much the work of @Ajay Tripathy and the team.

2019-08-27

2019-08-26

2019-08-23

So I have tried those, at least in our case, I can’t seem to populate those parameters from passing arguments to the helm chart

We deploy the chart passing a values.yaml file and additional override specific things (grafana address etc)

By setting specific parameters as arguments

Even after doing so if I edit the Nginx-conf config map I see the upstream grafana value isn’t set

@Jan, any chance you can share your values.yaml file? I can try to reproduce.

I can do in a few hours

When my kids are asleep

Will need to show you how we are installing the chart

so we are installing the chart via terraform and setting two parameters while installing as well as providing two values files

data "helm_repository" "kubecost" {
  url  = "https://kubecost.github.io/cost-analyzer"
  name = "kubecost"
}

resource "helm_release" "kubecost" {
  name       = "kubecost"
  namespace  = "monitoring"
  chart      = "kubecost/cost-analyzer"
  repository = "${data.helm_repository.kubecost.name}"

  set {
    name  = "ingress.hosts[0]"
    value = "kubecost.${local.cluster_domain}"
  }

  set {
    name  = "grafana.domainName[0]"
    value = "grafana.${local.cluster_domain}"
  }

  values = [
    "${file("defaults.yaml")}",
    "${file("stage.yaml")}",
  ]
}

so ingress.hosts works 100% as expected

grafana.domainName does not work as I would expect

defaults.yaml:

global:
  prometheus:
    enabled: false
    fqdn: http://promop-prometheus-operator-prometheus.monitoring.svc.cluster.local:9090

  grafana:
    enabled: false
    scheme: "http"

  notifications:
    alertmanager:
      fqdn: http://promop-prometheus-operator-alertmanager.monitoring.svc.cluster.local:9093

# https://kubecost.com/install
# This token is not really secret and can be exchanged at any time
kubecostToken: xxxxxx
serviceMonitor:
  enabled: true

kubecost:
  limits:
    cpu: 300m
    memory: 128Mi

kubecostFrontend:
  limits:
    cpu: 30m
    memory: 56Mi

kubecostModel:
  limits:
    cpu: 400m
    memory: 128Mi

ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: traefik

grafana:
  sidecar:
    dashboards:
      enabled: true
    datasources:
      enabled: true

stage.yaml is only used to enable slack notifications for specific accounts

global:
  notifications:
    slack:
      enabled: true
      webhook: https://hooks.slack.com/services/xxxxxx

that also does not seem to be working currently

well it doesn’t get set

works if i set it via the kubecost site after

Thanks for sharing! You mean to be using a ServiceMonitor, correct? That looks like the only part I haven’t tested recently. @Jeremy Grodberg actually has a similar config I believe.

so everything else is working fine

I just dont seem to be able to preset those parameters (slack and grafana address)

that said I have not yet pulled the chart apart

Jeremy Grodberg

@Jan As I explained before, the setting you need is global.grafana.domainName but you are only setting grafana.domainName https://sweetops.slack.com/archives/CF9SY7QTB/p1566493293009000

The settings are under global. You need to set

global:
  grafana:
    enabled: false
    domainName: <external domain name of your Grafana>
    scheme: <http or https for accessing your Grafana>

https://github.com/cloudposse/helmfiles/blob/103b1c3b5ab68b568307017f159f07431c07d8b3/releases/kubecost.yaml#L45-L48
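
Applied to the defaults.yaml shared in this thread, that advice would look roughly like the sketch below. `grafana.example.com` is a placeholder, not a value from the thread. With the Terraform `set` block, the equivalent change would presumably be to use the name `global.grafana.domainName` instead of `grafana.domainName[0]`.

```yaml
global:
  prometheus:
    enabled: false
    fqdn: http://promop-prometheus-operator-prometheus.monitoring.svc.cluster.local:9090

  grafana:
    enabled: false
    # Placeholder host; substitute the external domain of your own Grafana.
    domainName: grafana.example.com
    scheme: "http"
```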

ah yea!

which I should probably just do

any idea on the slack notification where im going wrong?

Jeremy Grodberg

The slack notification is a feature of AlertManager. If you are using your own Prometheus then you are also using your own AlertManager and have to configure Slack there.
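
For reference, a Slack receiver in a self-managed Alertmanager can be sketched as below. The receiver name and channel are hypothetical, the webhook URL is the redacted placeholder from this thread, and how the file is delivered depends on how your Alertmanager is deployed.

```yaml
# alertmanager.yml (fragment); names and channel are placeholders
route:
  receiver: slack-notifications        # send all alerts to the Slack receiver
receivers:
  - name: slack-notifications
    slack_configs:
      - api_url: https://hooks.slack.com/services/xxxxxx
        channel: '#alerts'
        send_resolved: true            # also notify when an alert clears
```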

ah I see

cheers mate

much appreciated!

Jan, looks like you got this resolved but let me know if we can help with anything else! Our product can actually integrate directly with Slack too, but we have not exposed this integration via Helm yet.

Thanks bro, I think I have everything yea

2019-08-22

So despite setting the grafana.domainName

the nginx-conf config map still uses

upstream grafana {
        server cost-analyzer-grafana.default.svc.cluster.local;

how do I change these settings from the chart?

Erik Osterman
cloudposse/helmfiles

Comprehensive Distribution of Helmfiles. Works with helmfile.d - cloudposse/helmfiles

Jeremy Grodberg

The settings are under global. You need to set

global:
  grafana:
    enabled: false
    domainName: <external domain name of your Grafana>
    scheme: <http or https for accessing your Grafana>

https://github.com/cloudposse/helmfiles/blob/103b1c3b5ab68b568307017f159f07431c07d8b3/releases/kubecost.yaml#L45-L48

Erik Osterman

@Jan

2019-08-20

Jeremy Grodberg
04:06:49 PM

@Jeremy Grodberg has joined the channel

2019-08-19

any one ever run into the cost-analyzer-frontend container spitting the following error ?

[emerg] host not found in upstream "cost-analyzer-grafana.default.svc.cluster.local" in /etc/nginx/conf.d/default.conf:11

ah oki found the issue

I had global.grafana.enabled: false yet was passing a custom domain

mmm no thats not it

Erik Osterman

do you have a service called cost-analyzer-grafana in the default namespace?

we also dont use the built in Prometheus and grafana as we have our own running

Erik Osterman

@Jeremy Grodberg has had pretty good success. we have it all up and running on our prometheus and grafana. @Ajay Tripathy @ have been a huge help.

Erik Osterman

our helmfiles are up to date

Thanks, Erik. @Jan, happy to connect if we can be useful!

Thanks @Erik Osterman I will take a look, @ I will shout

Erik Osterman

@Jan are you still using helmfile?

Will need to check

Tomorrow

We don’t use helm file at all

2019-06-26

Ryan Richards

@Erik Osterman what is the best way to convert standard helm charts into the helmfile format?

Ryan Richards

or is simply the matter of a helmfile pointing to a helm chart?

Erik Osterman

yep! the latter

Erik Osterman

all a helmfile does is describe how to install a helm chart
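
As a rough illustration of that point, a helmfile wrapping the kubecost chart could be as small as the sketch below. The repository URL, chart name, and namespace are taken from the Terraform snippet elsewhere in this archive; the values file name is hypothetical, and this is not the cloudposse helmfile mentioned in the channel.

```yaml
# helmfile.yaml: a sketch under the assumptions stated above
repositories:
  - name: kubecost
    url: https://kubecost.github.io/cost-analyzer

releases:
  - name: kubecost
    namespace: monitoring
    chart: kubecost/cost-analyzer
    values:
      - defaults.yaml   # hypothetical values file
```

Running `helmfile sync` against such a file installs or upgrades the release it describes.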

Erik Osterman

have you seen our #helmfile for kubecost?

Ryan Richards

ah great - now it makes more sense! thanks

2019-06-14

Ryan Richards

Will kubecost work on GKE? just now learning about it

Yep It will @Ryan Richards!

Ryan Richards

I just deployed it to test it out lol yeah works great

Ryan Richards

really liking the helmfile util

Erik Osterman

@Ryan Richards which helmfile did you use?

Ryan Richards

@Erik Osterman we are playing with a few from your repo

Erik Osterman

sweet! thanks for letting me know

@Ryan Richards great to hear! Let me know if there are additions that would be useful

2019-06-05

2019-06-04

Erik Osterman

Where is that from?

Erik Osterman

hilarious

Erik Osterman

@sarkis pointed it out to me

2019-05-22

Finally at the point where I can explore kubecost again

@ are the instructions on https://kubecost.com/install the ones to use?

Those are the ones Jan! Excited to hear more of your feedback now that we’ve publicly launched :)

Erik Osterman

@Jan we have a Helmfile that @Jeremy Grodberg is working on

wikid

Anything I can jump into?

I have the basics running already, does look pretty slick thus far

will explore more over the next few days

Erik Osterman

Our effort has gone into running it with existing grafana, Prometheus and keycloak

cool cool

I am currently running with preexisting prom and grafana too

went really smoothly

2019-04-08

Hey guys, @, @Ajay Tripathy! We’ve managed to upgrade node-exporter to v0.17.0 and also got kubecost working with our prometheus and grafana. We are still missing one panel (top left) on the kubecost homepage. May I ask you to take a look at it? We’ve closed the ingress with basic http auth, so I will share credentials in a private chat.

@ we’re happy to take a look!! Shoot me credentials when you can!

Hey guys, we investigated this issue and it looks like Prometheus (prometheus-kube-prometheus-0 : monitoring) is being throttled for some reason. This is causing kubecost queries to return slowly and occasionally timeout. We’ve tested on 50+ nodes recently so this behavior is unexpected. Several observations on our end: 1) prometheus is being cpu throttled and 2) mem usage is 3x mem request. Any ideas on your end? We can keep investigating but wanted to check in first. @ @Erik Osterman

2019-03-25

@Igor Rodionov @ @Erik Osterman quick update… we were able to confirm why you were missing a couple of metrics on the Kubecost frontend. The node-exporter metrics in question were introduced in v0.16.0 on 2018-05-15. It appears this test cluster is running node-exporter:v0.15.2. As mentioned last week, our app falls back gracefully, but you would get a number of new metrics/fixes with a node-exporter upgrade. Anyway, just wanted to share this to close the case on root cause. No action required.

Erik Osterman

We’re going to upgrade node exporter on our side

2019-03-21

@Ajay Tripathy fix applied to PR, should work now

Erik Osterman

@Ajay Tripathy @ we’ve had some challenges getting it up and running

Erik Osterman

@Igor Rodionov can share more details, but in short the web UI is not working correctly & no log events

Erik Osterman

also, the chart lacks an ingress

Erik Osterman

we can submit PRs as necessary, but I think what would really help @ and @Igor Rodionov is to see what it should look like when working

Erik Osterman

and we can work backwards from there

@Igor Rodionov how can we be most helpful? Would you want to jump on phone/video call

@Erik Osterman it’s true that we don’t ship with an ingress out of the box today

Erik Osterman
05:14:21 PM
Erik Osterman

we did get this far

So that would say that KSM+Prometheus was installed correctly… that’s good

Are you able to successfully port-forward?

Is this on AWS?

Erik Osterman

we are not doing portforwarding

Erik Osterman

our objective is to expose it behind IAP (as part of our portal)

Erik Osterman

but right now it’s public on our test account

Is there an endpoint you can share?

Erik Osterman

i’ll DM you

We typically have teams get port forwarding working and then stand up an end point soon after.

05:26:52 PM

@ and @Igor Rodionov we’re able to successfully load your UI but it looks like one query (idleness) is returning null. We’re investigating why now.

@Erik Osterman @ do you know why this prometheus query node_cpu_seconds_total would not be returning data on your cluster? Maybe node_exporter doesn’t have the permissions needed?

Igor Rodionov

hm… we need to check that

Igor Rodionov

really we expected that helm install will guarantee all required permissions

we expected that as well. we haven’t seen this before. we’ll continue investigating on our end. it does seem to be related to node exporter from what we’ve seen so far.

but just to be clear… the app loads fine for us it’s just this one issue that we’re seeing..

Igor Rodionov

how about scheduling a meeting to debug this together?

Igor Rodionov

the problem is that there is poor logging in the cost-analyzer server

Igor Rodionov

so we do not know where to look

Igor Rodionov

also I do not know how you configured the scrapers for prometheus

Igor Rodionov

if you can get us up to speed on that, it would be perfect

yes — happy to meet, are you free in 20 mins? we’ll investigate further before then.

Igor Rodionov

can we schedule it your evening?

Igor Rodionov

in my zone it is 23:57

Igor Rodionov

and I have few calls before sleep (

Igor Rodionov

how about your 20:00 ?

Yes, we can speak this evening. @Ajay Tripathy has to go to the airport around that time though. Could we speak at 19:30 Pacific?

Igor Rodionov

sec

Igor Rodionov

ok

Igor Rodionov

I will wake up that time

Sg, we’re also looking at this problem now. It looks like you may have had an existing node exporter deployment on this cluster? Does that sound right?

Erik Osterman

Yep!

Erik Osterman

We have a full kube-prometheus deployment which includes node exporter

Ok, that appears to be causing the issue. Still investigating.

@Erik Osterman @Igor Rodionov we’ve been able to reproduce. We don’t reinstall node_exporter if there’s an existing installation in your cluster. That works fine with the default install. But for some reason the configuration on your node exporter isn’t allowing metrics to land in prometheus. Regardless, we’ve pushed a change so that the app still functions without any issues if you restart the kubecost-cost-analyzer pod. We’ll discuss this underlying problem further with Igor tonight. Let me know if you have any questions!

Erik Osterman

Thanks @!

Erik Osterman

maybe it’s cause our node exporter is wired up with kube-prometheus

Erik Osterman

yet we don’t have kubecost pointed to that prometheus (which is our ultimate goal, but we thought we’d try to first get it up with the built-in prometheus and grafana)

Yeah, that sounds like it could be the cause… we’ll look into some more before our call with Igor. Positive is that not having this data just slightly limits functionality… it shouldn’t break anything

Erik Osterman

“graceful degradation”

Igor Rodionov

Here

Erik Osterman

@

hmm, we’re on zoom

you on another meeting id?

2019-03-20

Hey @Ajay Tripathy! May I ask you to check this PR: https://github.com/kubecost/cost-analyzer-helm-chart/pull/3

Ajay Tripathy

Hey @ – taking a look

Ajay Tripathy

seems to still not run after the spacing fix– I can take a look

Erik Osterman

@Ajay Tripathy you can hold off

Erik Osterman

@ is going to pair with @Igor Rodionov on the helm stuff (he’s just getting up to speed on helm)

Erik Osterman

…they are working on it today (OMST)

Ajay Tripathy

Ack, thanks.

2019-03-19

@Erik Osterman this access isn’t required for the initial kubecost installation. This wouldn’t be blocking you at this point would it?

Erik Osterman

No, not blocking per se

Erik Osterman

Just was hoping to knock it all out at once

@Ajay Tripathy and I will discuss today. Might be something we can support quickly. Did you guys want to submit a PR?

Erik Osterman

@ a quick call with @Ajay Tripathy and we can probably sort it all out

Ajay Tripathy

@Erik Osterman put some time on your calendar for 4:45– happy to help.

Ajay Tripathy

err, 4:45 pm PST today, to be clear.

Erik Osterman

thanks!

2019-03-18

Erik Osterman

@ we are going to take a stab at the helmfile today for kubecost

Erik Osterman

@ is going to work on it

04:44:20 AM

@ has joined the channel

Erik Osterman

we’re going to integrate it with our version of grafana/prometheus

Erik Osterman

@ might be reaching out if he gets stuck

sweet! pleased to meet you @. @Ajay Tripathy and I are here if we can help in any way!

Erik Osterman

@Ajay Tripathy @ will he need an IAM role for the chart? … to be able to ingest cost data and/or AWS account data?

I’ll let @Ajay Tripathy confirm but you should just need the ability to allow Tiller to install charts, at least in the namespace kubecost will run in

No IAM account needed for out of the box billing data!

Erik Osterman

oh nice!

Erik Osterman

and for the ability to pull in cost data for stuff outside of k8s? (e.g. rds)

Erik Osterman

…or is that an enterprise feature

that will require a key to access your accounts billing data but it’s not required at installation…

out of the box we just use this AWS/GCP public billing api

Erik Osterman

does it support pod annotations? (we use kiam)

it does look at pod annotations/labels for cost allocation…

how are you using kiam in this context?

Erik Osterman

@ we’ll need to submit a PR to support annotations here for kiam

Erik Osterman
uswitch/kiam

Integrate AWS IAM with Kubernetes. Contribute to uswitch/kiam development by creating an account on GitHub.

Erik Osterman

by placing the iam.amazonaws.com/role annotation on a pod, we’re able to grant specific permissions to a pod (e.g. readonly AWS access)
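
A minimal sketch of what that annotation looks like on a pod. The pod name, role name, and image are hypothetical placeholders; only the annotation key comes from the message above. kiam additionally expects the pod's namespace to carry an `iam.amazonaws.com/permitted` annotation that allows the role.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cost-analyzer-example                   # hypothetical pod name
  annotations:
    iam.amazonaws.com/role: kubecost-readonly   # hypothetical IAM role name
spec:
  containers:
    - name: app
      image: example/image:tag                  # placeholder image
```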

Erik Osterman

@Ajay Tripathy do you have a minimal IAM policy for kubecost? we don’t want to grant all readonly b/c we have a lot of secrets in SSM

Ajay is taking a look now. I’m pretty sure we don’t need secrets read permission. Are there others that might be problematic?

Ajay Tripathy

Hi @Erik Osterman, we don’t need to read kubernetes secrets. I believe we currently use all the others detailed here https://github.com/kubecost/cost-analyzer-helm-chart/blob/master/cost-analyzer/templates/cost-analyzer-cluser-role-template.yaml for insights. Are there specific concerns?

kubecost/cost-analyzer-helm-chart

Contribute to kubecost/cost-analyzer-helm-chart development by creating an account on GitHub.

Erik Osterman

this doesn’t have to do with kubernetes secrets

Erik Osterman

this has to do with how to access AWS resources securely from kubernetes pods

Erik Osterman

…if we are to use kubecost to ingest data from AWS APIs, we need credentials

Erik Osterman

hardcoding credentials is an anti-pattern

Erik Osterman

(e.g. do not ever set AWS_ACCESS_KEY_ID or AWS_SECRET_ACCESS_KEY)

Erik Osterman

instead, we rely on the fact that the AWS SDK automatically handles STS tokens (short lived, automatically rotated tokens)

Erik Osterman

kiam is the “glue” that makes all of this possible in k8s on AWS

Erik Osterman

as I recall, a recent release of kubecost added the ability to ingest resources running in an account outside of what’s running inside of the k8s cluster (e.g. an RDS database)

Erik Osterman

in order to be able to do that, we’ll need to setup an IAM role with sufficient permissions

Erik Osterman

anyways, it’s a very easy thing for @ to open a PR for. . .

Erik Osterman

more importantly, I was hoping to find out what IAM permissions were needed (or basically, which resources it currently supports indexing)

Ajay Tripathy

So, the current integration with billing data does set the AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY. We’d accept a PR to handle STS tokens. Agreed it should not be hard, it just hasn’t come up before. The required IAM permissions are AmazonEC2ReadOnlyAccess and AmazonAthenaFullAccess.

Erik Osterman

Do you use the official AWS SDK?

Erik Osterman

(if so, then it works automatically; however if kubecost adds extra validation that AWS_ACCESS_KEY_ID, and AWS_SECRET_ACCESS_KEY are set, then that may break it since they won’t be set)

Ajay Tripathy

yes, for golang.

Erik Osterman

ok, go sdk supports it.

2019-02-14

Erik Osterman
05:20:32 AM

@Erik Osterman set the channel purpose: Kubernetes resource and cost management Archive: https://archive.sweetops.com/kubecost/

2019-02-07

richwine
06:44:13 AM

@richwine has joined the channel

2019-01-29

bamaral
01:34:55 PM

@bamaral has joined the channel

Ajay Tripathy

Hey guys, we just built a tool to help understand why the cluster autoscaler is scaling up or down. It’s also generally useable to determine whether nodes are safe to turn down whether you use autoscaling or not. We wrote a blog post about it, here’s a pre-release draft: https://medium.com/@ajaytripathy/understanding-kubernetes-cluster-autoscaling-675099a1db92 . Would love any feedback!

Understanding Kubernetes Cluster Autoscaling – Ajay Tripathy – Medium

One of the great promises of using Kubernetes is that it has the ability to scale your infrastructure dynamically based on user demand.

Erik Osterman

@Dylan

Dylan
11:40:43 PM

@Dylan has joined the channel

2019-01-28

05:07:26 PM

@ has joined the channel

2019-01-27

08:39:27 AM

@ has joined the channel

2019-01-25

davidvasandani
05:52:18 PM

@davidvasandani has joined the channel

2019-01-18

Dan Garfield
10:09:06 PM

@Dan Garfield has joined the channel

2019-01-17

Morning

How do I get started?

Erik Osterman

@ can I share that document you shared with me?

Erik Osterman

… if so, okay to post here in this channel?

@Jan sorry for just seeing this… here’s the link to our first pilot install doc! it would be awesome to get your initial thoughts. all feedback is welcome https://docs.google.com/document/d/1_aYbaq6IZR4tpeltA8HzWnnEhODhMZF0eysfv3VbzPg/edit

Haha thanks

No problem!

@Erik Osterman does it look like this setup process will fit with the way you build helm charts?

Will take a dog tomorrow

lol I meant a dig

Sounds good!

Ajay Tripathy
04:27:56 PM

@Ajay Tripathy has joined the channel

05:42:54 PM

@ has joined the channel

2019-01-12

05:35:58 PM

@ has joined the channel

2019-01-10

dustinvb
04:52:02 PM

@dustinvb has joined the channel

this is fantastic!

thanks, @Adam! we’re just getting started so lots more on the way

2019-01-09

Erik Osterman
06:35:13 AM

@Erik Osterman has joined the channel

06:35:13 AM

@ has joined the channel

Erik Osterman
06:35:13 AM

@Erik Osterman set the channel purpose: Kubernetes resource and cost management

Erik Osterman
06:35:37 AM

@Erik Osterman set the channel topic: https://www.kubecost.com/

Jan
06:36:28 AM

@Jan has joined the channel

Erik Osterman

I was telling @Jan about KubeCost today

Erik Osterman

One of his top priorities is to reduce spend by moving to k8s. It’s too early for them right now, but I think KubeCost directly suits their use-case.

Erik Osterman

@ gave me an awesome demo yesterday

Erik Osterman

can’t wait to add it to our distribution

Igor Rodionov
06:39:52 AM

@Igor Rodionov has joined the channel

aknysh
06:39:52 AM

@aknysh has joined the channel

joshmyers
06:39:52 AM

@joshmyers has joined the channel

awesome, we’re available to help any time @Jan!

thanks for creating the channel and the shoutout @Erik Osterman

Erik Osterman
06:51:13 AM
Erik Osterman

grafana dashboards for #kubernetes cost management

Erik Osterman

plus they have some other dashboards to help “right size” pods

Hey hey

Yea it will suit our needs really well

And will start to be useful nearly right away

Grafana-based is even better

We will be moving all monitoring away from New Relic and into in-cluster Prometheus / Grafana during our migration as it is

patrickleet
07:06:46 AM

@patrickleet has joined the channel

Erik Osterman

@patrickleet this is the dashboard for kube cost management I was telling you about. started also by xgooglers

tamsky
07:08:25 AM

@tamsky has joined the channel

Adam
07:10:20 AM

@Adam has joined the channel

Nice, @Jan! Lots more functionality on the way. Happy to discuss everything that we’re building if it will be helpful.

I’m used to building functionality like this specifically for k8s and aws so I may be of use for feedback and deep diving

@Jan love it! let’s talk when you’re ready

Absolutely, it’s currently not redundant but it’s working

Oh. Sorry, mixing up conversations

Still super happy to contribute :)
