#kubecost (2019-03)

https://www.kubecost.com/

Kubernetes resource and cost management

Archive: https://archive.sweetops.com/kubecost/

2019-03-18

Erik Osterman (Cloud Posse)

@webb we are going to take a stab at the helmfile today for kubecost

Erik Osterman (Cloud Posse)

@Maxim Mironenko (Cloud Posse) is going to work on it

Maxim Mironenko (Cloud Posse)
04:44:20 AM

@Maxim Mironenko (Cloud Posse) has joined the channel

Erik Osterman (Cloud Posse)

we’re going to integrate it with our version of grafana/prometheus

Erik Osterman (Cloud Posse)

@Maxim Mironenko (Cloud Posse) might be reaching out if he gets stuck

webb

sweet! pleased to meet you @Maxim Mironenko (Cloud Posse). @Ajay Tripathy and I are here if we can help in any way!

Erik Osterman (Cloud Posse)

@Ajay Tripathy @webb will he need an IAM role for the chart? … to be able to ingest cost data and/or AWS account data?

webb

I’ll let @Ajay Tripathy confirm but you should just need the ability to allow Tiller to install charts, at least in the namespace kubecost will run in

webb

No IAM account needed for out of the box billing data!

Erik Osterman (Cloud Posse)

oh nice!

Erik Osterman (Cloud Posse)

and for the ability to pull in cost data for stuff outside of k8s? (e.g. rds)

Erik Osterman (Cloud Posse)

…or is that an enterprise feature

webb

that will require a key to access your account’s billing data, but it’s not required at installation…

webb

out of the box we just use the AWS/GCP public billing API

Erik Osterman (Cloud Posse)

does it support pod annotations? (we use kiam)

webb

it does look at pod annotations/labels for cost allocation…

webb

how are you using kiam in this context?

Erik Osterman (Cloud Posse)
kubecost/cost-analyzer-helm-chart

Contribute to kubecost/cost-analyzer-helm-chart development by creating an account on GitHub.

Erik Osterman (Cloud Posse)

@Maxim Mironenko (Cloud Posse) we’ll need to submit a PR to support annotations here for kiam

Erik Osterman (Cloud Posse)
uswitch/kiam

Integrate AWS IAM with Kubernetes. Contribute to uswitch/kiam development by creating an account on GitHub.

Erik Osterman (Cloud Posse)

by placing the iam.amazonaws.com/role annotation on a pod, we’re able to grant specific permissions to a pod (e.g. readonly AWS access)
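
As a rough illustration of the annotation described above, the following Python sketch adds iam.amazonaws.com/role to a pod manifest represented as a plain dictionary. The pod name, image, and role name (kubecost-readonly) are hypothetical placeholders, not values taken from the kubecost chart.

```python
import copy
import json

# Annotation key that kiam reads to decide which IAM role a pod may assume.
KIAM_ROLE_ANNOTATION = "iam.amazonaws.com/role"


def with_kiam_role(manifest: dict, role_name: str) -> dict:
    """Return a copy of a pod manifest annotated with the given IAM role."""
    annotated = copy.deepcopy(manifest)
    metadata = annotated.setdefault("metadata", {})
    annotations = metadata.setdefault("annotations", {})
    annotations[KIAM_ROLE_ANNOTATION] = role_name
    return annotated


# Hypothetical pod; name, image, and role are placeholders for illustration.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "cost-analyzer-example"},
    "spec": {"containers": [{"name": "app", "image": "example/image:latest"}]},
}

annotated_pod = with_kiam_role(pod, "kubecost-readonly")
print(json.dumps(annotated_pod["metadata"], indent=2))
```

In a Helm chart the same result would typically come from exposing pod annotations as a chart value, which is presumably what the proposed PR would add.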

Erik Osterman (Cloud Posse)

@Ajay Tripathy do you have a minimal IAM policy for kubecost? we don’t want to grant all readonly b/c we have a lot of secrets in SSM

webb

Ajay is taking a look now. I’m pretty sure we don’t need secrets read permission. Are there others that might be problematic?

Ajay Tripathy

Hi @Erik Osterman (Cloud Posse), we don’t need to read kubernetes secrets. I believe we currently use all the others detailed here https://github.com/kubecost/cost-analyzer-helm-chart/blob/master/cost-analyzer/templates/cost-analyzer-cluser-role-template.yaml for insights. Are there specific concerns?

Erik Osterman (Cloud Posse)

this doesn’t have to do with kubernetes secrets

Erik Osterman (Cloud Posse)

this has to do with how to access AWS resources securely from kubernetes pods

Erik Osterman (Cloud Posse)

…if we are to use kubecost to ingest data from AWS APIs, we need credentials

Erik Osterman (Cloud Posse)

hardcoding credentials is an anti-pattern

Erik Osterman (Cloud Posse)

(e.g. do not ever set AWS_ACCESS_KEY_ID or AWS_SECRET_ACCESS_KEY)

Erik Osterman (Cloud Posse)

instead, we rely on the fact that the AWS SDK automatically handles STS tokens (short lived, automatically rotated tokens)

Erik Osterman (Cloud Posse)

kiam is the “glue” that makes all of this possible in k8s on AWS

Erik Osterman (Cloud Posse)

as I recall, a recent release of kubecost added the ability to ingest resources running in an account outside of what’s running inside of the k8s cluster (e.g. an RDS database)

Erik Osterman (Cloud Posse)

in order to be able to do that, we’ll need to set up an IAM role with sufficient permissions

Erik Osterman (Cloud Posse)

anyways, it’s a very easy thing for @Maxim Mironenko (Cloud Posse) to open a PR for. . .

Erik Osterman (Cloud Posse)

more importantly, I was hoping to find out what IAM permissions were needed (or basically, which resources it currently supports indexing)

Ajay Tripathy

So, the current integration with billing data does set the AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY. We’d accept a PR to handle STS tokens; agreed it should not be hard, it just hasn’t come up before. The required IAM permissions are AmazonEC2ReadOnlyAccess and AmazonAthenaFullAccess.

Erik Osterman (Cloud Posse)

Do you use the official AWS SDK?

Erik Osterman (Cloud Posse)

(if so, then it works automatically; however, if kubecost adds extra validation that AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are set, then that may break it since they won’t be set)

Ajay Tripathy

yes, for golang.

Erik Osterman (Cloud Posse)

ok, go sdk supports it.
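
The credential behavior discussed above can be pictured as a provider chain: each source is tried in turn and the first one that yields credentials wins, so nothing needs to be hardcoded when a role is available. The sketch below is a simplified Python model of that idea, not the AWS SDK's actual code. The function names and the stubbed role provider (standing in for the metadata endpoint that kiam serves) are assumptions made for illustration.

```python
from typing import Callable, Mapping, Optional, Sequence

Credentials = dict
Provider = Callable[[], Optional[Credentials]]


def env_provider(env: Mapping[str, str]) -> Provider:
    """Static keys from the environment (the pattern the thread advises against)."""
    def provide() -> Optional[Credentials]:
        key = env.get("AWS_ACCESS_KEY_ID")
        secret = env.get("AWS_SECRET_ACCESS_KEY")
        if key and secret:
            return {"source": "environment", "access_key": key, "secret_key": secret}
        return None
    return provide


def stub_role_provider() -> Optional[Credentials]:
    """Stand-in for short-lived role credentials fetched from a metadata endpoint."""
    return {
        "source": "role",
        "access_key": "EXAMPLE-TEMPORARY-KEY",
        "secret_key": "EXAMPLE-TEMPORARY-SECRET",
        "token": "EXAMPLE-SESSION-TOKEN",
    }


def resolve(providers: Sequence[Provider]) -> Credentials:
    """Return credentials from the first provider that has any."""
    for provider in providers:
        creds = provider()
        if creds is not None:
            return creds
    raise RuntimeError("no credentials available from any provider")


# With no static keys set, resolution falls through to the role provider.
resolved = resolve([env_provider({}), stub_role_provider])
print(resolved["source"])
```

An application that insists on both environment variables being present would fail before the second provider is ever consulted, which is the concern raised above.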

2019-03-19

webb

@Erik Osterman (Cloud Posse) this access isn’t required for the initial kubecost installation. This wouldn’t be blocking you at this point would it?

Erik Osterman (Cloud Posse)

No, not blocking per se

Erik Osterman (Cloud Posse)

Just was hoping to knock it all out at once

webb

@Ajay Tripathy and I will discuss today. Might be something we can support quickly. Did you guys want to submit a PR?

Erik Osterman (Cloud Posse)

@webb a quick call with @Ajay Tripathy and we can probably sort it all out

Ajay Tripathy

@Erik Osterman (Cloud Posse) put some time on your calendar for 4:45– happy to help.

Ajay Tripathy

err, 4:45 pm PST today, to be clear.

Erik Osterman (Cloud Posse)

thanks!

2019-03-20

Maxim Mironenko (Cloud Posse)

Hey @Ajay Tripathy! May I ask you to check this PR: https://github.com/kubecost/cost-analyzer-helm-chart/pull/3

Ajay Tripathy

Hey @Maxim Mironenko (Cloud Posse) – taking a look

Ajay Tripathy

seems to still not run after the spacing fix– I can take a look

Erik Osterman (Cloud Posse)

@Ajay Tripathy you can hold off

Erik Osterman (Cloud Posse)

@Maxim Mironenko (Cloud Posse) is going to pair with @Igor Rodionov on the helm stuff (he’s just getting up to speed on helm)

Erik Osterman (Cloud Posse)

…they are working on it today (OMST)

Ajay Tripathy

Ack, thanks.

2019-03-21

Maxim Mironenko (Cloud Posse)

@Ajay Tripathy fix applied to PR, should work now

Erik Osterman (Cloud Posse)

@Ajay Tripathy @webb we’ve had some challenges getting it up and running

Erik Osterman (Cloud Posse)

@Igor Rodionov can share more details, but in short the web UI is not working correctly & no log events

Erik Osterman (Cloud Posse)

also, the chart lacks an ingress

Erik Osterman (Cloud Posse)

we can submit PRs as necessary, but I think what would really help @Maxim Mironenko (Cloud Posse) and @Igor Rodionov is to see what it should look like when working

Erik Osterman (Cloud Posse)

and we can work backwards from there

webb

@Igor Rodionov how can we be most helpful? Would you want to jump on a phone/video call?

webb

@Erik Osterman (Cloud Posse) it’s true that we don’t ship with an ingress out of the box today

Erik Osterman (Cloud Posse)
05:14:21 PM
Erik Osterman (Cloud Posse)

we did get this far

webb

hehe

webb

So that would say that KSM+Prometheus was installed correctly… that’s good

webb

Are you able to successfully port-forward?

webb

Is this on AWS?

Erik Osterman (Cloud Posse)

we are not doing port forwarding

Erik Osterman (Cloud Posse)

our objective is to expose it behind IAP (as part of our portal)

Erik Osterman (Cloud Posse)

but right now it’s public on our test account

webb

Is there an endpoint you can share?

Erik Osterman (Cloud Posse)

i’ll DM you

webb

We typically have teams get port forwarding working and then stand up an endpoint soon after.

webb
05:26:52 PM

@Maxim Mironenko (Cloud Posse) and @Igor Rodionov we’re able to successfully load your UI but it looks like one query (idleness) is returning null. We’re investigating why now.

webb

@Erik Osterman (Cloud Posse) @Maxim Mironenko (Cloud Posse) do you know why this prometheus query node_cpu_seconds_total would not be returning data on your cluster? Maybe node_exporter doesn’t have the permissions needed?
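
One way to check whether that metric is present is to run it as an instant query against the Prometheus HTTP API (/api/v1/query) and count the returned series. The sketch below assumes Prometheus is reachable at http://localhost:9090, for example through a port-forward. That address is a placeholder, and the helper names are made up for this example.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url: str, promql: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    query = urllib.parse.urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{query}"


def run_query(base_url: str, promql: str, timeout: float = 10.0) -> dict:
    """Execute the query and return the decoded JSON response (needs network access)."""
    url = build_query_url(base_url, promql)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def series_count(response: dict) -> int:
    """Number of series in a successful response; 0 means the metric has no data."""
    if response.get("status") != "success":
        raise ValueError(f"query failed: {response.get('error', 'unknown error')}")
    return len(response["data"]["result"])


# Offline example: the shape of a successful response with no matching series.
empty_response = {"status": "success", "data": {"resultType": "vector", "result": []}}
print(build_query_url("http://localhost:9090", "node_cpu_seconds_total"))
print(series_count(empty_response))
```

Against a live server, series_count(run_query(base_url, "node_cpu_seconds_total")) returning 0 would match the symptom described here.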

Igor Rodionov

hm… we need to check that

Igor Rodionov

really, we expected that helm install would guarantee all required permissions

webb

we expected that as well. we haven’t seen this before. we’ll continue investigating on our end. it does seem to be related to node exporter from what we’ve seen so far.

webb

but just to be clear… the app loads fine for us it’s just this one issue that we’re seeing..

Igor Rodionov

how about scheduling a meeting to debug this together?

Igor Rodionov

the problem is that there is poor logging in the cost-analyzer server

Igor Rodionov

so we do not know where to look

Igor Rodionov

also I do not know how you configured scrapers for prometheus

Igor Rodionov

if you can get us up to speed on that, it would be perfect

webb

yes — happy to meet, are you free in 20 mins? we’ll investigate further before then.

Igor Rodionov

can we schedule it for your evening?

Igor Rodionov

in my zone it is 23:57

Igor Rodionov

and I have a few calls before sleep (

Igor Rodionov

how about your 20:00 ?

webb

Yes, we can speak this evening. @Ajay Tripathy has to go to the airport around that time though. Could we speak at 19:30 Pacific?

Igor Rodionov

sec

Igor Rodionov

ok

Igor Rodionov

I will wake up at that time

webb

Sg, we’re also looking into this problem now. It looks like you may have had an existing node exporter deployment on this cluster? Does that sound right?

Erik Osterman (Cloud Posse)

Yep!

Erik Osterman (Cloud Posse)

We have a full kube-prometheus deployment which includes node exporter

webb

Ok, that appears to be causing the issue. Still investigating.

webb

@Erik Osterman (Cloud Posse) @Igor Rodionov we’ve been able to reproduce. We don’t reinstall node_exporter if there’s an existing installation in your cluster. That works fine with the default install. But for some reason the configuration on your node exporter isn’t allowing metrics to land in prometheus. Regardless, we’ve pushed a change so that the app still functions without any issues if you restart the kubecost-cost-analyzer pod. We’ll discuss this underlying problem further with Igor tonight. Let me know if you have any questions!

Erik Osterman (Cloud Posse)

Thanks @webb!

Erik Osterman (Cloud Posse)

maybe it’s cause our node exporter is wired up with kube-prometheus

Erik Osterman (Cloud Posse)

yet we don’t have kubecost pointed to that prometheus (which is our ultimate goal, but we thought we’d try to first get it up with the built-in prometheus and grafana)

webb

Yeah, that sounds like it could be the cause… we’ll look into it some more before our call with Igor. The positive is that not having this data just slightly limits functionality… it shouldn’t break anything

Erik Osterman (Cloud Posse)

“graceful degradation”

Igor Rodionov

Here

Erik Osterman (Cloud Posse)

@webb

webb

hmm, we’re on zoom

webb

you on another meeting id?

2019-03-25

webb

@Igor Rodionov @Maxim Mironenko (Cloud Posse) @Erik Osterman (Cloud Posse) quick update… we were able to confirm why you were missing a couple of metrics on the Kubecost frontend. The node-exporter metrics in question were introduced in v0.16.0 on 2018-05-15. It appears this test cluster is running node-exporter:v0.15.2. As mentioned last week, our app falls back gracefully but you would get a number of new metrics/fixes with a node-exporter upgrade. Anyways, just wanted to share this to close the case on root cause; no action required.
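
The version comparison behind that conclusion can be sketched in a few lines of Python. The threshold (v0.16.0) comes from the message above; the helper names are hypothetical, and the parsing assumes plain MAJOR.MINOR.PATCH tags with an optional leading "v" (pre-release suffixes are not handled).

```python
# Version in which the metrics in question were introduced, per the message above.
REQUIRED_VERSION = (0, 16, 0)


def parse_tag(image_or_tag: str) -> tuple:
    """Turn 'node-exporter:v0.15.2' or 'v0.15.2' into (0, 15, 2)."""
    tag = image_or_tag.rsplit(":", 1)[-1]
    return tuple(int(part) for part in tag.lstrip("v").split("."))


def has_newer_metrics(image_or_tag: str) -> bool:
    """True when the given node-exporter version is at least REQUIRED_VERSION."""
    return parse_tag(image_or_tag) >= REQUIRED_VERSION


for candidate in ("node-exporter:v0.15.2", "node-exporter:v0.16.0"):
    print(candidate, has_newer_metrics(candidate))
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as "0.9.0" sorting after "0.16.0".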

Erik Osterman (Cloud Posse)

We’re going to upgrade node exporter on our side
