#kubernetes (2018-11)
Archive: https://archive.sweetops.com/kubernetes/
2018-11-04

hi all, what are your opinions on the different networking options for kubernetes on aws? Which is preferred and feels most robust? We tried aws-vpc-cni but felt that it’s not stable enough, even with 1.1.0 on kubernetes 1.10.6. It gets more unstable when your worker nodes are unstable, and starts throwing exceptions such as sandbox IP changed, etc.

we then switched to calico, but we observed that it affects the way pods terminate. If we delete a deployment, pods remain in the terminating state for 5+ minutes.

The pods stuck in a terminating state is a very frequently observed problem. Could it be related to the network layer? Maybe - but I would explore other possibilities. To me the network culprit seems like a red herring.

Lots of posts/issues on it. Usually related to zombies.
2018-11-06

@rohit.verma we saw something like that with some k8s pods, in particular kiam - when deleted, the pods take many minutes to terminate

so maybe it’s an issue with some deployments, not the network itself?

But the pods I am referring to here are generic, like nginx or a spring boot app

Anyways more concerned about a general opinion on different kubernetes networks

We haven’t had the opportunity to explore/optimize the network layer in k8s

Also are you familiar with the dumb-init “fix” ?

This is to address the same symptoms

A minimal init system for Linux containers. Contribute to Yelp/dumb-init development by creating an account on GitHub.

@Andriy Knysh (Cloud Posse) hello again, do you have any doco’s and best practices for promoting kube, within nonp? … ie … dev to staging?

for clarification, are you talking about promoting images and helm charts? or promoting usage of kubernetes within a company

right now we are using diff name spaces in k8s

currently it is within company

So the same cluster for staging and production?

dev and staging

bump

sorry, i let this fall through the cracks.

we don’t have a well documented process for what you want. we’ve implemented and documented it internally for customers, but still need to document it on our site.

we have something rough here: https://docs.cloudposse.com/release-engineering/cicd-process/

also, looks like the video was taken down =/

nice

this is the same thing that I had in mind …

so what is your view on databases with persistent volumes?

Use fully managed databases for anything you care about

Use database containers for disposable environments

So for example, when we deploy environments for every PR we use containers


what are your thoughts on some of the work that Kelsey Hightower has done in this space? https://github.com/kelseyhightower/pipeline
A step by step guide on creating build and deployment pipelines for Kubernetes. - kelseyhightower/pipeline

haven’t taken a look at it

Nonp?

ya we run two accounts … where one is prod and one is non-prod

and all our non-prod stuff happens in nonp

Aha gotcha

We don’t have the promotion process documented but I can share how it looks (we use Codefresh)

I am currently on my phone so will share a little later

np sounds good

kops or terraform for creating a production kubernetes cluster: which is better, and what are the cons?

@Tee we use terraform to create kops resources, e.g.

Collection of Terraform root module invocations for provisioning reference architectures - cloudposse/terraform-root-modules


and then use kops to provision k8s clusters

there was some discussion earlier in #terraform I think related to EKS

we also have TF modules for EKS

@Tee are you thinking GCP or AWS?

AWS

so on AWS, my opinion is that it’s more work than necessary to manage EKS with terraform. the challenge comes down to upgrading. there are some discussions on strategies for that.

@Andriy Knysh (Cloud Posse) are ya’ll using it with kops? Looks like it. How does TF generation fit in if at all?

with kops, the ability to do rolling updates is built in; a purpose-built tool like kops will do a better job at managing lifecycles.

if fargate announces EKS support at the end of the month, I might change my stance

But EKS and Fargate get pretty expensive

as far as i can tell

humans aren’t cheap either

Right

So what do you suggest for the long term, not considering the cost? With fewer bottlenecks and nightmares

not an easy question

I mean in terms of stability

kops is well established and works well, and does lifecycle management

EKS is new and lacks a lot of features, but it will stay and they will improve it

Fargate will improve and cost will be reduced

(we’re not using EKS in production yet, so our story will be biased towards kops)

Oh ok. Thanks @Erik Osterman (Cloud Posse) & @Andriy Knysh (Cloud Posse) for your suggestions.

yea, the point is that with the current state of EKS, you need to do and provision even more resources than using kops

and it does not support many features

Fargate could improve it, but as many mentioned it’s costly (and it does not exist yet)

That makes sense

I am currently moving all of our infrastructure off Mesosphere DC/OS onto EKS and EKS has been phenomenal in my opinion - just lots of support from many different aspects such as AWS and the Kubernetes community

as well as great folks like Cloud Posse

yea thanks @Matthew

the point is that with EKS, if for example you need to perform a rolling update, it’s not supported out of the gate

so a lot of friction with many things

with kops it just works

but sure for longterm EKS/Fargate would be better

Yeah i’ve talked with an EKS specialist from AWS and they currently suggest a blue/green strategy for upgrading, which can be tedious and at times break backwards compatibility

how do you export a single context of your kubeconfig?

say my local kubeconfig has dev, qa, and prod contexts

@btai we don’t have multiple contexts. We use the containers + ENV vars pattern (implemented in geodesic + repo per env + Dockerfile(s)). So in each container (prod, staging, dev, etc), when we run it, we have all ENV vars defined for that particular env (ENV vars come from Dockerfiles or from SSM if they are secrets). That includes everything for Terraform, kops, k8s, etc.

So when we do for example kops export kubecfg, the environment knows what context we want

and we can run those geodesic containers locally and also in CI/CD pipelines (for which we use Codefresh since it can run each pipeline step as a Docker container)

nice thanks
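
For the original question (pulling one context out of a kubeconfig that holds several), a minimal sketch follows. It treats the kubeconfig as an already-parsed dict and keeps only the named context plus the cluster and user it points at. The sample data, the names `dev-cluster` / `dev-user`, and the helper `extract_context` are all hypothetical. To the best of my knowledge, `kubectl config view --minify --flatten --context=dev` gives a similar result from the command line.

```python
import copy
import json


def extract_context(kubeconfig, context_name):
    """Return a new kubeconfig dict holding only one context and the
    cluster and user entries that the context refers to."""
    contexts = {c["name"]: c for c in kubeconfig.get("contexts", [])}
    if context_name not in contexts:
        raise KeyError(f"context {context_name!r} not found")
    ctx = contexts[context_name]
    cluster_name = ctx["context"]["cluster"]
    user_name = ctx["context"]["user"]
    return {
        "apiVersion": "v1",
        "kind": "Config",
        "current-context": context_name,
        "contexts": [copy.deepcopy(ctx)],
        "clusters": [copy.deepcopy(c) for c in kubeconfig.get("clusters", [])
                     if c["name"] == cluster_name],
        "users": [copy.deepcopy(u) for u in kubeconfig.get("users", [])
                  if u["name"] == user_name],
    }


# Hypothetical kubeconfig with dev, qa and prod contexts (values are placeholders).
SAMPLE_KUBECONFIG = {
    "apiVersion": "v1",
    "kind": "Config",
    "current-context": "prod",
    "clusters": [
        {"name": f"{env}-cluster", "cluster": {"server": f"https://{env}.example.invalid"}}
        for env in ("dev", "qa", "prod")
    ],
    "users": [
        {"name": f"{env}-user", "user": {"token": "REDACTED"}}
        for env in ("dev", "qa", "prod")
    ],
    "contexts": [
        {"name": env, "context": {"cluster": f"{env}-cluster", "user": f"{env}-user"}}
        for env in ("dev", "qa", "prod")
    ],
}

dev_only = extract_context(SAMPLE_KUBECONFIG, "dev")
# JSON is valid YAML, so this output can be saved and used as a kubeconfig file.
print(json.dumps(dev_only, indent=2))
```

The result can be written to its own file and selected through the `KUBECONFIG` environment variable, which fits the one-environment-per-container pattern described above.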

Have you guys used Codefresh enterprise? I know you’re all big into codefresh here. Just curious of any pitfalls or bits of advice you guys have

(enterprise to run on-prem)

So Codefresh enterprise has 3 variations: full SaaS, hybrid and on-prem

we’ve been working exclusively with the enterprise SaaS

Ooo

what’s the primary driver for going on-prem?

compliance requiring no dependence on external SaaS providers


which compliance certification?

oh wow, you haven’t even taken me out on a date yet to be asking such risqué questions.

lol jk, I think fedramp

lol

ok - that’s a whole ’nother cup of tea

not familiar with

but sounds like you’d need full on-prem.

so i probably wouldn’t enlighten you more than you already know

No worries. We’re new to codefresh–so just probing for any gotcha’s really

@dustinvb can definitely elaborate

are you using the helm based install?

I think at the moment, yes

debating about the release of terraform 0.12 and using all the templating stuff

rather than 2 templating engines.. tiller.. and all that jazz

so i get where you’re coming from - but from what i’ve gleaned the current helm provider is too basic to handle all kinds of helm charts. maybe with 0.12 it’s better off

you’ve seen our helmfiles repo? basically you can’t do half of what we do with helmfile using that provider

https://marketplace.fedramp.gov/#/product/aiware-government?sort=productName
Just to cover my ass that I’m not providing any confidential information.. it’s publicly available that we’re on fedramp ^^^ lol

Oh, my bad. I didn’t mean the helm provider.. I meant generating the k8s.yml files on the fly based on the infra-state.. no helm installation anywhere

just a thought at the moment–not necessarily going that direction for sure

but yea, you could basically create terraform modules in place of helm charts

… if terraform templating is sufficient

hehe, yeah, big “if”

it’s been my experience, the “simple” case always works well regardless of the technology

ah

how have you guys been liking helm? any complaints with the tiller stuff, or you guys are experienced enough with it all–nothing really bugs you?

i mean, it sucks about the tiller and all

but i look at helm more like an interface

and the interface won’t change dramatically, but the underlying implementation is getting a big overhaul as you’re probably aware

as part of that tiller is going away and the template engine going pluggable

“tillerless helm” is the buzz

yea

as a way to manage complex apps it’s great

and app dependencies

i say (and with some humility) that those before us have invested a lot of time in what it takes to manage software releases

deb, rpm, apk, etc.

we tried to avoid that with just a Makefile; it worked well until it didn’t. in the end, we needed all that a package manager provides and conceded to building alpine .apk packages

my point is that just templatizing raw kubernetes resources and applying them seems easy enough and i’m sure you can get away with it for a long time

but then you realize you want to have dependencies, triggers on deployment or uninstall, and rollbacks, etc. then you’re on your own.

the more homegrown/spun, the more the solution diverges from the trajectory the community is taking

because the community is solving problems around a standardized toolset

all true

so i’m curious.. you bring up rollbacks

codefresh/spinnaker’s solutions didn’t offer enough in that aspect?

codefresh relies on the fact that helm does rollbacks automatically

ah

and even bakes that into the UI with one-click rollbacks

they also have some even more cool stuff in the works - but you’ll have to ask them to see it

For sure

we have meetings setup with them

We’ll probe


does all this reveal a well needed niche (product offering) in the CI/CD process for k8s?

since there always ends up being handrolled stuff?

haha, not sure - there are more CI/CD platforms today than ever

https://github.com/gaia-pipeline/gaia I like their philosophy in particular
Build powerful pipelines in any programming language. - gaia-pipeline/gaia

yea

i can’t keep them straight anymore

spinnaker is now coming out with an enterprise offering too

haha nice. Well, after the bloodbath, hopefully the best solution reigns supreme

ah

halyard was surprising when I first played with it

and then github actions

then I looked at the helm chart for spinnaker.. and it was just a bunch of hal commands

yea

but i agree that there’s still big room for improvement

the fact there is so much handrolling and independent tooling

I think codefresh is well poised to do that as it relates to cicd+kubernetes+helm

does your gut think helm isn’t going anywhere?

until I see an alternative that has anywhere near the critical mass of helm, yes - i think it’s here for the foreseeable future

for example, there’s ksonnet (based on jsonnet) which looks interesting

but i think some variation of that could be used as a pluggable engine for helm

also, i don’t want to see proliferation of more packaging systems right now - it’s too early

Helm tiller plugin aka Tillerless Helm. Contribute to rimusz/helm-tiller development by creating an account on GitHub.

have you seen this plugin?

this is pretty smart.

I thiiiink I’ve seen this one.. if not it was something similar

basically, it’s a drop in replacement. it still stores all configs in the cluster (per namespace if you want)

you run a temporary tiller locally

this can be run as part of CI

interesting.. hmm.. nice actually!

(though would break the codefresh helm UI, since it would need to talk to the tiller and there would be none running)

ah, right
2018-11-07

Experimental ksonnet plugin for Helm. Contribute to technosophos/helm-ksonnet development by creating an account on GitHub.

Dig it

In order to provide jsonnet rendering for helm charts a new ReleaseModule similar to the Rudder ReleaseModule should be developed. This module would take charts and render them as Jsonnet templates…

Guess my hopes of seeing ksonnet as a template engine in helm were misguided

I know Lua is coming. I’d heard such great things about jsonnet, that I assumed it would be well suited. But Lua I guess is a better understood embeddable language

Last I had to write Lua was 14 years ago when dealing with Nginx
2018-11-08

For the codefresh peeps out there.. does it matter what/how the ingress controller looks when using codefresh for deployments?

Was reading through: https://docs.traefik.io/user-guide/kubernetes/#traffic-splitting and it came to mind
2018-11-09

Nope, we use for example the Cloudflare Access/Argo ingress and nginx-ingress controller in the same cluster

have you seen this https://www.youtube.com/watch?v=kOa_llowQ1c

I love his presentations and he’s definitely the best evangelist for kubernetes

and i think he’s presenting the simple side that should be presented

and here comes the but…..

but in the real world of deploying complex applications with interdependencies, secrets, configurations, etc… it devolves into something much more complicated

his presentations are always awesome

for sure

and the gap to cross from the hello world examples to customer apps is huge

he makes it look so “easy button”

Comprehensive Distribution of Helmfiles. Works with helmfile.d - cloudposse/helmfiles

PLEASE SOMEONE SHOW ME HOW TO MAKE THIS EASIER

i want to

i hate this

and here’s the rest of all the other apps

Comprehensive Distribution of Helmfiles. Works with helmfile.d - cloudposse/helmfiles

so with one of my customers we are working on two distinct steps… one to build the app, then a separate one to update (deploy) the app in an ongoing fashion

so we’re using helm, and some hate on helm for one reason or another. but one thing’s for sure, this is hiding an even more enormous pile of YAML/go templating on the backend.

i hate kelsey hightower in the best way possible

this is interesting https://aws.amazon.com/blogs/opensource/continuous-delivery-eks-jenkins-x/

Amazon Elastic Container Service for Kubernetes (Amazon EKS) provides a container orchestration platform for building and deploying modern cloud applications using Kubernetes. Jenkins X is built on Kubernetes to provide automated CI/CD for such applications. Together, Amazon EKS and Jenkins X provide a continuous delivery platform that allows developers to focus on their applications. This […]

i did not think it could do so much, it creates pipelines for infrastructure itself (prod and staging), and pipelines for the app, and even spawns a separate testing/staging env in k8s for each PR, and comments on GitHub on PRs (like atlantis), and creates GitHub repos with Helm charts for the infrastructure (prod and staging)

https://github.com/jenkins-x/sso-operator (@Erik Osterman (Cloud Posse) already posted it before)
Single Sign-On Kubernetes operator for Dex identity provider - jenkins-x/sso-operator

one thing it can’t do is to upgrade the k8s cluster b/c it itself sits in the same cluster

wow

@here Any recommendations for learning distributed systems from basics to advanced?

noticed a lot of people know the tools but not the concepts…
2018-11-11

@ramesh.mimit I found this site very interesting and with lots of resources about distributed systems, and real-life examples from many companies





@Andriy Knysh (Cloud Posse) thanks..


“Google Kubernetes Engine’s third consecutive day of service disruption”
2018-11-12

anyone use the official python kube library?

can you load the config from a dict?

~why not use config profiles instead?~

~e.g. AWS_DEFAULT_PROFILE=cp-prod-admin~

~the underlying aws SDK should then handle everything automatically~

the kube config?

heh, my bad @btai


how would i run kubectl within a container running from a job?

here’s an example doing it from a deployment: https://github.com/onfido/k8s-rabbit-pod-autoscaler
Kubernetes autoscaler for pods that consume RabbitMQ - onfido/k8s-rabbit-pod-autoscaler

doing it from a job wouldn’t be any different

just need the proper role bindings

in this case, kubectl is getting called from the autoscale.sh

so if i have the wrong role bindings

would i be getting this error:

The connection to the server localhost:8080 was refused - did you specify the right host or port?

all i know is when we implemented it for redis using the strategy above (for rabbit), we didn’t need to specify any hosts

it just autodiscovers it

the pod autodiscovers

ok thats what i was hoping for

it also provides a kube context

so the pod itself didnt have any kubeconfig or kube api secrets

yea, it didn’t have anything like that
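
A rough sketch of the autodiscovery being described, assuming the standard in-cluster conventions: the pod gets `KUBERNETES_SERVICE_HOST` and `KUBERNETES_SERVICE_PORT` in its environment and a service account token mounted under `/var/run/secrets/kubernetes.io/serviceaccount`. When a client finds neither a kubeconfig nor these, kubectl falls back to `localhost:8080`, which is where the error above comes from. The function `discover_api_server` is a hypothetical helper for illustration, not part of any client library, and it does not explain why a particular cluster would be missing these pieces.

```python
import os
import tempfile

# Default mount point for the pod's service account credentials.
SERVICE_ACCOUNT_DIR = "/var/run/secrets/kubernetes.io/serviceaccount"


def discover_api_server(environ=None, token_path=SERVICE_ACCOUNT_DIR + "/token"):
    """Return the in-cluster API server URL, or None when the pod has no
    in-cluster configuration (missing env vars or missing token file)."""
    environ = os.environ if environ is None else environ
    host = environ.get("KUBERNETES_SERVICE_HOST")
    port = environ.get("KUBERNETES_SERVICE_PORT")
    if host and port and os.path.isfile(token_path):
        return f"https://{host}:{port}"
    return None


# Simulate a pod with the env vars set and a token file mounted
# (the address and the temp file stand in for real values).
with tempfile.NamedTemporaryFile() as fake_token:
    inside_pod = discover_api_server(
        {"KUBERNETES_SERVICE_HOST": "10.0.0.1", "KUBERNETES_SERVICE_PORT": "443"},
        token_path=fake_token.name,
    )

# Simulate a container where nothing was injected or mounted.
outside_pod = discover_api_server({}, token_path="/nonexistent/token")

print(inside_pod)
print(outside_pod or "no in-cluster config; kubectl would try localhost:8080")
```

Running the same check inside the failing job (are the two env vars set, does the token file exist) is a cheap way to tell a discovery problem apart from a permissions problem.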

Contribute to vanvalenlab/kiosk-autoscaler development by creating an account on GitHub.

yeah i have a job basically doing the same thing

executing a shell script that makes a kubectl call

but i get the above error

kops cluster?

aks

ok

we tested it on gke and kops

yeah it works in kops

that job

oh interesting!

you have rbac enabled in kops?

although the kops

doesnt have rbac enabled


haha

yeah

so do i create a clusterrolebinding for the job?

Kubernetes autoscaler for pods that consume RabbitMQ - onfido/k8s-rabbit-pod-autoscaler
2018-11-13

thanks

sorry, still new to k8s

hypothetically if i create a cluster role binding with the namespace and name that matches the job, that should work?

More or less

I don’t know the specific matching selectors that are available
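
On the matching question: as far as I understand RBAC, a binding does not select a job by its name. It grants a Role (or ClusterRole) to subjects such as a ServiceAccount, and the job's pod template opts in through `serviceAccountName`. A minimal sketch that builds a namespaced Role and RoleBinding as plain dicts follows; the namespace, names, and the rule itself are hypothetical placeholders and would need to match whatever the kubectl call in the job actually does.

```python
import json


def role_and_binding(namespace, service_account, role_name):
    """Build a Role and a RoleBinding that grants it to one ServiceAccount."""
    role = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": role_name, "namespace": namespace},
        # Placeholder rule: read-only access to pods in this namespace.
        "rules": [
            {"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "list"]}
        ],
    }
    binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": f"{role_name}-binding", "namespace": namespace},
        # The subject is the ServiceAccount the job's pods run as, not the job.
        "subjects": [
            {"kind": "ServiceAccount", "name": service_account, "namespace": namespace}
        ],
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "Role",
            "name": role_name,
        },
    }
    return role, binding


# Hypothetical names for illustration only.
role, binding = role_and_binding("default", "my-job-sa", "my-job-role")
for manifest in (role, binding):
    print(json.dumps(manifest, indent=2))
```

A ClusterRole and ClusterRoleBinding follow the same shape without the namespace on the binding's metadata, but a namespaced Role is usually the narrower grant for a single job.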

This is neat. Copy secrets from a centralized system of record. https://github.com/mittwald/kubernetes-replicator/
Kubernetes controller for synchronizing secrets & config maps across namespaces - mittwald/kubernetes-replicator
2018-11-14

How are you guys handling busy helm deployments where the tiller is busy attending to other deployments…
Error: could not find a ready tiller pod

@Max Moon @dustinvb

Introduce --replicas option to configure amount of Tiller instances on the cluster. Fixes #2334. The next PR will be about distributed lock, this one is just exterior.

I haven’t run into this scaling issue yet.


@michal.matyjek @Daren have you run into this?

I have not run into this yet either

I have not

not yet

how many deployments are we talking about?

just concurrency

so we’re running helm on every PR synchronization for unlimited staging environments

so we’re getting it

e.g. 2 developers push at around the same time
2018-11-19

hey all - curious what the verdict is on kiam vs kube2iam… it seems like kiam was created to address some issues with kube2iam - is kiam the way to go these days?

@sarkis yea, kube2iam is dead and should not be used. It’s a massive liability to even deploy in an AWS account. If you run more than N hosts (N ~10), you’ll DoS AWS APIs and they rate limit you.

kiam addresses this by having a client/server model. clients run on all nodes (agents), and talk to the server. the server is responsible for fetching the credentials which reduces rate of requests

it also caches
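
To illustrate why a caching server in front of the credential API cuts the request rate, here is a toy sketch. It is not kiam's implementation: the class `CachingCredentialServer`, the 900-second TTL, and the fake upstream function are all assumptions made for the example.

```python
import time


class CachingCredentialServer:
    """Fetch credentials per role once, then serve them from cache until
    the TTL expires, so many agents share one upstream request."""

    def __init__(self, fetch, ttl_seconds=900, clock=time.monotonic):
        self._fetch = fetch          # callable that hits the upstream API
        self._ttl = ttl_seconds      # assumed lifetime of cached credentials
        self._clock = clock
        self._cache = {}             # role -> (expires_at, credentials)

    def get(self, role):
        now = self._clock()
        entry = self._cache.get(role)
        if entry is not None and entry[0] > now:
            return entry[1]          # cache hit: no upstream call
        credentials = self._fetch(role)
        self._cache[role] = (now + self._ttl, credentials)
        return credentials


upstream_calls = []


def fake_assume_role(role):
    """Stand-in for the real upstream credential request; records each call."""
    upstream_calls.append(role)
    return {"role": role, "token": f"token-{len(upstream_calls)}"}


server = CachingCredentialServer(fake_assume_role)

# Ten node agents asking for the same role result in a single upstream call.
results = [server.get("app-role") for _ in range(10)]
print(f"agent requests: {len(results)}, upstream calls: {len(upstream_calls)}")
```

Without the shared server, each of the ten agents would make its own upstream request, which is the per-host load pattern described for kube2iam above.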

I think there’s been some frustration related to the rate of development on Kiam, but the worst bugs are fixed.

Also, I don’t know of any alternatives to kiam and kube2iam for AWS

thanks @Erik Osterman (Cloud Posse)!
2018-11-21

A helm plugin that help manage secrets with Git workflow and store them anywhere - futuresimple/helm-secrets
2018-11-28


You can use a PodPreset object to inject information like secrets, volume mounts, and environment variables etc into pods at creation time.
2018-11-30
