#kubernetes (2020-05)

kubernetes

Archive: https://archive.sweetops.com/kubernetes/

2020-05-01

Zachary Loeber avatar
Zachary Loeber

Passed the CKA, whoop

roth.andy avatar
roth.andy
01:44:08 PM
bradym avatar
bradym
05:25:40 PM
Zachary Loeber avatar
Zachary Loeber

thanks guys

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

that’s awesome! let’s hear more about it on #office-hours next week if you’re around

Zachary Loeber avatar
Zachary Loeber

sure, I missed the last office hours as I was taking the exam then

2020-05-04

AugustasV avatar
AugustasV

I’m configuring ingress-nginx on kubernetes, but the page still can’t see the endpoints. The backend is working, but all I get is this nginx page:

AugustasV avatar
AugustasV

404 Not Found

omerfsen avatar
omerfsen

you always get a 404 if there is a domain name mismatch

omerfsen avatar
omerfsen

though nginx ingress has a default ingress

omerfsen avatar
omerfsen

can you check the pod logs, e.g. using stern?

AugustasV avatar
AugustasV
12:03:21 PM

the thing is that it was working before, I just wanted to reinstall ingress. I can ping and reach services inside cluster

AugustasV avatar
AugustasV

I only see the 404 error, can’t reach other pages. maybe nginx ingress doesn’t use the ingress file or something? logs show nothing

curious deviant avatar
curious deviant

Hello,

Does anyone have experience using Rancher to manage their EKS clusters? What I am specifically looking for is whether there’s a way for me to specify namespaces and resource quotas in a YAML file etc. for my clusters and then feed that into Rancher. I am just getting started with this tool and it looks like everything happens through the UI thus far.
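
For reference, namespaces and quotas can be written as plain Kubernetes manifests. This is only a sketch: the namespace name `team-a` and all quota values are hypothetical, and whether Rancher can ingest such a file directly is not confirmed in this thread. Applying it with `kubectl apply -f` against the cluster is the assumed path.

```yaml
# Hypothetical namespace and quota; all names and numbers are placeholders.
apiVersion: v1
kind: Namespace
metadata:
  name: team-a
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a
spec:
  hard:
    # Caps on the sum of requests/limits across all pods in the namespace.
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    # Maximum number of pods that may exist in the namespace.
    pods: "20"
```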

2020-05-05

omerfsen avatar
omerfsen

Hello, may I ask what you use to authenticate against AWS EKS so you can use kubectl etc.? Is there an alternative to https://docs.aws.amazon.com/eks/latest/userguide/add-user-role.html ?

Managing users or IAM roles for your cluster - Amazon EKS

The aws-auth ConfigMap is applied as part of the guide which provides a complete end-to-end walkthrough from creating an Amazon EKS cluster to deploying a sample Kubernetes application. It is initially created to allow your worker nodes to join your cluster, but you also use this ConfigMap to add RBAC access to IAM users and roles. If you have not launched worker nodes and applied the

omerfsen avatar
omerfsen

for example i want to use AD or SSO or what do you use generally.
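
For context, the `aws-auth` ConfigMap approach from the linked page looks roughly like the sketch below. One common pattern (an assumption here, not something confirmed in this thread) is to have AD or SSO users assume an IAM role and map that role to a Kubernetes group. The account ID and role names are placeholders.

```yaml
# Sketch of the aws-auth ConfigMap. ARNs and role names are hypothetical.
# The first entry is the worker-node mapping, which must be kept so that
# nodes can still join the cluster. The second maps a role that federated
# (SSO/AD) users would assume to cluster-admin rights via system:masters.
apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    - rolearn: arn:aws:iam::111122223333:role/example-node-instance-role
      username: system:node:{{EC2PrivateDNSName}}
      groups:
        - system:bootstrappers
        - system:nodes
    - rolearn: arn:aws:iam::111122223333:role/example-sso-admins
      username: sso-admin
      groups:
        - system:masters
```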

Zachary Loeber avatar
Zachary Loeber

anyone here ever futz around with the Istio ingress deployment before?

roth.andy avatar
roth.andy

Yep, check out https://github.com/RothAndrew/istio-practice/tree/master/eks

uses Gateway, VirtualService, Cert-Manager with LetsEncrypt. Would love any feedback you might have

RothAndrew/istio-practice

Repo to collect the things I do to practice with Istio - RothAndrew/istio-practice

Zachary Loeber avatar
Zachary Loeber

quick question then, if you deploy the operator via istioctl and use the demo profile, the ingress gateway it creates is usable?

Zachary Loeber avatar
Zachary Loeber

or should you create a gateway for your published apps like you have done and have multiple gateways?

roth.andy avatar
roth.andy

yep, for sure

roth.andy avatar
roth.andy

it is usable

roth.andy avatar
roth.andy

Also, each profile is just a set of defaults that get set, that you are free to override. In my deployments I have deployed demo with half a dozen or so overrides, like enabling HTTPS and SDS

Zachary Loeber avatar
Zachary Loeber

so, should there be a gateway per namespace?

roth.andy avatar
roth.andy

It’s really up to you. You can totally get away with just one gateway for the whole cluster. That way you have a standard set of rules, like always redirecting HTTP to HTTPS for example.

The limitation that I have discovered is, that if you are using SDS with cert-manager, then each Gateway gets one-and-only-one Certificate, and the Gateway and Certificate must be in the same namespace as the ingress deployment, which is almost always istio-system

roth.andy avatar
roth.andy

You can assign as many dnsNames as you want in the Certificate resource
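
A minimal sketch of the setup described above: one Certificate with several `dnsNames`, and one cluster-wide Gateway that redirects HTTP to HTTPS, both in `istio-system`. The hostnames, the `letsencrypt-prod` ClusterIssuer, and the resource names are hypothetical. The API versions shown are current ones; releases from the time of this conversation used older alpha versions of the cert-manager API.

```yaml
# One Certificate, many DNS names. cert-manager writes the key pair into
# the Secret named by secretName, in the same namespace as the gateway.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: ingress-cert
  namespace: istio-system
spec:
  secretName: ingress-cert
  dnsNames:
    - app1.example.com
    - app2.example.com
  issuerRef:
    name: letsencrypt-prod   # hypothetical ClusterIssuer
    kind: ClusterIssuer
---
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: cluster-gateway
  namespace: istio-system
spec:
  selector:
    istio: ingressgateway    # label on the default ingress gateway pods
  servers:
    - port:
        number: 80
        name: http
        protocol: HTTP
      hosts:
        - app1.example.com
        - app2.example.com
      tls:
        httpsRedirect: true  # always redirect HTTP to HTTPS
    - port:
        number: 443
        name: https
        protocol: HTTPS
      hosts:
        - app1.example.com
        - app2.example.com
      tls:
        mode: SIMPLE
        credentialName: ingress-cert  # the Secret created by the Certificate
```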

Zachary Loeber avatar
Zachary Loeber

And thanks for this istio guide. I dig it for certain. With minor modifications it works brilliantly with a local kind cluster running metallb and defreitas/dns-proxy-server

roth.andy avatar
roth.andy

Nice. Did you document any of those mods? I’d love to include that as another option. MetalLB looks awesome for LoadBalancers in other clouds such as Hetzner

Zachary Loeber avatar
Zachary Loeber

https://github.com/zloeber/CICDHelper -> I just finished up my first round of making this hacky thing work

zloeber/CICDHelper

Fairly large set of scripts for crafting and working with devops tools - zloeber/CICDHelper

Zachary Loeber avatar
Zachary Loeber

I do perform an istioctl operator deployment with a custom manifest generated from the demo profile to enable the ingress for prometheus, grafana, and kiali but that never seems to work properly from what I can tell.

Zachary Loeber avatar
Zachary Loeber

so I gave up on it and just used your example to craft a quick helmfile to do the bookinfo deployment as an example instead

Zachary Loeber avatar
Zachary Loeber

as always, thanks for the inspiration good sir.

Zachary Loeber avatar
Zachary Loeber

(the istio profile example used was only tested for a kind cluster thus far, need to give k3d a whirl at some point too…)

roth.andy avatar
roth.andy

Zachary Loeber avatar
Zachary Loeber

And MetalLB is pretty nifty for testing, not entirely certain how performant it is but generically it does work

btai avatar

can you increase memory limit on pods w/o triggering a restart?

btai avatar

or daemon set

Zachary Loeber avatar
Zachary Loeber

Not that I’m aware of, @btai. That would be considered vertical pod autoscaling

Zachary Loeber avatar
Zachary Loeber

start your search with that term and let me know if I’m wrong (I’d rather be corrected than make wrong assumptions :))

btai avatar

what does everyone else do when their ingress controller is nearing a mem limit w/o incurring downtime

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

and no HPA?

btai avatar

personally it’s easy for us to provision a new cluster w/ new limits and move traffic to that

btai avatar

but i was wondering for those that don’t do ephemeral clusters

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

is every nginx-ingress pod reaching memory limit?

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

what’s considered “downtime” in this situation? no request dropped?

btai avatar

yeah all my pods are close

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

isn’t it enough to ensure you have a rolling-update strategy in place and then change the limits?

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

that should be ~zero downtime (possibly some dropped connections)

Zachary Loeber avatar
Zachary Loeber

I would almost certainly deploy ingress with autoscaling moving forward
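
A sketch of what autoscaling the ingress controller on memory could look like. It assumes the controller runs as a Deployment named `nginx-ingress-controller` in the `ingress-nginx` namespace (both hypothetical), that metrics-server is installed, and that the pods declare memory requests. `autoscaling/v2` is the current API; clusters from the time of this conversation would have used a beta version.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-ingress-controller
  namespace: ingress-nginx
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-ingress-controller
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: memory
        target:
          # Utilization is measured relative to the pods' memory requests.
          type: Utilization
          averageUtilization: 75
```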

btai avatar

@Erik Osterman (Cloud Posse) you would think, but we have had a lot of connection errors doing that in the past. will need to add some telemetry around it, very possibly not the fault of the ingress controller

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

i don’t doubt there would be some connection errors, but I don’t consider that downtime

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

that’s failover working the way it should

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

you can also throttle the rate of the roll out
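
A sketch of a throttled rolling update for changing the memory limit, assuming the controller is a Deployment (it may be a DaemonSet in some installs). The names, labels, image, and sizes are placeholders.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-ingress-controller
  namespace: ingress-nginx
spec:
  replicas: 3
  # A new pod must stay ready this long before the rollout moves on,
  # which slows the rollout down.
  minReadySeconds: 30
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # add at most one extra pod at a time
      maxUnavailable: 0  # never drop below the desired replica count
  selector:
    matchLabels:
      app: nginx-ingress
  template:
    metadata:
      labels:
        app: nginx-ingress
    spec:
      containers:
        - name: controller
          image: example.invalid/nginx-ingress-controller:placeholder
          resources:
            requests:
              memory: 512Mi
            limits:
              memory: 1Gi  # the raised limit; editing it triggers the rollout
```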

2020-05-06

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)
Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

EKS users:

bradym avatar

I’m curious how others handle deploying things like an nginx config for a reverse proxy to a third-party endpoint for something like email click tracking or subdomain redirects. Do you just add it to the ingress-nginx config? Deploy a separate nginx pod with its own config file? Something else? Any input?

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

What are your concerns? … e.g. why would email click tracking need to be handled differently.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

e.g. architecturally, i’ve seen this implemented with lambdas + kinesis, but that might not be an optimization most need

bradym avatar

These are configs for things that aren’t tied to a specific app of ours, so it’s just a little unclear where to put them.

bradym avatar

With that in mind I’m just thinking about what would be easiest to maintain and add new stuff to as needed.

2020-05-07

wannafly37 avatar
wannafly37

Does anyone have input or a link to what could be a checklist of tasks for setting up a basic EKS/k8s cluster? I’m generally bad at abstracting ideas away until I’m knee deep in them, I can only think of general tasks like 1) build nodes 2) user auth 3) logging 4) metrics collection

wannafly37 avatar
wannafly37

^^ This is with the presumption the vpc/infra is all setup

Zachary Loeber avatar
Zachary Loeber

steps: 1.) Build EKS cluster, 2.) ???, 3.) Profit!

wannafly37 avatar
wannafly37

^^ Pretty much hah

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

assuming that you provisioned not only EKS cluster, but all the IAM roles and SSO stuff for humans (to be able to access the cluster) and for apps (EKS service account IAM roles), then we usually deploy these k8s releases:

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)
- `external-dns`
- `nginx-ingress`
- `cert-manager`
- `reloader`
- `metrics-server`
- `kubernetes-dashboard`
- `efs-provisioner`
- `aws-secret-operator`
- `prometheus` stuff

timduhenchanter avatar
timduhenchanter

Anyone have experience exposing status.hostIP to a third-party application to hit a DaemonSet that does not natively support referencing environment variables with Downward API (use-case: Datadog Agent and Kong)? https://github.com/kubernetes/kubernetes/issues/74265 asked for this to be revisited but in the meantime…

Support Downward API for HostAliases · Issue #74265 · kubernetes/kubernetes

What would you like to be added Currently, HostAliases provide a way to inject entries into /etc/hosts files inside Pods. While this is useful for previously-known static IPs, there are times when …

joey avatar

like

          env:
            - name: DD_AGENT_HOST
              valueFrom:
                fieldRef:
                  fieldPath: status.hostIP

in your container env?

timduhenchanter avatar
timduhenchanter

Right but Kong does not support referencing the Downward API (that declaration) so looking for an alternative solution outside of forking the supported Helm Chart for DD agent or the codebase for Kong

joey avatar

ah dang, sorry, i am of no immediate use here

timduhenchanter avatar
timduhenchanter

no worries, thanks for helping!

2020-05-08

Zachary Loeber avatar
Zachary Loeber

A new blog I dropped recently about screwing around with Istio on k3d and kind: https://zacharyloeber.com/2020/05/the-istio-rabbithole/

The Istio Rabbithole - Zachary Loeber’s Personal Site

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

@roth.andy

roth.andy avatar
roth.andy

Zachary Loeber avatar
Zachary Loeber

@roth.andy was the inspiration for that post


2020-05-09

Joey avatar

Hey guys, devops noob here…

I followed this tutorial to dockerize a react app https://mherman.org/blog/dockerizing-a-react-app/ and it turns out that -it (interactive mode) is required for docker run commands

Now I’m trying to follow this tutorial to deploy via Kubernetes dashboard https://www.youtube.com/watch?time_continue=279&v=je5WRKxOkWQ My app keeps crashing with the same error that occurs if I don’t add -it to docker run

So does anyone here know how I could add -it when deploying via Kubernetes dashboard?

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

is the app a server app or browser app?

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

if a Docker container starts and the process exits right away, you would see the same behavior you described

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

for example, for a server app, you start an HTTP listener which prevents the app from exiting

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

-it mode starts an interactive session with the container, so even if the app exits, the container is being kept alive

Joey avatar

I believe it’s a server app @Andriy Knysh (Cloud Posse)

Joey avatar

I’ll try to find the error hold on

Joey avatar
> [email protected] start /app
> react-scripts start

ℹ 「wds」: Project is running at <http://172.17.0.2/>
ℹ 「wds」: webpack output is served from 
ℹ 「wds」: Content not from webpack is served from /app/public
ℹ 「wds」: 404s will fallback to /
Starting the development server...
Joey avatar

Then the app never launches, vs if I do -it in the docker run command it launches the app accordingly.

Joey avatar

but I can’t find a way to specify -it when running the pod through the kubernetes dash board @Andriy Knysh (Cloud Posse)

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

you don’t need to specify that

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

something is wrong with the app when it runs in a container

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

a Node app should work in a container, and you should be able to access it from your local computer via the port binding (should be able to open a browser on the host port and see the app)

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

did you test that?

Joey avatar

Yeah

Joey avatar

It’s literally just the default create-react-app app

Joey avatar

npm start opens it in localhost:3000

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

did you test it in a docker container on your local computer?

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

for example, using Docker compose like this:

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)
version: '3.1'

services:
  "app":
    image: app
    build: .
    expose:
      - "3000"
    ports:
      - "3000:3000"
    volumes:
      - "./:/usr/src/app"
Joey avatar

yeah I have a docker compose file taken from the tutorial in my first link

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

and when you run the composition, you can see your app on localhost:3000 (or 3001 in your case)?

Joey avatar
docker-compose up -d --build
Joey avatar

works fine for me

Joey avatar

I’ll see what port hold on

Joey avatar

or come to think of it, it just builds another image

Joey avatar

which has the same effect of only working when docker run has the -it command after

Joey avatar

my docker compose file:

version: '3.7'
  
services:

  sample:
    container_name: sample
    build:
      context: .
      dockerfile: Dockerfile
    volumes:
      - '.:/app'
      - '/app/node_modules'
    ports:
      - 3001:3000
    environment:
      - CHOKIDAR_USEPOLLING=true
    stdin_open: true
    tty: true
Joey avatar

@Andriy Knysh (Cloud Posse)

Josephs-MacBook-Air:web josephbennett$ docker-compose up -d --build
Building sample
Step 1/9 : FROM node:13.12.0-alpine
 ---> 483343d6c5f5
Step 2/9 : WORKDIR /app
 ---> Using cache
 ---> 961768ca865e
Step 3/9 : ENV PATH /app/node_modules/.bin:$PATH
 ---> Using cache
 ---> 8c67f044ee11
Step 4/9 : COPY package.json ./
 ---> Using cache
 ---> 38c4fc32e5b5
Step 5/9 : COPY package-lock.json ./
 ---> Using cache
 ---> 1e03505a795d
Step 6/9 : RUN npm install --silent
 ---> Using cache
 ---> 4c69c439a90f
Step 7/9 : RUN npm install [email protected] -g --silent
 ---> Using cache
 ---> 8de2388a9ba5
Step 8/9 : COPY . ./
 ---> Using cache
 ---> 2cc9fce978f4
Step 9/9 : CMD ["npm", "start"]
 ---> Using cache
 ---> 1b726fc6c6ce

Successfully built 1b726fc6c6ce
Successfully tagged web_sample:latest
my-react-app is up-to-date
Josephs-MacBook-Air:web josephbennett$ docker run web_sample:latest

> [email protected] start /app
> react-scripts start

ℹ 「wds」: Project is running at <http://172.17.0.2/>
ℹ 「wds」: webpack output is served from 
ℹ 「wds」: Content not from webpack is served from /app/public
ℹ 「wds」: 404s will fallback to /
Starting the development server...

Josephs-MacBook-Air:web josephbennett$ docker run -it web_sample:latest
Joey avatar

^ it works only after that last line

Joey avatar

& that’s what I can’t seem to replicate in the Kubernetes console, just figuring out how to add the -it

Joey avatar

thanks for the help btw

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)
Get started with Docker Compose

On this page you build a simple Python web application running on Docker Compose. The application uses the Flask framework and maintains a hit counter in Redis. While the sample…

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

in your example, it just builds the image and exits

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

(that’s why you have to run docker run after)

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

so something is wrong with your app or docker compose

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

try to remove

stdin_open: true
tty: true
Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

and run again

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

you should be able to run docker-compose up (it should build the image and start the container, with the -d option it should start the container and exit, but you should be able to see the container running) and then open a browser and see your app running on localhost:3001

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

only after that you need to think about deploying to Kubernetes (it has nothing to do with docker -it arguments)

Joey avatar

oohhh gotcha, thanks

Joey avatar

so I guess that there’s something up with create-react-app’s default application that doesn’t allow it to be dockerized

maarten avatar
maarten

@Joey The documentation you posted https://mherman.org/blog/dockerizing-a-react-app/ explains at “What’s happening here?” #2

-it starts the container in interactive mode. Why is this necessary? As of version 3.4.1, react-scripts exits after start-up (unless CI mode is specified) which will cause the container to exit. Thus the need for interactive mode.

You will need to enable the ci mode and then it will work without -it, you can do that by setting the env variable CI to true.

Dockerizing a React App

Let’s look at how to Dockerize a React app.

Joey avatar

Thanks! I’ll try it when I get the chance. Sounds promising

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

thanks @maarten for finding the reason why the app exits after start

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

@Joey you should put the app in the CI mode, then run docker-compose up -d

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

and then see the app running in the browser at localhost:3001

Joey avatar

@Andriy Knysh (Cloud Posse) @maarten it worked!! Thank you guys so much!

Joey avatar

simply just adding CI=true before npm start
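
For the Kubernetes side of the original question, the same fix can be sketched as a Deployment that sets `CI` in the container environment. The resource names are hypothetical; the image name is the one from the build log above and would need to be pullable by the cluster. The closest Kubernetes equivalent of `docker run -it` is the pair of container fields `stdin` and `tty`, shown commented out as an alternative.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sample
  template:
    metadata:
      labels:
        app: sample
    spec:
      containers:
        - name: sample
          image: web_sample:latest
          ports:
            - containerPort: 3000
          env:
            # Keeps react-scripts from exiting after start-up.
            # Env values must be strings, hence the quotes.
            - name: CI
              value: "true"
          # Alternative: the equivalent of `docker run -it`.
          # stdin: true
          # tty: true
```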

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

nice catch @maarten!

2020-05-10

2020-05-11

Joey avatar

Hey guys, do you know if it’s possible to specify the DNS whenever you create a new service in Kubernetes? If not, then what’s a stable way to stay routed if your service fails?

e.g. let’s say we create a service with the DNS http://abc123-456us-east-2.elb.amazonaws.com:3000, and we have our URL www.mysite.com pointing to it… Then we create a backup service with the DNS http://def456-789us-east-2.elb.amazonaws.com:3000. Now we have to take the time to go on godaddy or wherever, to direct www.mysite.com to point to our backup DNS…

Sorry if this is a super noob question, but is there something that I can do in case our service fails? Or should I just assume that the service will live on?

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

use the external-dns controller with route53

Chris Fowles avatar
Chris Fowles

move your dns to something like route53 for a start

Chris Fowles avatar
Chris Fowles

then you can use something like https://github.com/kubernetes-sigs/external-dns

kubernetes-sigs/external-dns

Configure external DNS servers (AWS Route53, Google CloudDNS and others) for Kubernetes Ingresses and Services - kubernetes-sigs/external-dns
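
A minimal sketch of how this fits together, assuming external-dns is already running in the cluster with access to a Route 53 hosted zone for `mysite.com`. The Service name and selector are hypothetical. external-dns watches for the hostname annotation and keeps the DNS record pointed at whatever load balancer the Service currently has.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web
  annotations:
    # external-dns creates and updates this record in the hosted zone.
    external-dns.alpha.kubernetes.io/hostname: www.mysite.com
spec:
  type: LoadBalancer
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 3000
```

If the Service is recreated and gets a new ELB hostname, the record should follow it without a manual change at the registrar.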

joey avatar

and then you can do even fancier stuff like using weighted records and associating health checks with CNAMEs, because you probably want www.mysite.com to CNAME to something like primary.cluster.mysite.com with weight 100 and backup.cluster.mysite.com with some other weight, each with its own independent health check. and maybe you want to delegate only cluster.mysite.com to route 53 instead of your whole domain
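
The weighted-record idea can be sketched as a CloudFormation template. This is one possible way to express it, not necessarily what was meant here. It assumes the `mysite.com` zone is hosted in Route 53; the `/healthz` path, the backup weight of 10, and the logical resource names are hypothetical.

```yaml
AWSTemplateFormatVersion: "2010-09-09"
Resources:
  PrimaryHealthCheck:
    Type: AWS::Route53::HealthCheck
    Properties:
      HealthCheckConfig:
        Type: HTTPS
        FullyQualifiedDomainName: primary.cluster.mysite.com
        Port: 443
        ResourcePath: /healthz
  BackupHealthCheck:
    Type: AWS::Route53::HealthCheck
    Properties:
      HealthCheckConfig:
        Type: HTTPS
        FullyQualifiedDomainName: backup.cluster.mysite.com
        Port: 443
        ResourcePath: /healthz
  # Two records with the same name and type, told apart by SetIdentifier.
  PrimaryRecord:
    Type: AWS::Route53::RecordSet
    Properties:
      HostedZoneName: mysite.com.
      Name: www.mysite.com.
      Type: CNAME
      TTL: "60"
      SetIdentifier: primary
      Weight: 100
      HealthCheckId: !Ref PrimaryHealthCheck
      ResourceRecords:
        - primary.cluster.mysite.com
  BackupRecord:
    Type: AWS::Route53::RecordSet
    Properties:
      HostedZoneName: mysite.com.
      Name: www.mysite.com.
      Type: CNAME
      TTL: "60"
      SetIdentifier: backup
      Weight: 10
      HealthCheckId: !Ref BackupHealthCheck
      ResourceRecords:
        - backup.cluster.mysite.com
```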

joey avatar

i probably wouldn’t want to point my mysite.com cnames directly to my clb/alb/nlb’s because they’re probably going to get clobbered at some point and the beauty of alb-ingress-controller and external-dns is that’ll all just magically work

Joey avatar

thanks for all the suggestions guys, I’ll do the fancy stuff later. I’m googling what a route 53 is and how to move my DNS to it thanks!

joey avatar

just be sure to do all your testing in production (LOTS OF SARCASM HERE)

Chris Fowles avatar
Chris Fowles

i’m having trouble keeping my joey’s straight here

Joey avatar

You can just call me deadpool


2020-05-12

Pierre Humberdroz avatar
Pierre Humberdroz
spotahome/service-level-operator

Manage application’s SLI and SLO’s easily with the application lifecycle inside a Kubernetes cluster - spotahome/service-level-operator

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

This is a cool project. We’ve deployed it and it works well.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

We didn’t end up doing much with it yet, but have a PR open for it here: https://github.com/cloudposse/helmfiles/pull/186

service-level-operator by maximmi · Pull Request #186 · cloudposse/helmfiles

what [service-level-operator] Helmfile added why Helmfile for service-level-operator chart

dalekurt avatar
dalekurt

Has anyone done feature branch deployments within AWS EKS? I’m planning on doing this, but I see a potential issue with AWS ALB route limits.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

yes, we do this on EKS, but we still use nginx-ingress

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

do you need to use the ALB with other AWS services?

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

If not, why not just use nginx-ingress with NLBs

dalekurt avatar
dalekurt

Use NLB with random ports?

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

to use NLB with nginx-ingress, you just need to set 1 annotation. It’s a very quick win.
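
The annotation in question is `service.beta.kubernetes.io/aws-load-balancer-type: nlb` on the controller's Service. A sketch, with hypothetical names and labels, and assuming the controller container exposes ports named `http` and `https`:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-ingress-controller
  namespace: ingress-nginx
  annotations:
    # Ask the AWS cloud provider for a Network Load Balancer
    # instead of a Classic ELB.
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
spec:
  type: LoadBalancer
  selector:
    app: nginx-ingress
  ports:
    - name: http
      port: 80
      targetPort: http
    - name: https
      port: 443
      targetPort: https
```

All feature-branch hostnames then share this one load balancer, and the routing happens in Ingress rules, so ALB rule limits do not come into play.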

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

No need to worry about random ports. Or are you talking about some application requirement you have to use random ports?

dalekurt avatar
dalekurt

I will have a deep look into that option.

Pierre Humberdroz avatar
Pierre Humberdroz

you could use nginx-ingress as a reverse proxy to your application ports.

AugustasV avatar
AugustasV

Want to create nodegroup via eksctl create nodegroup command, got error

Error: timed out (after 25m0s) waiting for at least 1 nodes to join the cluster and become ready in "standard-workers"

any ideas? It didn’t appear in the AWS EKS panel.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)
awslabs/amazon-eks-serverless-drainer

Amazon EKS node drainer with AWS Lambda. Contribute to awslabs/amazon-eks-serverless-drainer development by creating an account on GitHub.

Joey avatar

Does anyone here know of an easy way for me to pull from a private docker repo, via dashboard?

Joey avatar

I keep getting permission denied and idk how to log in to docker on my kubernetes dashboard

2020-05-13

Joey avatar

nvm figured it out! kubectl create secret docker-registry regcred --docker-server=docker.io/<user>/<image> --docker-username=<your-username> --docker-password=<your-pword> --docker-email=<your-email> then in the deployment’s .yaml:

      imagePullSecrets:
        - name: regcred
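
For placement, `imagePullSecrets` sits in the pod spec, next to `containers`. A sketch with a hypothetical Deployment name and image:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: private-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: private-app
  template:
    metadata:
      labels:
        app: private-app
    spec:
      # Must name a docker-registry Secret in the same namespace as the pod.
      imagePullSecrets:
        - name: regcred
      containers:
        - name: private-app
          image: docker.io/example-user/example-image:latest
```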
Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)
CDK for Kubernetes

Define Kubernetes apps and components using familiar languages

btai avatar

the vicious cycle of switching from writing code to configuration language back to writing code

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

ya, seriously

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

jenkins feels like an interesting case study related to this:

  1. first there were freestyle jobs that everyone created with clickops
  2. then came the Jenkinsfile littered with imperative groovy script
  3. people loved it. wrote “pipelines as applications” so complicated that no one knew how they worked and they couldn’t be tested. people revolted.
  4. then jenkins came out with declarative pipelines. people loved it. they converted all their groovy pipelines to the declarative style.
  5. then came gocd, argocd, flux, spinnaker, etc. everyone started to hate the declarative jenkins pipelines.
  6. yadda yadda yadda
Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

in infrastructure as code, first we had things like boto (python) and wrote infrastructure as code in a pure language. that became untenable. so we all gathered around terraform and cloudformation based on the lessons learned. then that became limiting. so pulumi came out to show that what we really need is to return to pure programming of infrastructure as code. aws followed suit with CDK.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

everything comes full circle.

Joey avatar

Hey guys, I’m trying to make my web page only viewable by me. So I’m modifying the ELB’s security group…

This works:

Type - All Traffic
Protocol - All
Port range - All
Destination - 0.0.0.0/0

This doesn’t work:

Type - All Traffic
Protocol - All
Port range - All
Destination - <my ip address>/32

& I don’t know why. This ELB was generated from Kubernetes, so is there anything that I can do in that application?

joey avatar

i suspect you’re probably more interested in the ‘source’ of the traffic rather than the destination

Joey avatar

@joey the inbound rules don’t seem to be making a difference

Joey avatar

I can still access the page after I delete them

joey avatar

you can access the page from something other than your ip?

Joey avatar

actually… it’s buffering now I think there was just a delay

Joey avatar

oh man @joeys for the win!!!

Joey avatar

yeah it definitely makes a difference changing the inbound

Joey avatar

working on my computer but not my phone now (as expected)

Joey avatar

Do you know if there’s a way to add other people’s IP addresses?

Joey avatar

Or would they just have to log into the console themselves?

joey avatar

destination is for your server connecting to other stuff, e.g. twitter api or onlyfans.com api

joey avatar

source is for anything coming IN to your server

Joey avatar

gotcha

joey avatar

once a source-based tcp connection is established, your server can return data to it

Joey avatar

makes sense

joey avatar

to add other people.. well.. depends on how creative you want to get

Joey avatar

on second thought I think it’s totally doable just to type it in manually

Joey avatar

then add /32

Joey avatar

about to try with my phone

joey avatar

yes

joey avatar

easiest way is just get their ip and add the /32 (cidr for ipv4 single ip address) for their ip’s yourself
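
Since the ELB was generated from Kubernetes, the same restriction can also be declared on the Service, so it is not lost if the load balancer is recreated. A sketch, assuming a `LoadBalancer` Service on AWS; the Service name, selector, and IP addresses (documentation ranges) are placeholders.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: LoadBalancer
  # Only these source CIDRs may reach the load balancer. On AWS the cloud
  # provider turns them into inbound rules on the ELB security group.
  loadBalancerSourceRanges:
    - 203.0.113.10/32   # your own IP
    - 198.51.100.25/32  # a colleague's IP
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 3000
```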

Joey avatar

boom! It works!!!

joey avatar

beyond that, google has your answers

Joey avatar

Thanks man! I really appreciate it

Joey avatar

alright

joey avatar

you could get all sorts of creative and create an api with lambda that has access to update the security group when some criteria is met on your page that someone updates or something crazy stupid like that

Joey avatar

good stuff to think about for the future

Joey avatar

my project is incredibly rushed though lol

joey avatar

it really depends on what you’re trying to do, and thus no one here can necessarily tell you what you need to do in those cases

2020-05-14

Karoline Pauls avatar
Karoline Pauls

I’m starting to think of moving my existing Helm charts to Terraform. Reasons:

  • Helm is badly designed, e.g. if an initial deployment fails, none after will succeed, while Terraform for starters knows its job is to apply locally described desired state to a remote stateful API.
  • It’s easier to onboard people if they only need one tool.
  • Terraform displays diffs of what it wants to do before doing that.
  • Terraform lets you inspect and manage its state.
  • Terraform does not force us to template with the worst templating language I’ve ever seen, gotmpl. Resources can be defined in its language.
  • Terraform has escape hatches, like provisioners.
  • Terraform has modules, which work a bit like very cumbersome macros, while gotmpl has subtemplates, which work like macros with broken parameters/global environment (pick 1).
  • Terraform can create Helm releases better than Helm can do (Helm cannot pass a templated value to a sub-chart that doesn’t expect that value to be templated)
  • Terraform can use remote state to source values from different state.
  • I already render some values files with Terraform, then copy them to the helm charts repo. Those files contain multiple copies of the same value in many places because Helm Cannot Template Values.

My idea for deploying temporary releases from the CI is to use independent S3 objects per state.

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)
roboll/helmfile

Deploy Kubernetes Helm Charts. Contribute to roboll/helmfile development by creating an account on GitHub.

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

it solves a lot of issues from your list (but not all, and not “with the worst templating language I’ve ever seen”)
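
A sketch of how helmfile could address the "same value in many places" point: one environment value reused across releases. The chart paths, release names, and the `domain` value are hypothetical, and the template syntax is still gotmpl.

```yaml
# helmfile.yaml (sketch)
environments:
  default:
    values:
      - domain: example.com
---
# The document separator lets helmfile load the environment values
# before it renders the templates in the releases below.
releases:
  - name: web
    namespace: web
    chart: ./charts/web
    values:
      - ingress:
          host: web.{{ .Environment.Values.domain }}
  - name: api
    namespace: api
    chart: ./charts/api
    values:
      - ingress:
          host: api.{{ .Environment.Values.domain }}
```

With the helm-diff plugin installed, `helmfile diff` previews the changes before `helmfile apply` makes them.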

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

but I agree with your points that having just one tool (Terraform, and it’s a very good tool) offers many advantages

Karoline Pauls avatar
Karoline Pauls

i kind of expect that in the best case it will go from “i had 8 problems with 1 tool” to “i have 5 problems with 2 tools”

Zachary Loeber avatar
Zachary Loeber

I’ve actually deployed full team projects using pure terraform before

Zachary Loeber avatar
Zachary Loeber

Honestly, there was a certain elegance to using metadata for implicit dependency chaining that was hard to deny.

Zachary Loeber avatar
Zachary Loeber

But the devs hated HCL and almost immediately demanded some additional simplistic yaml formatting/pipelines for their dev environment to deploy configmap changes from

Zachary Loeber avatar
Zachary Loeber

My use case was to deploy the cluster with workload in a single pipeline to facilitate an ephemeral cluster deployment

Zachary Loeber avatar
Zachary Loeber

almost everything I read about doing such a thing told me that the cluster deployment and workload deployment to the cluster should be separate states (I ignored all the warnings and it worked fine for me though…)

Zachary Loeber avatar
Zachary Loeber

Where I found that helm excelled was the post-deployment tests I was able to run. Both tf and helm can deploy a workload that never actually works and still be considered successful. I suppose that post-deployment e2e testing could easily be done any number of other ways. I know that it is not baked into terraform the way it is into a purpose-built Kubernetes tool.

Zachary Loeber avatar
Zachary Loeber

I’m having a hard time understanding your first statement though. ‘…if an initial deployment fails, none after will succeed..’

Zachary Loeber avatar
Zachary Loeber

usually that is what you would want in your pipeline right?

Zachary Loeber avatar
Zachary Loeber

perhaps you are overcomplicating your charts?

Zachary Loeber avatar
Zachary Loeber

for instance, I almost never use subcharts to reduce overall complexity and downstream issues.

Zachary Loeber avatar
Zachary Loeber

helm also can do diffs

Zachary Loeber avatar
Zachary Loeber

and it stores state in secrets within kubernetes then does three way merging. I don’t think that terraform remote state and helm three-way merging is a like-for-like comparison. Remote state is offset by the fact that it is an additional outside dependency

Stratos avatar
Stratos

I’ve already shared this in another workspace but: My biggest pain-point with the TF K8s provider is dealing with any project with CRDs (Istio/Spinnaker et al). To work around that I had to use the Helm TF provider which is excellent! I am still struggling with tools that come packed with their own CLIs and I hope the community shifts to embrace operators ASAP

Zachary Loeber avatar
Zachary Loeber

Excellent point, any CRDs inherently are not supported in the terraform kubernetes provider. You only get common kubernetes resources to work with.

Karoline Pauls avatar
Karoline Pauls


I’m having a hard time understanding your first statement though. ‘…if an initial deployment fails, none after will succeed..’
The reality looks like:

  1. A developer presses the deploy button in the CI to deploy their branch
  2. Release fails but the app runs
  3. They don’t look and keep deploying when needed
  4. Nothing happens
Karoline Pauls avatar
Karoline Pauls


But the devs hated HCL and almost immediately demanded some additional simplistic yaml formatting/pipelines for their dev environment to deploy configmap changes from
I also hate HCL but to prefer YAML over HCL is some serious stockholm syndrome.

Zachary Loeber avatar
Zachary Loeber

If you add a helm test to your chart and subsequent deployment pipeline code you should be able to capture some of the false positives (if I read things right)

Zachary Loeber avatar
Zachary Loeber

I ran into that issue at least. Deployments worked just fine but there were actual underlying container or service issues so no one was the wiser when some component wouldn’t be working. The helm test can be a simple pod that wget’s the service endpoint (or similar)
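
A sketch of the kind of check such a test pod could run in place of `wget`: request the service endpoint and treat anything other than a 2xx answer as a failure. The function name is hypothetical and the URL a caller passes in would be whatever in-cluster address the chart exposes.

```python
import urllib.error
import urllib.request


def service_is_healthy(url: str, timeout: float = 5.0) -> bool:
    """Return True if the endpoint answers with a 2xx status, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Covers refused connections, DNS failures, timeouts and 4xx/5xx replies
        # (urlopen raises HTTPError, a URLError subclass, for those).
        return False
```

Inside a test pod the script would exit non-zero when this returns `False`, which is what marks the test, and therefore the pipeline step, as failed.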

Karoline Pauls avatar
Karoline Pauls

helm tests are an option but i may as well turn them into monitors once i have them

Zachary Loeber avatar
Zachary Loeber

I’d say both are important

Karoline Pauls avatar
Karoline Pauls

my idea is that if someone goes so far as to write a helm test (i had one.. once), it can as well be a monitor cronjob running every minute, sending metrics

Zachary Loeber avatar
Zachary Loeber

I’d put the test directly in a pipeline for validation of deployment success and if a cronjob were to be used as the monitoring solution (not what I’d personally recommend) I’d make that part of the chart itself.

Zachary Loeber avatar
Zachary Loeber

otherwise, including a prometheus rule as part of the chart to trigger alerts would be more holistic I’d think.

Zachary Loeber avatar
Zachary Loeber

@Karoline Pauls, You always spark super interesting conversations btw

Karoline Pauls avatar
Karoline Pauls

Do I? I thought I was just one person pissed at Jenkins.

Zachary Loeber avatar
Zachary Loeber

what better reason to innovate better solutions than being riled up at the current ones?

Karoline Pauls avatar
Karoline Pauls

regarding the cronjob, in my Salt days I would sometimes write custom Datadog healthchecks (in Python) that would run on instances, hence the idea to have an analogue on kubernetes.

Karoline Pauls avatar
Karoline Pauls

The issue is, as I’m trying to get Jenkins to do what I need (currently a manual deploy step with a toggle for online/offline migrations, dev/prod, etc), it is burning me out. My external ALB ingress controller experiment saw no commits in a month.

Zachary Loeber avatar
Zachary Loeber

I never got to actually deploy custom healthchecks in datadog (it seems to require a local agent which I didn’t have available to me in the last environment I worked in). I used terraform to set up metrics-based alerts instead though. There are also some datadog integration projects out there for auto-creating alerts for kube-based deployments, I think (https://github.com/fairwindsops/astro)

FairwindsOps/astro

Emit Datadog monitors based on Kubernetes state. Contribute to FairwindsOps/astro development by creating an account on GitHub.

Zachary Loeber avatar
Zachary Loeber

not what you are likely looking for though, sorry

Karoline Pauls avatar
Karoline Pauls

Does anyone else find immutable pod templates in a statefulset a pain? (i know now of --cascade=false)

2020-05-15

Neil Gealy avatar
Neil Gealy

Anybody have experience with pods that are crashing due to heap out of memory, and had to try to capture the core dump?

Neil Gealy avatar
Neil Gealy

I was thinking to change the location of the core dump to write to a persistent volume, but running into an issue because /proc/sys/kernel/core_pattern is read-only and I don’t know why. I’m running on EKS.

Neil Gealy avatar
Neil Gealy

Locally I can edit that file only if i run docker with the “--privileged” flag, so my question is how do I do the equivalent with deployments/pods in EKS
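
For reference, the Kubernetes counterpart of `docker run --privileged` is the container-level `securityContext.privileged` field. The sketch below builds such a Pod manifest as JSON (which `kubectl apply -f` accepts); the pod name and image are placeholders, and whether the cluster's policies admit privileged pods is an assumption to verify.

```python
import json


def privileged_pod(name: str, image: str) -> dict:
    """Build a Pod manifest whose single container runs in privileged mode."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # Kubernetes counterpart of `docker run --privileged`.
                    "securityContext": {"privileged": True},
                }
            ]
        },
    }


# Placeholder name and image, chosen only for the demonstration.
print(json.dumps(privileged_pod("core-dump-debug", "busybox"), indent=2))
```

In a Deployment the same field sits under `spec.template.spec.containers[].securityContext`.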

2020-05-17

s2504s avatar

Hi all! What would you advise for rotating logs that are stored in a hostPath directory? After a pod is terminated, its logs remain on the node and are removed only when the node is terminated. Currently we are not able to change the log streams to stdout, but we are working on it.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Kubernetes handles the log rotation automatically

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

There are a few tunable parameters for this

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Note, that containers started outside of kubernetes (e.g. docker in docker with jenkins) are not automatically log rotated

s2504s avatar

Thanks. Yes, that is correct for containers that can redirect their logs to stdout/stderr. But this application cannot send logs to stdout/stderr; it stores them on the container filesystem, writing to the files /var/log/php/application_errors.log and /var/log/php/application_events.log. So I’ve mounted a hostPath volume from the node into each container at /var/log/php. Now I can scrape the logs using FluentD, which is deployed to my k8s cluster as a DaemonSet. Everything is OK, but the files grow every day and should be rotated. That is what I was asking about. Legacy applications are a pain

bradym avatar
How To Manage Logfiles with Logrotate on Ubuntu 16.04 | DigitalOcean

Logrotate is a system utility that manages the automatic rotation and compression of log files. If log files were not rotated, compressed, and periodically pruned, they would eventually consume all available disk space on a system. In this article, we

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

But you told the application to log to /var/log/php/application_errors.log, right? In that case you can tell it to log to /dev/stdout or /dev/stderr

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

that way it will play nicely with the logging platform

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

you can also ln -sf /dev/stderr /var/log/php/application_errors.log

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

inside the Dockerfile
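
A small sketch of the symlink trick that also shows the argument order, which is easy to get backwards: the stream is the target and the log file is the link name. The helper name is made up, and the demo runs in a scratch directory rather than the real /var/log/php.

```python
import os
import tempfile


def link_log_to_stream(log_path: str, stream: str = "/dev/stderr") -> None:
    """Replace log_path with a symlink that points at stream."""
    os.makedirs(os.path.dirname(log_path), exist_ok=True)
    if os.path.lexists(log_path):
        os.remove(log_path)
    # os.symlink(target, link_name): same order as `ln -s TARGET LINK_NAME`.
    os.symlink(stream, log_path)


# Demonstrate in a scratch directory instead of the real log directory.
with tempfile.TemporaryDirectory() as scratch:
    log_file = os.path.join(scratch, "php", "application_errors.log")
    link_log_to_stream(log_file)
    print(os.readlink(log_file))
```

Whether the application's user is allowed to open /dev/stderr depends on the image and how the process is started, so that part is worth testing in the actual container.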

bradym avatar

I’ve done that exact thing before and it works well. Completely forgot about it earlier.

Chris Fowles avatar
Chris Fowles

man 90% of keeping up with modern tech is remembering what to forget and trying not to forget what to remember

s2504s avatar

Great! Thank you, guys! Will try it today and get back with additional questions

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)
10 most common mistakes using kubernetes

We had the chance to see quite a bit of clusters in our years of experience with kubernetes (both managed and unmanaged - on GCP, AWS and Azure), and we see some mistakes being repeated. No shame in that, we’ve done most of these too! I’ll try to show the ones we see very often and talk a bit about how to fix them.

Zachary Loeber avatar
Zachary Loeber

Excellent list, highly recommended to run through this if you are just getting into kubernetes for your workloads.

2020-05-18

adefemi171 avatar
adefemi171

Hi all, does anyone know what could leave a pod stuck in the PodInitializing state even after increasing storage, CPU, and memory?

Stratos avatar
Stratos

Have a look at the events when you describe the pod, in a case I’ve dealt with in the past it was that it couldn’t allocate an IP

2020-05-19

sahil kamboj avatar
sahil kamboj

Hey guys, I need help with load testing. I have websites running on K8s that are SQL-heavy, and I need to test how they behave under heavy user load and how my cluster handles it (whether I need to upgrade nodes or not). How should I implement this, and what tools should I use?

joey avatar

you should write a script to send your cluster a bunch of traffic or use something like ab or gatling or vegeta or any number of load testing tools
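
The "write a script" option could look roughly like this: a thread pool fires GET requests at one URL and reports failures and latency. It is only a sketch using the standard library; the function names and the default request counts are arbitrary, and dedicated tools such as the ones named above give far better control over rate and reporting.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def timed_get(url: str, timeout: float = 10.0):
    """Return (ok, seconds) for a single GET request."""
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            response.read()
            ok = 200 <= response.status < 300
    except OSError:
        # URLError and HTTPError both derive from OSError.
        ok = False
    return ok, time.perf_counter() - start


def load_test(url: str, requests: int = 100, concurrency: int = 10) -> dict:
    """Send `requests` GETs using `concurrency` worker threads and summarise."""
    if requests < 1:
        raise ValueError("requests must be at least 1")
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(timed_get, [url] * requests))
    latencies = sorted(seconds for _, seconds in results)
    return {
        "requests": requests,
        "failures": sum(1 for ok, _ in results if not ok),
        "median_s": latencies[len(latencies) // 2],
        "max_s": latencies[-1],
    }
```

Watching node CPU, memory and database metrics while raising `concurrency` step by step is what answers the "upgrade nodes or not" question; the script only supplies the traffic.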

Pierre Humberdroz avatar
Pierre Humberdroz

this sucks..

 rpc error: code = Unknown desc = Error response from daemon: Get <https://quay.io/v2/pusher/oauth2_proxy/manifests/v4.0.0>: received unexpected HTTP status: 500 Internal Server Error

How are you all managing external docker images ? Are you putting them in your own registry?

David Scott avatar
David Scott

They’re having an outage: https://status.quay.io/

Quay.io Status

Welcome to Quay.io’s home for real-time and historical data on system performance.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)
Registry as a pull through cache

Use-case If you have multiple instances of Docker running in your environment, such as multiple physical or virtual machines all running Docker, each daemon goes out to the internet and…

Pierre Humberdroz avatar
Pierre Humberdroz

yea I know that they have an outage I am just wondering how people mitigate it in general

Pierre Humberdroz avatar
Pierre Humberdroz

so for me: I am removing a node from our dev cluster every hour and adding a new one, just to see how it would feel to have short-lived nodes. Now quay was down (500s) and the new node couldn’t get its workload up and running

Pierre Humberdroz avatar
Pierre Humberdroz

so it is my fault, but I was just curious to hear how people handle docker images: are they all hosting them themselves, or would they be affected by this as well?

jose.amengual avatar
jose.amengual
Nexus Repository OSS - Software Component Management | Sonatype

The world’s only repository manager with FREE support for popular formats.

btai avatar

this outage has been going on for awhile now https://status.quay.io/incidents/kw2627bsdwd9

Quay.io outage

Quay.io’s Status Page - Quay.io outage.

joey avatar

yeah… ouch.

btai avatar

i’m curious what the root cause was. they had a ~19 hour outage

2020-05-21

Zachary Loeber avatar
Zachary Loeber

I converted localstack to run in kubernetes for locally testing out AWS scripts on kind clusters. Example includes the use of kompose, helmfile, the raw helm chart, and my own little framework for stitching it all together. https://zacharyloeber.com/2020/05/aws-testing-with-localstack-on-kubernetes/

Aws Testing With Localstack on Kubernetes

Aws Testing With Localstack on Kubernetes - Zachary Loeber’s Personal Site

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

wow, rad idea

Mike F. avatar
Mike F.

Super cool stuff!

2020-05-24

Zachary Loeber avatar
Zachary Loeber

well that’s useful. kind of like argocd or flux?

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

ya, it seems like it

Zachary Loeber avatar
Zachary Loeber

sometimes I swear that the matrix is glitching on me, I’m deep diving into argocd right now

Zachary Loeber avatar
Zachary Loeber

If I go outside for a walk and see 3 instances where someone is walking a pet cat or iguana or something I’ll know I’m in the matrix…

Zachary Loeber avatar
Zachary Loeber

you happen to find direct links to the CRDs for this thing?

Zachary Loeber avatar
Zachary Loeber

nm, just used my gcloud account to grab the files

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Haven’t looked past the news article yet

2020-05-26

Matt Gowie avatar
Matt Gowie

Hey k8s folks — Is there a defined best place to get spun up on k8s? I’ve put off ramping up on the topic for a while, but I’d like to dive in while I have some downtime from clients. I’m sure some folks here have good resources or strong opinions on where to head for info.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Hrm… so I think what you’ll find is there is a wide gap between how it’s done on AWS vs Azure vs GKE vs bare metal, etc

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Did you have one in mind?

Matt Gowie avatar
Matt Gowie

Huh interesting. Yeah, I’d say I’d probably focus on AWS.

Matt Gowie avatar
Matt Gowie

Your suggestion would be to tailor learning towards k8s on EKS?

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

I think there are two sides of it. On the one side, you need to learn k8s the platform (how to run stuff on k8s). that will be the most similar across cloud providers (but not identical).

Matt Gowie avatar
Matt Gowie

@joey — Good stuff, I’ll definitely check those out. Thanks for sharing.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Then there’s operating k8s as a platform. that’s where you need to pick one cloud provider and kick the tires.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

this is where you get the operational experience.

Matt Gowie avatar
Matt Gowie

Makes sense.

joey avatar

and if you’re going to choose a specific cloud provider and do it like that, you might as well be using terraform and/or terragrunt

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

then I think the “Best Practice” will be determined by whether you want to go the AWS-native approach with CloudFormation, or use eksctl. Or if you want to use #terraform.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

to use k8s effectively, you’ll inevitably need to provision a bunch of stuff that isn’t handled by EKS (e.g. IAM roles). so using terraform is advisable.

Matt Gowie avatar
Matt Gowie

Yeah, I’m already very bought in on Terraform so that would be my approach for sure.

Matt Gowie avatar
Matt Gowie

@Erik Osterman (Cloud Posse) Do you have a resource you’d suggest for the “you need to learn k8s the platform (how to run stuff on k8s)” side of things? I think that’s really what I’m looking for.

Zachary Loeber avatar
Zachary Loeber

If you are on a linux platform you can look at using libvirt+terraform to get running pretty quickly with your own local cluster. https://github.com/zloeber/k8s-lab-terraform-libvirt

zloeber/k8s-lab-terraform-libvirt

A Kubernetes lab environment using terraform and libvirt - zloeber/k8s-lab-terraform-libvirt

Zachary Loeber avatar
Zachary Loeber

for getting comfy with kube deployments and getting around I’d just start up a kind, k3d, minikube, or microk8s local cluster and start looking to deploy things to it that you might find yourself deploying for work.

Zachary Loeber avatar
Zachary Loeber

a sufficiently complex app that you could start with and feasibly could be in several types of environments might be airflow

Zachary Loeber avatar
Zachary Loeber

You don’t need cloud resources to dive into the deep end pretty quickly with k8s

Matt Gowie avatar
Matt Gowie

@Zachary Loeber Good stuff, I’ll check those out and keep that in mind. Thanks for sharing!

Zachary Loeber avatar
Zachary Loeber

let us know how it goes, glad you are diving deeper into kube, take a deep swig of the koolaide….whatever they laced the kube-koolaide with is addicting

roth.andy avatar
roth.andy
11:36:35 PM

If I get this demo working I’ll be using the new Kubernetes provider for Terraform during my keynote at the Crossplane Community Day virtual event. https://www.eventbrite.com/e/crossplane-community-day-tickets-104465284478 https://twitter.com/mitchellh/status/1265414263281029120

2020-05-27

roth.andy avatar
roth.andy

Can I get a dummy check on my plan for deploying kiam to my K8s cluster?

  1. Add an additional node pool of small instances to each cluster. 1-2 instances is really all that is needed
  2. Apply the Instance Profile to the new node pool
  3. Apply a Taint to the new nodes that tells the cluster not to schedule any pods to them
  4. Deploy the kiam server to the new nodes using a Toleration
  5. Deploy the kiam agent to all nodes
  6. Annotate namespaces that are allowed to use IAM with the iam.amazonaws.com/permitted: <regex that matches allowed roles> annotation
  7. Annotate pods inside the permitted namespaces with iam.amazonaws.com/role: <role name>
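
Steps 6 and 7 interact roughly as sketched below: a pod's requested role is only honoured if it matches the regex on its namespace. This is a simplified model for reasoning about the annotations, not kiam's actual code; in particular, the unanchored matching is an assumption, which is why anchoring the pattern is suggested.

```python
import re

PERMITTED = "iam.amazonaws.com/permitted"  # namespace annotation (regex)
ROLE = "iam.amazonaws.com/role"            # pod annotation (role name)


def role_allowed(namespace_annotations: dict, pod_annotations: dict) -> bool:
    """Simplified model: is the pod's requested role permitted in its namespace?"""
    pattern = namespace_annotations.get(PERMITTED)
    role = pod_annotations.get(ROLE)
    if not pattern or not role:
        # No regex on the namespace or no role on the pod: nothing is granted.
        return False
    # Assumption: unanchored matching, so anchor the pattern with ^...$ yourself.
    return re.search(pattern, role) is not None
```

With a namespace annotated `^app-.*$`, a pod asking for `app-reader` passes this check and one asking for `admin` does not.
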
Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Yup that looks good

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

As an extra safety precaution you can use iptables to firewall direct access to the real metadata api

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Also implicitly, you’ll need to provision the IAM roles that you will need

roth.andy avatar
roth.andy

those are provided. In the account this will be in I don’t have permission to manage IAM

Chris Fowles avatar
Chris Fowles

as of 3.5 on kiam you can use service account iam roles instead of instance profiles: https://github.com/uswitch/kiam/blob/master/CHANGELOG.md#v35

uswitch/kiam

Integrate AWS IAM with Kubernetes. Contribute to uswitch/kiam development by creating an account on GitHub.

roth.andy avatar
roth.andy

Not using EKS

Chris Fowles avatar
Chris Fowles

allegedly you can do it without eks: https://github.com/aws/amazon-eks-pod-identity-webhook/

i’ve not seen or deployed this tho

aws/amazon-eks-pod-identity-webhook

Amazon EKS Pod Identity Webhook. Contribute to aws/amazon-eks-pod-identity-webhook development by creating an account on GitHub.

roth.andy avatar
roth.andy

Interesting

Chris Fowles avatar
Chris Fowles

i’ve got kiam running on service accounts and it works pretty well - i only did it that way because the .net core credential provider in the aws sdk didn’t support web identities for a while so we couldn’t go full service account roles.

i’m probably going to rip it out soon, as the .net sdk supports it now.

Chris Fowles avatar
Chris Fowles

but it’s been solid enough

2020-05-28

Zachary Loeber avatar
Zachary Loeber

seems quay is down again

Zachary Loeber avatar
Zachary Loeber

Are there any projects out there that make it easy to report on all of the cached images in a cluster, pull them into some other registry, then apply a mutating admission webhook to rewrite the container registry source when the deployments get applied to point to the new image source?
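
The last piece of that idea, the mutating webhook, could answer each Pod admission request with a JSONPatch that swaps the registry host. The sketch below shows only that response-building step; the mirror host is whatever the caller passes in, init containers are ignored, and serving it over HTTPS plus registering the webhook are left out.

```python
import base64
import json


def rewrite_image(image: str, mirror: str) -> str:
    """Point an image reference at `mirror`, keeping repository path and tag."""
    first, slash, rest = image.partition("/")
    # Docker convention: the first component is a registry host only if it
    # contains "." or ":" or equals "localhost"; otherwise Docker Hub is implied.
    if slash and ("." in first or ":" in first or first == "localhost"):
        return f"{mirror}/{rest}"
    if not slash:
        # Bare names such as "nginx" live under "library/" on Docker Hub.
        return f"{mirror}/library/{image}"
    return f"{mirror}/{image}"


def mutate(review: dict, mirror: str) -> dict:
    """Answer an AdmissionReview for a Pod with a JSONPatch rewriting images."""
    request = review["request"]
    containers = request["object"]["spec"].get("containers", [])
    patch = [
        {
            "op": "replace",
            "path": f"/spec/containers/{index}/image",
            "value": rewrite_image(container["image"], mirror),
        }
        for index, container in enumerate(containers)
    ]
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": {
            "uid": request["uid"],
            "allowed": True,
            "patchType": "JSONPatch",
            # The API server expects the patch as base64-encoded JSON.
            "patch": base64.b64encode(json.dumps(patch).encode()).decode(),
        },
    }
```

Rewriting only works if the images were copied to the mirror under the same repository paths first, which is the reporting and pulling half of the question.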

roth.andy avatar
roth.andy

Something far easier that I’ve seen done is to just add a Validating Admission Controller to make sure containers come from the registry you want.
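
The decision logic of such a validating webhook is small. A sketch, assuming the policy is "every image must start with one approved prefix" (the prefix and the function name are placeholders):

```python
def validate(review: dict, allowed_prefix: str) -> dict:
    """Allow a Pod only if every container image starts with allowed_prefix."""
    request = review["request"]
    spec = request["object"]["spec"]
    containers = spec.get("initContainers", []) + spec.get("containers", [])
    # End allowed_prefix with "/" so "registry.example" cannot match
    # a look-alike host such as "registry.example.evil.test".
    rejected = [c["image"] for c in containers
                if not c["image"].startswith(allowed_prefix)]
    response = {"uid": request["uid"], "allowed": not rejected}
    if rejected:
        response["status"] = {
            "message": f"images outside {allowed_prefix}: {', '.join(rejected)}"
        }
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }
```

The returned dictionary is the AdmissionReview body the API server expects back; a denied request surfaces the `status.message` to whoever ran `kubectl apply`.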

roth.andy avatar
roth.andy

But your description is obviously much more elegant than that

Zachary Loeber avatar
Zachary Loeber

That is part of the preventative portion of a holistic solution for certain

Zachary Loeber avatar
Zachary Loeber

I suppose the tool I’m thinking of would be for migration of outside dependencies to local registries only

joey avatar

this would be pretty cool, would love to hear if you find something before i set off on a path to try to do something related

roth.andy avatar
roth.andy

Just make sure the juice is worth the squeeze. Unless you have compliance/regulatory requirements, something like that will add a huge operational headache

Zachary Loeber avatar
Zachary Loeber

I guess that is not ‘easy’ at all, but there should be such a project, if only to help avoid things like this

Zachary Loeber avatar
Zachary Loeber

I’d specifically target things like cert-manager, kafka, or any other vendor(ish) images that would typically go through a review and testing process before simply upgrading them (so core services in a deployment, not developer workloads that might get updated multiple times a day)

Zachary Loeber avatar
Zachary Loeber

I have peeked over kube-fledged and it seems to be on the right path towards something like this https://github.com/senthilrch/kube-fledged

senthilrch/kube-fledged

A kubernetes add-on for creating and managing a cache of container images directly on the cluster worker nodes, so application pods start almost instantly - senthilrch/kube-fledged

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

@btai @Pierre Humberdroz

Pierre Humberdroz avatar
Pierre Humberdroz

Oh awesome !

Zachary Loeber avatar
Zachary Loeber

It may be useful for figuring out how to eliminate core service outside dependencies

Zachary Loeber avatar
Zachary Loeber

outside deps == evil

btai avatar

this looks awesome

btai avatar

reminds me a little bit of that uber project too

Zachary Loeber avatar
Zachary Loeber

Three devops holy war creeds: 1. the latest tag is evil; 2. outside dependencies are our enemy; 3. incremental improvement of all things….

Zachary Loeber avatar
Zachary Loeber

kube-fledged use case #4: If a cluster administrator or operator needs to roll-out upgrades to an application and wants to verify before-hand if the new images can be pulled successfully.

Zachary Loeber avatar
Zachary Loeber

cool beans

Zachary Loeber avatar
Zachary Loeber
senthilrch/kube-fledged

A kubernetes add-on for creating and managing a cache of container images directly on the cluster worker nodes, so application pods start almost instantly - senthilrch/kube-fledged

btai avatar

the name for “that uber project” is kraken https://github.com/uber/kraken

uber/kraken

P2P Docker registry capable of distributing TBs of data in seconds - uber/kraken

Zachary Loeber avatar
Zachary Loeber

I ran across that one as well, the name certainly is apropos…

Zachary Loeber avatar
Zachary Loeber

(too bad there isn’t a simple helm chart deployment for the thing, it is either straight yaml or the flux helm-operator it seems….)

Zachary Loeber avatar
Zachary Loeber

In fact, that is the only project I’ve seen that allows for any direct image cache manipulation within a cluster

btai avatar

wonder what they’re doing at quay to be causing this many major outages

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

quay acquired by coreos

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

coreos acquired by redhat

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

coreos (the os) now EOL

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

quay the registry? not sure, but seeing how acquisitions go - the project leads are probably all gone. the project itself open sourced. who knows? maybe it’s on life support?

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

(this is all rampant, unqualified speculation. I have no inside knowledge of what’s going on.)

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

#codefresh decided to deprecate their registry

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

my guess is these registries cost a fortune to operate

Zachary Loeber avatar
Zachary Loeber

I imagine they would eat bandwidth and cloud storage space like no tomorrow

Zachary Loeber avatar
Zachary Loeber

I’ve seen builds based on upstream public images as the base image before and cried on the inside (then promptly eliminated those outside deps…)

btai avatar

i feel like many of the major kubernetes tools have been hosting their images on quay.
