#kubernetes (2020-09)

kubernetes

Archive: https://archive.sweetops.com/kubernetes/

2020-09-03

Karim Benjelloun avatar
Karim Benjelloun

Hello! Does anyone store secrets in ansible-vault?

Matt Gowie avatar
Matt Gowie

I have a project that I’m on that does this, but we’re moving away from it towards using sops.
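
For reference, a minimal sops setup sketch, assuming secrets are kept as YAML files and encrypted with a KMS key (the key ARN and path below are made up):

# .sops.yaml — tells sops which key to use for matching files
creation_rules:
  - path_regex: secrets/.*\.yaml$
    kms: arn:aws:kms:us-east-1:111111111111:key/00000000-0000-0000-0000-000000000000  # hypothetical key
    encrypted_regex: ^(data|stringData)$   # only encrypt values under these keys

Something like sops -e -i secrets/app.yaml then encrypts in place, and sops -d decrypts for use.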

kskewes avatar
kskewes

We’ve been doing this on bare metal and it has been fine, but we’re moving away to EC2.

Amit Karpe avatar
Amit Karpe

I use Secrets Manager (and EKS) with the help of the GoDaddy Kubernetes External Secrets project

2020-09-04

Matt Gowie avatar
Matt Gowie

For folks on EKS using alb-ingress-controller — do you use nginx / nginx-ingress-controller as well? A team member (who has more k8s experience than myself) is suggesting we go ALB => Nginx => Applications, and I’m just wondering if that is overly complex and not needed. As in, is it more standard to have a single ALB Ingress resource that fronts any external application services?

roth.andy avatar
roth.andy

Personally I’ve stuck with Nginx-ingress controller or Istio’s ingress.

Vlad Ionescu (he/him) avatar
Vlad Ionescu (he/him)

Why would you use both ALB Ingress Controller and nginx? Is there some specific use-case that you need?

On AWS the big advantage of the ALB Ingress Controller is the fact that traffic goes straight into the pod. Internet -> ALB -> Pod. Low latency, clear path. The ALB Ingress Controller just looks at what pods are in the cluster and updates the ALB.

On the other side, nginx-ingress works in a different way. Traffic goes Internet -> ELB (usually, if you use type: LoadBalancer) -> random node -> move through nodes and kube-proxy -> pod. That’s a… longer way. For higher-scale clusters it takes a while for all routes and IPs to be propagated, so you may hit some super-hard-to-debug issues (timeouts, not-founds).
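
For reference, a hedged sketch of the ALB Ingress Controller shape described above — target-type: ip registers pod IPs directly in the target group (the host, names, and the v1beta1 API reflect the 2020-era controller; adjust for your versions):

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: web                                        # hypothetical
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip      # Internet -> ALB -> Pod, no kube-proxy hop
spec:
  rules:
    - host: web.example.com
      http:
        paths:
          - path: /*
            backend:
              serviceName: web
              servicePort: 80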

Vlad Ionescu (he/him) avatar
Vlad Ionescu (he/him)

ALB Ingress Controller is more expensive since it creates 1 ALB per Ingress (there’s a public beta with a shared ALB, but it’s a beta), so it’s not really an option if you have 100s of Ingresses

Matt Gowie avatar
Matt Gowie

Is the new shared ALB work the way to enable host-based routing?

Matt Gowie avatar
Matt Gowie

I agree that nginx-controller is more complicated. I’m not sold on it and I’m trying to figure out how I should direct this.

Vlad Ionescu (he/him) avatar
Vlad Ionescu (he/him)

No idea about the host-based routing, sorry. I know the ALB Ingress Controller creates 1 ALB per Ingress, but you can have multiple hosts for that Ingress.
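
In other words, one Ingress (and so one ALB) can still route by host; a sketch of the rules section with made-up hosts and services:

spec:
  rules:
    - host: app.example.com
      http:
        paths:
          - backend:
              serviceName: app
              servicePort: 80
    - host: api.example.com
      http:
        paths:
          - backend:
              serviceName: api
              servicePort: 80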

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

so I don’t see any reason for alb -> nginx-ingress -> pods, however, there’s no reason you can’t run multiple ingress controllers

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

alb is nice if you want to integrate with WAF or other AWS services under one load balancer

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

that said, one load balancer per ingress definition sucks - so we mostly gravitate towards nginx-ingress for that reason

Matt Gowie avatar
Matt Gowie

Ah great point on WAF.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

your coworker might be suggesting alb -> nginx-ingress so you can still have everything under the same load balancer entry point, but also have multiple ingresses without creating multiple load balancers.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

istio also has its own ingress controller

Matt Gowie avatar
Matt Gowie

@Erik Osterman (Cloud Posse) Are you not able to configure multiple hosts / subdomains under one ALB ingress?

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

so anyways, i’d just concede that in the long run, you might have multiple controllers. ingress is just a way to get traffic into the cluster. there are multiple ways to do that.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

No, it’s just that for every Ingress CRD, a load balancer is created with the alb-ingress-controller

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

(IMO not in the spirit of what kubernetes should do)

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

nginx-ingress creates one load balancer, period. Then for each Ingress CRD it creates the routing rules in the proxy to handle it. That means if you deploy 25 different charts, each with their own Ingress, then with nginx-ingress you get 1 load balancer, and with alb-ingress you get 25

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

but as @Vlad Ionescu (he/him) said, apparently they have a fix for that in beta of the alb-ingress-controller

Matt Gowie avatar
Matt Gowie

Yeah… we have 3 main services that we want to expose externally. Most of the services that we have are internal. So even though any number more than 1 is too many, I’m not worried about creating an absurd amount of ALBs. On non-k8s projects, I typically gravitate towards a single ALB that then does host-based routing to services. And since I definitely need to roll a WAF into this platform in the future and the integration between WAF <=> ALBs is easy, I don’t know if I like the nginx route.

I’m leaning toward pushing for just using alb-ingress for now and then we’ll switch to this upcoming single ALB for multiple ingresses work that is in beta at some point in the future.

2020-09-05

2020-09-07

Soumya avatar

Hi, does anyone have knowledge of IAM role association to pods through a service account in EKS? I’m able to assume the role (assume-role-with-web-identity) in the same account, but now I need to assume a role present in a different account. I already tried attaching an assume-role policy to my role for that (the role present in the 2nd account) and even editing its trust relationships. Thanks in advance

Raymond Liu avatar
Raymond Liu

Have you tried aws sts assume-role --role-arn <2nd-account-role-arn> in the pod? What error is raised?

Raymond Liu avatar
Raymond Liu

Also, please check aws sts get-caller-identity first to make sure the pod can assume the role in the same account.

Douglas Clow avatar
Douglas Clow
Cross account IAM roles for Kubernetes service accounts | Amazon Web Services

With the introduction of IAM roles for service accounts (IRSA), you can create an IAM role specific to your workload’s requirement in Kubernetes. This also enables the security principle of least privilege by creating fine grained roles at a pod level instead of node level. In this blog post, we explore a use case where […]
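
For reference, a sketch of the usual IRSA wiring for this: the pod’s service account is annotated with a role in the cluster’s account, and the application then calls sts:AssumeRole against the role in the second account, whose trust policy must trust the first role (both ARNs and names below are hypothetical):

apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-app                          # hypothetical
  namespace: default
  annotations:
    # role in the same account as the cluster, assumed via assume-role-with-web-identity
    eks.amazonaws.com/role-arn: arn:aws:iam::111111111111:role/my-app-irsa
# The app (or its AWS SDK profile) then assumes arn:aws:iam::222222222222:role/cross-account-role;
# that role's trust policy must list the role above as a trusted principal.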

2020-09-08

Tiago Meireles avatar
Tiago Meireles

Does anyone run ingress-nginx at scale (i.e. 200k+ rps)? Any recommendations for reaching that scale?

joey avatar

that’s some serious rps. what are you doing over there?

2020-09-09

Vlad Ionescu (he/him) avatar
Vlad Ionescu (he/him)

Per-pod SecurityGroups are now a native thing in EKS: https://docs.aws.amazon.com/eks/latest/userguide/security-groups-for-pods.html

Security groups for pods - Amazon EKS

Security groups for pods integrate Amazon EC2 security groups with Kubernetes pods. You can use Amazon EC2 security groups to define rules that allow inbound and outbound network traffic to and from pods that you deploy to nodes running on many Amazon EC2 instance types.

Matt Gowie avatar
Matt Gowie

FYI @olga.bilan — we might want to roll this in at some point.

Vlad Ionescu (he/him) avatar
Vlad Ionescu (he/him)

No more need for Calico

Vlad Ionescu (he/him) avatar
Vlad Ionescu (he/him)

^^^ For now, it requires EKS 1.17 eks.3 and CNI v1.7.1 or later, and setting ENABLE_POD_ENI=true
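
The flag is an environment variable on the VPC CNI (the aws-node DaemonSet in kube-system); a sketch of the relevant bit of that container spec:

# aws-node DaemonSet, container "aws-node"
env:
  - name: ENABLE_POD_ENI
    value: "true"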

2020-09-10

Emmanuel Gelati avatar
Emmanuel Gelati

No support for t3

2020-09-14

Craig Dunford avatar
Craig Dunford

quick, hopefully simple question - we deploy many applications with 2 replicas for high availability (should one pod hang/crash, the other is still there to offer the service). I had made the (obviously incorrect) assumption that these pods would always be scheduled on different nodes, but I have observed that is not the case (usually they are, but not always). This leaves us in a bit of a compromised position regarding high availability: should we lose a node that has 2 pods from the same deployment on it, that service is lost for a time.

my question - is it possible to somehow annotate a deployment to indicate pods should be scheduled on different nodes?

Raymond Liu avatar
Raymond Liu

Hey, you could configure inter-pod affinity/anti-affinity constraints.

roth.andy avatar
roth.andy
Assigning Pods to Nodes

You can constrain a Pod to only be able to run on particular node(s), or to prefer to run on particular nodes. There are several ways to do this, and the recommended approaches all use label selectors to make the selection. Generally such constraints are unnecessary, as the scheduler will automatically do a reasonable placement.

Craig Dunford avatar
Craig Dunford

perfect, thanks

Issif avatar

something like that:

Issif avatar
spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100                      # required for "preferred" terms
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: api-private-web
          topologyKey: kubernetes.io/hostname
Raymond Liu avatar
Raymond Liu

And also, you could apply the descheduler plugin to force-reschedule your pods when the pod anti-affinity constraint is violated.

kubernetes-sigs/descheduler

Descheduler for Kubernetes. Contribute to kubernetes-sigs/descheduler development by creating an account on GitHub.

Raymond Liu avatar
Raymond Liu

Don’t forget to set proper PodDisruptionBudget https://kubernetes.io/docs/tasks/run-application/configure-pdb/

Specifying a Disruption Budget for your Application

FEATURE STATE: Kubernetes v1.5 [beta] This page shows how to limit the number of concurrent disruptions that your application experiences, allowing for higher availability while permitting the cluster administrator to manage the cluster’s nodes. Before you begin You are the owner of an application running on a Kubernetes cluster that requires high availability. You should know how to deploy Replicated Stateless Applications and/or Replicated Stateful Applications. You should have read about Pod Disruptions.
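
A minimal PodDisruptionBudget sketch for the 2-replica case above, assuming the same app label as the anti-affinity example:

apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: api-private-web
spec:
  minAvailable: 1                 # keep at least one pod up during voluntary disruptions
  selector:
    matchLabels:
      app: api-private-web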

Issif avatar

Hi, I’m using this great module https://github.com/cloudposse/terraform-aws-eks-workers and would like to change the root volume size. Is that possible? I can’t figure out how so far

cloudposse/terraform-aws-eks-workers

Terraform module to provision an AWS AutoScaling Group, IAM Role, and Security Group for EKS Workers - cloudposse/terraform-aws-eks-workers

kskewes avatar
kskewes

Yes. See the ASG module that EKS uses for block mappings. I’ll be able to post a code snippet soon when back in front of laptop.

kskewes avatar
kskewes
  # some applications need scratch disk...
  block_device_mappings = [{
    device_name  = "/dev/xvda"
    no_device    = false
    virtual_name = "ephemeral0"
    ebs = {
      delete_on_termination = true
      encrypted             = true
      volume_size           = 100
      volume_type           = "gp2"
      iops                  = null
      kms_key_id            = null
      snapshot_id           = null
    }
  }]
Issif avatar

I saw that block, but the comment says it’s for EBS aside from the root volume

Issif avatar

I’ll give it a try, thanks

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

yea, confirmed in our jenkins setup

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)
locals {
  root_device = {
    device_name  = "/dev/xvda"
    no_device    = "false"
    virtual_name = "root"
    ebs = {
      encrypted             = true
      volume_size           = var.jenkins_disk_size
      delete_on_termination = true
      iops                  = null
      kms_key_id            = null
      snapshot_id           = null
      volume_type           = "gp2"
    }
  }
}

...

  block_device_mappings = [local.root_device]
Issif avatar

awesome, thanks dudes

Issif avatar

It worked, thanks again


2020-09-15

Juan Soto avatar
Juan Soto


Hey, who has information about this? Google Kubernetes Engine vulnerability (CVE-2020-14386)

Mithra avatar

Hi, I’m having an issue with kubectl commands where I’m unable to get a response while running these commands:

kubectl get nodes → error from the server (Timeout): the server was unable to return a response in the time allotted but may still be processing the request (get nodes)

kubectl top nodes → Error from the server (service unavailable): the server is currently unable to handle the request

I have two master nodes and 3 worker nodes with one API server. Can anyone help resolve this issue, please? And this is not in the cloud.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Have you ruled out a simple networking/firewall issue?

Mithra avatar

Yes I have

jason einon avatar
jason einon

Hey @Mithra, were you able to solve the issue? Have you confirmed the kubectl config is set up for the cluster you are connecting to? Is the cluster public or private? If the latter, has the network you are connecting from been granted access to manage the nodes?

• If you have the solution, it would also be good to share what the root cause was

Mithra avatar

Hi @jason einon, yes it was solved with our config file. The servers are managed by Chef Manage; we ran a check-in with the Chef client (configuration), and the master server and worker nodes are up and running and kubectl commands are working.

2020-09-17

jose.amengual avatar
jose.amengual

Could anyone here with experience with Rancher and Kiosk tell me whether they are compatible with each other? Are they competing products in any way?

jose.amengual avatar
jose.amengual

and any feedback from Loft users?

jose.amengual avatar
jose.amengual

I’m looking for a multi-tenant cluster management tool that can help us migrate to K8s from ECS and make things easier for everyone

roth.andy avatar
roth.andy

I’m a big fan of Rancher

jose.amengual avatar
jose.amengual

Does it do similar stuff to Kiosk, or is it better to say that Rancher is a cluster management tool and Kiosk is a multi-tenant facilitator?

roth.andy avatar
roth.andy

I’ve never used kiosk, but Rancher can manage clusters and do user management as well. If you are looking to multi-tenant a single cluster, kiosk might be a better fit since it looks like that is its primary purpose. If you are looking to centrally manage one cluster per tenant (which IMO is the better strategy) then Rancher is great for that.

roth.andy avatar
roth.andy

Rancher can do it by mapping users to Projects (which is Rancher’s name for a set of namespaces grouped together), and I have done that in the past

roth.andy avatar
roth.andy

but I’m still skittish about doing true multi-tenant on a single cluster

jose.amengual avatar
jose.amengual

I was looking at Loft, which has this concept of a virtual cluster

jose.amengual avatar
jose.amengual

I need to read more about it

jose.amengual avatar
jose.amengual

the multi-tenant part is mostly for POCs from different teams

jose.amengual avatar
jose.amengual

but multicluster will be the idea

jose.amengual avatar
jose.amengual

I would like to have some sort of pipeline where basically Rancher can call Atlantis and run a pipeline to create a cluster using Terraform

jose.amengual avatar
jose.amengual

hopefully there will be little to no AWS CLI/console work for developers to do

2020-09-18

2020-09-22

shtrull avatar
shtrull

Quick question: if I patch a manifest, and after a while I apply the original manifest again, will it remove my patches?

Issif avatar

it will

shtrull avatar
shtrull

cool thanks

Dhrumil Patel avatar
Dhrumil Patel

Hello, does anyone know of an ingress controller that can be used with an AWS target group? We have an ALB which sends traffic to EC2 instances, and for only one service we want to send traffic to a Kubernetes pod from that ALB. For that we need one target group which can route traffic from the ALB to the Kubernetes pod… this target group should be updated once the pod recycles…

joey avatar

aws-alb-ingress-controller ?

Dhrumil Patel avatar
Dhrumil Patel

I only need the target group; the ALB will remain the old one. Just one /xyz route goes to Kubernetes, all other routes go to the EC2 servers… I think alb-ingress-controller also spawns an ALB…

joey avatar

ahh you’re right

btai avatar

traefik

Emmanuel Gelati avatar
Emmanuel Gelati

you could set up the ingress controller service using NodePort

Emmanuel Gelati avatar
Emmanuel Gelati

and after you create a target group, you can create the routing rule in the ALB for /xyz to your new target group

Emmanuel Gelati avatar
Emmanuel Gelati

When you use NodePort you can specify the port
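
That is, expose the service on a fixed NodePort and point the existing ALB’s /xyz target group (instance type) at that port on the nodes; a sketch with made-up names and ports:

apiVersion: v1
kind: Service
metadata:
  name: xyz                        # hypothetical
spec:
  type: NodePort
  selector:
    app: xyz
  ports:
    - port: 80
      targetPort: 8080
      nodePort: 30080              # fixed node port the ALB target group registers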

Dhrumil Patel avatar
Dhrumil Patel

thanks to all of you

2020-09-23

Aumkar Prajapati avatar
Aumkar Prajapati

Hey guys, does anyone know why a Kubernetes cluster wouldn’t be creating any pods whatsoever?

Aumkar Prajapati avatar
Aumkar Prajapati

Even for new deployments and deleted pods with replicasets, it’s simply stopped creating new pods.

Aumkar Prajapati avatar
Aumkar Prajapati

Yeah I posted the cause of the problem, it was a bad master @Emmanuel Gelati

Aumkar Prajapati avatar
Aumkar Prajapati

Resolved, was a bad master

2020-09-24

dalekurt avatar
dalekurt

Hello 👋 I’m looking for a solution that allows users to use a self-service catalog to deploy a web app (using Helm charts).

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

GitHub PR =P

dalekurt avatar
dalekurt

I’m trying to implement it for engineers and end users who just want to click a button.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

one other option is Lens

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

one-click helm deployment

dalekurt avatar
dalekurt

Yes, I’ve used Lens. I will need to set up a repo for Helm charts hosted on-prem and provide the option to deploy the app for QA testing or a review app

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)
Kubermatic Announces Open Source Service Hub KubeCarrier

Kubermatic, a German-based startup, introduced an open-source tool called KubeCarrier that automates lifecycle management of services, applications, and hardware using Kubernetes Operators. The goal of the tool is to provide scalability and repeatability to meet the organization’s requirements.

dalekurt avatar
dalekurt

One such solution I’ve come across has been KubeApps - https://kubeapps.com/

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

looks nice! hadn’t seen it before

rei avatar

Nice!

2020-09-25

Jan avatar

Anyone noticed that the default security group for the EKS module doesn’t seem to be applied to anything? https://github.ada.com/infrastructure/infra-common-aws/blob/master/terraform-aws-eks-cluster/main.tf#L66-L67

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)
cloudposse/terraform-aws-eks-cluster

Terraform module for provisioning an EKS cluster. Contribute to cloudposse/terraform-aws-eks-cluster development by creating an account on GitHub.

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

yes, for managed node groups and fargate profiles, it’s not in use

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

because

Managed Node Groups do not expose nor accept any Security Groups.
Instead, EKS creates a Security Group and applies it to ENI that is attached to EKS Control Plane master nodes and to any managed workloads.
Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

but it’s in use when we use unmanaged workers, in which case EKS does not do anything to connect them, and we connect EKS control plane to the worker nodes via that default SG

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

since the EKS cluster module is used for both cases, the SG is there

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

with managed Node Groups, the below SG is used to allow access to other resources (e.g. Aurora, EFS, etc.) instead of the default SG (which is just created but not used for managed nodes):

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)
output "eks_cluster_managed_security_group_id" {
  description = "Security Group ID that was created by EKS for the cluster. EKS creates a Security Group and applies it to ENI that is attached to EKS Control Plane master nodes and to any managed workloads"
  value       = join("", aws_eks_cluster.default.*.vpc_config.0.cluster_security_group_id)
}
Jan avatar

ah right, makes a lot of sense thanks dude

Jan avatar

maybe I have misunderstood it

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Cannot view

2020-09-26

Jan avatar

oh lol, I linked to a private repo

2020-09-28

2020-09-29

Craig Dunford avatar
Craig Dunford

hello all, I have a couple unrelated (hopefully easy) questions:

  1. I am looking at implementing a StatefulSet for an application, mostly to satisfy an ordered startup requirement of the application. I notice that StatefulSets require a headless service, which does not do any load balancing for the pods (it looks like it’s just for DNS name assignment); can I also place a ClusterIP service in front of the StatefulSet pods and achieve the same thing one would with a Deployment?
  2. I deploy an application that has an excessively long startup time, such that the liveness probe for the app has an initial timeout of 3 minutes + the actual failure for the liveness probe is another 3 minutes (meaning a startup failure takes 6 minutes to be detected). If we end up in a startup-failure-related crash loop, we never seem to trip the CrashLoopBackOff behavior and end up restarting perpetually until someone intervenes:

• to date, startup probes haven’t been available to us on our managed cluster, but I believe they will be soon - would implementing a startup probe change this behavior? (see the sketch after this list)

• is there anything else I might be able to do to affect this so we do indeed trip the backoff behavior if this happens (I believe I’ve read that backoff timers are not configurable today)?

• this application also has a prestop hook that takes several minutes to complete - does this have any impact on the backoff behavior (I potentially have some flexibility here to change things)?
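
On the second question, a hedged startupProbe sketch — while the startup probe has not yet succeeded the other probes are held off, so the slow boot no longer has to be absorbed by the liveness probe’s initial delay (paths, ports, and thresholds are made up):

containers:
  - name: slow-app                     # hypothetical
    image: example/slow-app:latest
    startupProbe:
      httpGet:
        path: /healthz
        port: 8080
      periodSeconds: 10
      failureThreshold: 36             # allow up to ~6 minutes to start
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
      periodSeconds: 10
      failureThreshold: 3              # once started, fail fast on hangs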

2020-09-30

Matt Gowie avatar
Matt Gowie

Possible #office-hours question — How are folks managing service env var configuration in K8s with Helm?

Matt Gowie avatar
Matt Gowie

The client project I am on at the moment had a pattern in place when I joined:

  1. Raw env variables in values.yaml
  2. A values.yaml map of env var names to a single cluster-wide ConfigMap
  3. A values.yaml map of env var names to a single cluster-wide Secret

The ConfigMap + Secret mentioned are created by Terraform when the cluster is initially spun up and populated with various config from tf remote state and similar. The above ends up looking like the following in each Chart’s values.yaml:
secretMapping:
  RABBIT_PASSWORD: rabbit_pass # rabbit_pass key in shared Secret
  # ... 

configMapping:
  SOME_ENV_VAR_NAME: some_configmap_name # same as above but in shared ConfigMap
  # ... 

env:
  RAW_ENV_VAR: "Value"
  # ...

Then when supplying environment to any container in the Charts, we use a shared helper to mash the 3 together with valueFrom.configMapKeyRef, valueFrom.secretKeyRef, and plain name/value pairs from env. This works of course, but it’s a lot of mapping this to that, and there is no single source of truth for values (split between the Terraform-driven Secret / ConfigMap and the values.yaml files in each Chart, of which there are 20 right now).

I’m considering throwing most of this away and creating a ConfigMap + Secret per Chart/Service via Terraform. Then a shared helper could just iterate over the service in question’s ConfigMap and Secret without any raw values in the Chart. Thus creating a single source of truth and hopefully saving microservice configuration headaches.

Wondering if that sounds like a decent pattern or if there are other, more mainstream approaches to this.
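
For reference, what the proposed per-service pattern could render down to — envFrom pulls every key from a service-scoped ConfigMap and Secret, so the chart no longer needs per-variable mappings (names are illustrative):

containers:
  - name: my-service                    # hypothetical
    image: example/my-service:latest
    envFrom:
      - configMapRef:
          name: my-service-config       # Terraform-managed, one per service
      - secretRef:
          name: my-service-secrets      # Terraform-managed, one per service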

Matt Gowie avatar
Matt Gowie

@Erik Osterman (Cloud Posse) I won’t be able to make it today so if you want to hold this one for next week I won’t be upset

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

No worries! yes, we ran out of time - let’s discuss next week

Matt Gowie avatar
Matt Gowie

@Erik Osterman (Cloud Posse) Did this one come up in this past week’s office hours? Will listen to the recording if so, but wanted to check.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

not yet - we had a surprise visit by @Andriy Knysh (Cloud Posse) and talked about serverless.tf (http://serverless.tf) instead

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

it was not me, it was @antonbabenko

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

oh haha

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

yes!

loren avatar
loren

live streaming all day, https://www.twitch.tv/cdkday
