#kubernetes (2020-09)
Archive: https://archive.sweetops.com/kubernetes/
2020-09-03
Hello! Does anyone store secrets in ansible-vault?
I have a project that I’m on that does this, but we’re moving away from it towards using sops.
We’ve been doing this with bare metal and it has been fine, but we’re moving to EC2.
I use Secrets Manager (and EKS) with the help of the GoDaddy Kubernetes External Secrets project
2020-09-04
For folks on EKS using alb-ingress-controller — Do you use nginx / nginx-ingress-controller as well? A team member (who has more k8s experience than myself) is suggesting we go ALB => Nginx => Applications. And I’m just wondering if that is overly complex and not needed. As in, is it more standard to have a single ALB Ingress resource that fronts any external application services?
Why would you use both ALB Ingress Controller and nginx? Is there some specific use-case that you need?
On AWS the big advantage of the ALB Ingress Controller is the fact that traffic goes straight into the pod. Internet -> ALB -> Pod. Low latency, clear path. The ALB Ingress Controller just looks at what pods are in the cluster and updates the ALB.
On the other side, nginx-ingress works in a different way. Traffic goes Internet -> ELB (usually, if you use type: LoadBalancer) -> random node -> through nodes and kube-proxy -> pod. That’s a… longer way. For higher-scale clusters it takes a while for all routes and IPs to be propagated, so you may hit some super-hard-to-debug issues (timeouts, not-founds).
ALB Ingress Controller is more expensive since it creates 1 ALB per Ingress (there’s a public beta with a shared ALB, but it’s a beta), so it’s not really an option if you have 100s of Ingresses
Is the new shared ALB work the way to enable host-based routing?
I agree that nginx-controller is more complicated. I’m not sold on it and I’m trying to figure out how I should direct this.
No idea about the host-based routing, sorry. I know the ALB Ingress Controller creates 1 ALB per Ingress, but you can have multiple hosts for that Ingress.
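For reference, a hedged sketch of multiple hosts on a single ALB-backed Ingress (host names, service names, and everything beyond the ingress-class annotation are placeholders):
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: external-services                       # hypothetical name
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
spec:
  rules:
    - host: app1.example.com                    # placeholder host
      http:
        paths:
          - path: /*
            backend:
              serviceName: app1                 # placeholder service
              servicePort: 80
    - host: app2.example.com                    # placeholder host
      http:
        paths:
          - path: /*
            backend:
              serviceName: app2                 # placeholder service
              servicePort: 80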
so I don’t see any reason for alb -> nginx-ingress -> pods; however, there’s no reason you can’t run multiple ingress controllers
alb is nice if you want to integrate with WAF or other AWS services under one load balancer
that said, one load balancer per ingress definition sucks - so we mostly gravitate towards nginx-ingress for that reason
Ah great point on WAF.
your coworker might be suggesting alb -> nginx-ingress so you can still have everything under the same load balancer entry point, but also have multiple ingresses without creating multiple load balancers.
istio also has its own ingress controller
@Erik Osterman (Cloud Posse) Are you not able to configure multiple hosts / subdomains under one ALB ingress?
so anyways, i’d just concede that in the long run, you might have multiple controllers. ingress is just a way to get traffic into the cluster. there are multiple ways to do that.
No, it’s just that for every Ingress CRD, a load balancer is created with the alb-ingress-controller (IMO not in the spirit of what kubernetes should do). nginx-ingress creates one load balancer, period. Then for each Ingress CRD it creates the routing rules in the proxy to handle it. That means if you deploy 25 different charts, each with their own Ingress, then with nginx-ingress you get 1 load balancer, and with alb-ingress you get 25.
but as @Vlad Ionescu (he/him) said, apparently they have a fix for that in beta of the alb-ingress-controller
Yeah… we have 3 main services that we want to expose externally. Most of the services that we have are internal. So even though any number more than 1 is too many, I’m not worried about creating an absurd amount of ALBs. On non-k8s projects, I typically gravitate towards a single ALB that then does host-based routing to services. And since I definitely need to roll a WAF into this platform in the future and the integration between WAF <=> ALBs is easy, I don’t know if I like the nginx route.
I’m leaning toward pushing for just using alb-ingress for now and then we’ll switch to this upcoming single ALB for multiple ingresses work that is in beta at some point in the future.
2020-09-05
2020-09-07
Hi, does anyone have knowledge around IAM role association to pods through service accounts in EKS? I’m able to assume the role (assume-role-with-web-identity) in the same account, but now I need to assume a role present in a different account. I already tried attaching an assume-role policy to my role (for the role present in the 2nd account) and even editing its trust relationships. Thanks in advance.
Have you tried aws sts assume-role --role-arn <2nd-account-role-arn> in the pod? What error is raised?
Also, please check aws sts get-caller-identity first, to make sure the pod can assume the role of the same account.
With the introduction of IAM roles for service accounts (IRSA), you can create an IAM role specific to your workload’s requirement in Kubernetes. This also enables the security principle of least privilege by creating fine-grained roles at a pod level instead of node level. In this blog post, we explore a use case where […]
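For reference, a minimal sketch of the IRSA wiring (names, namespaces, and account IDs are placeholders); the cross-account part is handled in IAM, not in Kubernetes:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-app                      # hypothetical name
  namespace: default
  annotations:
    # IRSA role in the pod's own account, trusted by the cluster's OIDC provider
    eks.amazonaws.com/role-arn: arn:aws:iam::111111111111:role/my-app-irsa
# For the cross-account case, the role in account 222222222222 needs a trust policy
# allowing sts:AssumeRole from the IRSA role above, and the IRSA role needs an IAM
# policy permitting sts:AssumeRole on that cross-account role ARN (role chaining).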
2020-09-08
Does anyone run ingress-nginx at scale (i.e. 200k+ rps)? Any recommendations for reaching that scale?
that’s some serious rps. what are you doing over there?
2020-09-09
Per-pod SecurityGroups are now a native thing in EKS: https://docs.aws.amazon.com/eks/latest/userguide/security-groups-for-pods.html
Security groups for pods integrate Amazon EC2 security groups with Kubernetes pods. You can use Amazon EC2 security groups to define rules that allow inbound and outbound network traffic to and from pods that you deploy to nodes running on many Amazon EC2 instance types.
FYI @olga.bilan — we might want to roll this in at some point.
No more need for Calico
^^^ For now, it requires EKS 1.17 eks.3 and CNI v1.7.1 or later, and setting ENABLE_POD_ENI=true
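For reference, a minimal sketch of the SecurityGroupPolicy object the VPC CNI uses for this (name, labels, and the security group ID are placeholders):
apiVersion: vpcresources.k8s.aws/v1beta1
kind: SecurityGroupPolicy
metadata:
  name: my-app-sg-policy            # hypothetical name
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: my-app                   # pods with this label get the security group
  securityGroups:
    groupIds:
      - sg-0123456789abcdef0        # placeholder security group ID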
2020-09-10
No support for t3
2020-09-14
quick, hopefully simple question - many applications we deploy with 2 replicas for high availability (should one pod hang/crash, the other is still there to offer the service). I had made the (obviously incorrect) assumption that these pods would always be scheduled on different nodes, but I have observed that is not the case (usually it is, but not always); this leaves us in a bit of a compromised position regarding high availability; should we lose a node that has 2 pods from the same deployment on it, that service is lost for a time.
my question - is it possible to somehow annotate a deployment to indicate pods should be scheduled on different nodes?
Hey, you could configure inter-pod affinity/anti-affinity constraints.
You can constrain a Pod to only be able to run on particular Node(s), or to prefer to run on particular nodes. There are several ways to do this, and the recommended approaches all use label selectors to make the selection. Generally such constraints are unnecessary, as the scheduler will automatically do a reasonable placement.
perfect, thanks
something like that:
spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100  # weight is required for each "preferred" term
          podAffinityTerm:
            labelSelector:
              matchLabels:
                app: api-private-web
            topologyKey: kubernetes.io/hostname
And also, you could apply the descheduler to force rescheduling of your pods when the pod anti-affinity constraint is violated.
Descheduler for Kubernetes - kubernetes-sigs/descheduler
Don’t forget to set proper PodDisruptionBudget
https://kubernetes.io/docs/tasks/run-application/configure-pdb/
FEATURE STATE: Kubernetes v1.5 [beta]. This page shows how to limit the number of concurrent disruptions that your application experiences, allowing for higher availability while permitting the cluster administrator to manage the cluster’s nodes.
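For reference, a minimal sketch of a PDB matching the anti-affinity example above (the name and budget are assumptions to adapt):
apiVersion: policy/v1beta1   # policy/v1 from Kubernetes 1.21 onward
kind: PodDisruptionBudget
metadata:
  name: api-private-web-pdb  # hypothetical name
spec:
  minAvailable: 1            # keep at least one of the two replicas up during voluntary disruptions
  selector:
    matchLabels:
      app: api-private-web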
Hi, I’m using this great module https://github.com/cloudposse/terraform-aws-eks-workers and would like to change the root volume size, is that possible? I haven’t been able to figure out how so far.
Terraform module to provision an AWS AutoScaling Group, IAM Role, and Security Group for EKS Workers - cloudposse/terraform-aws-eks-workers
Yes. See the ASG module that EKS uses for block mappings. I’ll be able to post a code snippet soon when back in front of laptop.
# some applications need scratch disk...
block_device_mappings = [{
  device_name  = "/dev/xvda"
  no_device    = false
  virtual_name = "ephemeral0"
  ebs = {
    delete_on_termination = true
    encrypted             = true
    volume_size           = 100
    volume_type           = "gp2"
    iops                  = null
    kms_key_id            = null
    snapshot_id           = null
  }
}]
I saw that block, but the comment says it’s for an EBS volume aside from the root volume
I’ll give it a try, thanks
yea, confirmed in our jenkins setup
locals {
  root_device = {
    device_name  = "/dev/xvda"
    no_device    = "false"
    virtual_name = "root"
    ebs = {
      encrypted             = true
      volume_size           = var.jenkins_disk_size
      delete_on_termination = true
      iops                  = null
      kms_key_id            = null
      snapshot_id           = null
      volume_type           = "gp2"
    }
  }
...
block_device_mappings = [local.root_device]
awesome, thanks dudes
2020-09-15
hey who has information about this? Google Kubernetes Engine vulnerability (CVE-2020-14386)
Hi, I’m having an issue with kubectl commands where I’m unable to get a response while running them. kubectl get nodes returns: Error from the server (Timeout): the server was unable to return a response in the time allotted, but may still be processing the request (get nodes).
kubectl top nodes returns: Error from the server (ServiceUnavailable): the server is currently unable to handle the request. I have two master nodes and 3 worker nodes with one API server.
Can anyone help resolve this issue, please? And this is not in the cloud.
Have you ruled out a simple networking/firewall issue?
Yes, I have
Hey @Mithra, were you able to solve the issue? Have you confirmed the kubectl config is set up for the cluster you are connecting to? Is the cluster public or private? If the latter, has the network you are connecting from been granted access to manage the nodes?
• If you have the solution, it would also be good to share what the root cause was
Hi @jason einon, yes, it was solved with our config file. The servers are managed by Chef Manage; we ran a check-in with chef-client (configuration), and now the master server and worker nodes are up and running and kubectl commands are working
2020-09-17
Could anyone here with experience with Rancher and Kiosk tell me whether they are compatible with each other? Are they competing products in any way?
And any feedback from Loft users?
I’m looking for a multi-tenant cluster management tool that can help us migrate to K8s from ECS and make things easier for everyone
I’m a big fan of Rancher
Does it do similar stuff to kiosk, or is it better to say that Rancher is a cluster management tool and kiosk is a multi-tenancy facilitator?
I’ve never used kiosk, but Rancher can manage clusters and do user management as well. If you are looking to multi-tenant a single cluster, kiosk might be a better fit since it looks like that is its primary purpose. If you are looking to centrally manage one cluster per tenant (which IMO is the better strategy) then Rancher is great for that.
Rancher can do it by mapping users to Projects (which is Rancher’s name for a set of namespaces grouped together), and I have done that in the past
but I’m still skittish about doing true multi-tenant on a single cluster
I was looking at Loft, which has this concept of a virtual cluster
I need to read more about it
the multi-tenancy is mostly for POCs by different teams
but multi-cluster will be the idea
I would like to have some sort of pipeline where Rancher can basically call Atlantis and run a pipeline to create a cluster using Terraform
hopefully there will be little to no AWS CLI/console work left for developers to do
2020-09-18
2020-09-22
quick question: if I patch a manifest and after a while apply the original manifest again, will it remove my patches?
it will
cool thanks
Hello, does anyone know of an ingress controller that can be used with an AWS target group? We have an ALB which sends traffic to EC2 instances, and for only one service we want to send traffic from that ALB to a Kubernetes pod. For that we need one target group which can route traffic from the ALB to the Kubernetes pod… This target group should be updated once the pod recycles…
aws-alb-ingress-controller ?
I only need a target group; the ALB will remain the old one. Just the one /xyz route goes to Kubernetes, all other routes go to the EC2 servers… I think alb-ingress-controller also spawns its own ALB…
ahh you’re right
you could set up the ingress controller service using NodePort
and after you create a target group pointing at that NodePort, you could create the routing rule in the ALB for /xyz to your new target group
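A minimal sketch of that NodePort service for the ingress controller (name, label, and nodePort are placeholders; the ALB target group would then target this nodePort on the worker nodes):
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-controller              # hypothetical name
  namespace: ingress-nginx
spec:
  type: NodePort
  selector:
    app.kubernetes.io/name: ingress-nginx     # assumed controller pod label
  ports:
    - name: http
      port: 80
      targetPort: 80
      nodePort: 30080                         # ALB target group targets this port on the nodes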
thanks to all of you
2020-09-23
Hey guys, does anyone know why a Kubernetes cluster wouldn’t be creating any pods whatsoever?
Even for new deployments, deleted pods with replicasets, it’s simply stopped creating new pods.
Yeah I posted the cause of the problem, it was a bad master @Emmanuel Gelati
Resolved, was a bad master
2020-09-24
Hello I’m looking for a solution which allows users to use a self-service catalog to deploy (using Helm Charts) a web app.
GitHub PR =P
I’m trying to implement it for engineers and end users who just want to click a button.
one other option is Lens
one-click helm deployment
Yes, I’ve used Lens. I will need to set up a repo for Helm charts hosted on-prem and provide the option to deploy the app for QA testing or as a review app
Kubermatic, a German-based startup, introduced an open-source tool called KubeCarrier that automates lifecycle management of services, applications, and hardware using Kubernetes Operators. The goal of the tool is to provide scalability and repeatability to meet the organization’s requirements.
One such solution I’ve come across has been KubeApps - https://kubeapps.com/
looks nice! hadn’t seen it before
Nice!
2020-09-25
Anyone noticed that the default security group for the eks module doesn’t seem to be applied to anything? https://github.ada.com/infrastructure/infra-common-aws/blob/master/terraform-aws-eks-cluster/main.tf#L66-L67
@Jan if you are talking about this SG https://github.com/cloudposse/terraform-aws-eks-cluster/blob/master/sg.tf#L1
Terraform module for provisioning an EKS cluster - cloudposse/terraform-aws-eks-cluster
yes, for managed node groups and fargate profiles, it’s not in use
because
Managed Node Groups do not expose nor accept any Security Groups.
Instead, EKS creates a Security Group and applies it to ENI that is attached to EKS Control Plane master nodes and to any managed workloads.
but it’s in use when we use unmanaged workers, in which case EKS does not do anything to connect them, and we connect EKS control plane to the worker nodes via that default SG
since the EKS cluster module is used for both cases, the SG is there
with managed Node Groups, the below SG is used to allow access to other resources (e.g. Aurora, EFS, etc.) instead of the default SG (which is just created but not used for managed nodes):
output "eks_cluster_managed_security_group_id" {
  description = "Security Group ID that was created by EKS for the cluster. EKS creates a Security Group and applies it to ENI that is attached to EKS Control Plane master nodes and to any managed workloads"
  value       = join("", aws_eks_cluster.default.*.vpc_config.0.cluster_security_group_id)
}
ah right, makes a lot of sense thanks dude
maybe I have misunderstood it
Cannot view
2020-09-26
oh lol, I linked to a private repo
2020-09-28
2020-09-29
hello all, I have a couple unrelated (hopefully easy) questions:
- I am looking at implementing a StatefulSet for an application, mostly to satisfy an ordered startup requirement of the application; I notice that StatefulSets require a headless service which does not do any sort of load balancing for the pods (looks like it’s just for DNS name assignment); can I also place a ClusterIP service in front of the StatefulSet pods and achieve the same thing one would with a Deployment? (see the sketch after this list)
- I deploy an application that has an excessively long startup time, such that the liveness probe for the app has an initial timeout of 3 minutes + the actual failure for the liveness probe is another 3 minutes (meaning a startup failure takes 6 minutes to be detected). If we end up in a startup-failure-related crash loop, we never seem to trip the CrashLoopBackOff behavior and end up restarting perpetually until someone intervenes:
• to date, startup probes haven’t been available to us on our managed cluster, but I believe they will be soon - would implementing a startup probe change this behavior?
• is there anything else I might be able to do to affect this so we do indeed trip the backoff behavior if this happens (I believe I’ve read that backoff timers are not configurable today)?
• this application also has a prestop hook that takes several minutes to complete - does this have any impact on the backoff behavior (I potentially have some flexibility here to change things)?
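Regarding the first question: yes, a regular ClusterIP Service can select the same pods as the StatefulSet’s headless Service. A minimal sketch, with placeholder names, labels, and ports:
# Headless Service required by the StatefulSet (stable per-pod DNS, no load balancing)
apiVersion: v1
kind: Service
metadata:
  name: my-app-headless   # hypothetical name
spec:
  clusterIP: None
  selector:
    app: my-app
  ports:
    - port: 8080
---
# Regular ClusterIP Service selecting the same pods, load-balanced just like for a Deployment
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app
  ports:
    - port: 8080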
2020-09-30
Possible #office-hours question — How are folks managing service env var configuration in K8s with Helm?
The client project I am on at the moment had a pattern in place when I had joined on:
- Raw env variables in values.yaml
- A values.yaml map of env var names to a single cluster wide ConfigMap
- A values.yaml map of env var names to a single cluster-wide Secret
The ConfigMap + Secret mentioned are created by Terraform when the cluster is initially spun up and populated with various config from tf remote state and similar. The above ends up looking like the following in each Chart’s values.yaml:
secretMapping:
  RABBIT_PASSWORD: rabbit_pass # rabbit_pass key in shared Secret
  # ...
configMapping:
  SOME_ENV_VAR_NAME: some_configmap_name # same as above but in shared ConfigMap
  # ...
env:
  RAW_ENV_VAR: "Value"
  # ...
Then when supplying environment to any container in the Charts, we use a shared helper to mash the 3 together with valueFrom.configMapKeyRef, valueFrom.secretKeyRef, and plain name/value pairs from env. This works of course, but it’s a lot of mapping this to that, and there is no single source of truth for values (split between the Terraform-driven Secret/ConfigMap and the values.yaml files in each Chart, of which there are 20 right now).
I’m considering throwing most of this away and creating a ConfigMap + Secret per Chart/Service via Terraform. Then a shared helper could just iterate over the service in question’s ConfigMap and Secret without any raw values in the Chart. Thus creating a single source of truth and hopefully saving microservice configuration headaches.
Wondering if that sounds like a decent pattern or if there are other, more mainstream approaches to this.
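A minimal sketch of what the per-service variant could look like in a chart’s pod template, assuming Terraform creates a ConfigMap and Secret named after the release (the naming convention is an assumption):
# deployment.yaml pod template excerpt
containers:
  - name: {{ .Chart.Name }}
    image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
    envFrom:
      - configMapRef:
          name: {{ .Release.Name }}-config    # per-service ConfigMap created by Terraform
      - secretRef:
          name: {{ .Release.Name }}-secrets   # per-service Secret created by Terraform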
@Erik Osterman (Cloud Posse) I won’t be able to make it today so if you want to hold this one for next week I won’t be upset
No worries! yes, we ran out of time - let’s discuss next week
@Erik Osterman (Cloud Posse) Did this one come up in this past week’s office hours? Will listen to the recording if so, but wanted to check.
not yet - we had a surprise visit by @Andriy Knysh (Cloud Posse) and talked about [serverless.tf](http://serverless.tf) instead
oh haha
yes!