#kubernetes (2021-09)
Archive: https://archive.sweetops.com/kubernetes/
2021-09-01
2021-09-02
Hi guys, I am looking for a “user friendly” solution to manage multiple clusters for a customer. In the end I’m between Rancher and Kubesphere; has anyone here used either of these solutions in production? They are using EKS (AWS). Thanks
Probably you’ll find this video interesting. It’s Viktor Farcic’s review of Kubesphere
2021-09-03
Hi People, anyone ever had this issue with the AWS ALB Ingress controller:
failed to build LoadBalancer configuration due to failed to resolve 2 qualified subnet with at least 8 free IP Addresses for ALB. Subnets must contains these tags: 'kubernetes.io/cluster/my-cluster-name': ['shared' or 'owned'] and 'kubernetes.io/role/elb': ['' or '1']. See https://kubernetes-sigs.github.io/aws-alb-ingress-controller/guide/controller/config/#subnet-auto-discovery for more details.
So there are three subnets with the appropriate tagging and plenty of free IPs, but I could not yet find the reason why it is complaining about the subnets
Perhaps there aren’t enough free IPs in those tagged subnets?
there are only a few nodes running, and there are thousands of IPs. I added another tag on the subnets so they look like this now:
kubernetes.io/cluster/my-cluster-name shared
kubernetes.io/role/internal-elb 1
kubernetes.io/role/elb
now the AWS ALB Ingress controller starts successfully and registers the targets in the target group, but all my requests to any application in the cluster are timing out
Sounds like the first problem was solved. Nice job!
New problem seems like a misconfiguration in the pod that utilizes this new controller?
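For anyone who hits the original subnet-discovery error later, a minimal sketch of the tagging the controller’s auto-discovery expects (subnet IDs hypothetical): public subnets for internet-facing ALBs carry kubernetes.io/role/elb, private subnets for internal ALBs carry kubernetes.io/role/internal-elb:
aws ec2 create-tags --resources subnet-0abc123 \
  --tags Key=kubernetes.io/cluster/my-cluster-name,Value=shared \
         Key=kubernetes.io/role/elb,Value=1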
2021-09-05
2021-09-07
:helm: New to k8s and helm.
Need to define multiple pieces of my internal app, some based on public helm charts, others just internal containers.
I started with kompose
and converted Docker Compose files to give me a head start on what might be contained in the k8s yaml schema, but it’s not clear if I need to create my own helm charts or not. Since I’m not going to reuse these pieces in other projects, I’m assuming I don’t need helm charts.
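For context, the kompose step is a one-liner; a sketch assuming default file names:
kompose convert -f docker-compose.yml -o k8s/   # writes the generated k8s yaml into k8s/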
If I draw some similarity to Terraform… would a helm chart be like a terraform module, and the k8s schema yaml be similar to a “root module”? If that parallel applies, then I’d only worry about helm charts when consuming a prebuilt resource or trying to reuse pieces in different places in the company. If it’s a standalone root application definition, I’m assuming I’ll just do this without helm.
How far off am I? #k8newbie
Update: I am reading more on this and see that there are benefits for internal use too, since it allows reusing the same deployment with an easier templating approach.
Example
helm install myapp ./chart --set env=dev --set replicaCount=1
with fewer templating configs required, as it would allow me to set my templating values dynamically. I’m guessing kubectl has this with the overrides file, but it’s perhaps a bit less flexible or easy to maintain long term.
Rollbacks also seem to be really smooth, though again I’m guessing kubectl has similar features just by referencing prior source.
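For reference, the rollback flow on the helm side is just (release name hypothetical):
helm history myapp     # list the release's revisions
helm rollback myapp 2  # roll back to revision 2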
Another pro is that you can go beyond the schema of the app and also handle application-level configuration. My guess is that that’s where k8s operators would be required to better handle application-level configuration actions.
any quick insight on approaching an internal deployment with helm? A lot to learn, so making sure I don’t focus on the wrong thing as I try to migrate from terraform ECS infra to kubernetes.
cc @Erik Osterman (Cloud Posse) would welcome any quick insight as this is all new to me from you or your team.
yes, a helm chart is a lot like a terraform module in the sense that you bundle up the “complexity” into a package and then expose some inputs as your declarative interface
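To make the parallel concrete, a minimal sketch (chart name hypothetical):
helm create myapp      # scaffolds a chart skeleton
# myapp/Chart.yaml   - chart metadata
# myapp/values.yaml  - default inputs, analogous to terraform variables
# myapp/templates/   - the templated k8s yaml, analogous to module internals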
also, we’ve relatively recently released https://github.com/cloudposse/terraform-aws-helm-release
Create helm release and common AWS resources like an EKS IAM role.
which we’re using to more easily deploy helm releases to EKS
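Under the hood the module wraps the Terraform helm provider’s helm_release resource; a minimal sketch of that underlying resource (names and repo URL hypothetical):
resource "helm_release" "myapp" {
  name       = "myapp"
  repository = "https://charts.example.com" # hypothetical chart repository
  chart      = "myapp"
  namespace  = "apps"

  set {
    name  = "replicaCount"
    value = "2"
  }
}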
So if I’m newer to this and I’m basically dealing with a root module/application deployment and need just env flexibility, but not a lot of other flexibility or sharing… do I still stick to using helm, or stick with k8s yaml instead? Where do I spend the effort?
A lot to learn
Gotta narrow the scope
there are 2 major camps right now: kustomize and helm
I would first master the raw resources, then learn/appreciate the alternative ways to manage them.
then look at tools like ArgoCD/Flux, not necessarily so that you will use them, but to understand how they fit into the picture.
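A sketch of what the kustomize camp looks like in practice (paths hypothetical); kustomize is built into kubectl:
# base/kustomization.yaml lists the raw resources (deployment.yaml, service.yaml)
# overlays/dev/kustomization.yaml references ../../base and patches replicas/env for dev
kubectl apply -k overlays/dev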
will bring up on #office-hours today
thank you. I’ll stick with the native k8s schema then, as I really have to dial in the basics first and then can dive into the others as I go. The fewer abstractions the better right now as I try to prepare the team for such a big jump. Doing my best to resist using Pulumi too, for now
lol, yes, resist the urge until you appreciate the fundamentals and the limitations.
All roads lead to jsonnet. Seems to for me, at least…
Grafana Tanka… is pretty awesome to be honest.
Reading https://github.com/cloudposse/terraform-aws-eks-node-group/blob/780163dacd9c892b64b988077a994f6675d8f56d/MIGRATION.md to be able to jump to the 0.25.0 version of the module (it had a recent overhaul).
Seems like remote_access_enabled was removed from the module, but that’s not documented in the migration guide to 0.25.0 …
Join us for a hands-on lab to implement Argo CD with ApplicationSets, the new way of bootstrapping your cluster in Kubernetes. Friday 8:30 AEST | Thursday 3:30 https://community.cncf.io/events/details/cncf-cloud-native-dojo-presents-hands-on-lab-getting-started-with-argocd/
I’m attending CNCF Cloud Native Dojo w/ Hands on Lab - Getting started with ArgoCD on Sep 10, 2021
2021-09-08
2021-09-10
Hi People, wanted to ask about experiences upgrading Kubernetes EKS versions. I recently did an upgrade from 1.19 to 1.20. After the upgrade some of my workloads are experiencing weird high CPU spikes. But correlation does not equal causation, so I wanted to ask if anyone here has experienced something similar.
2021-09-13
Hello all, can anyone help with Azure Kubernetes Service please? What if a namespace is accidentally deleted, is there any recovery process (disaster recovery)? Any inputs from the team please. ~ Thanks, much appreciated.
AFAIK there’s no way to revert this out of the box. What pipeline did you use to get the yamls into that namespace? Usually the expectation is that the pipeline is easily repeatable; that’s why there isn’t much talk about recoveries.
If you’re still going to approach the problem from the backup/recovery side, there are a couple of cloud-generic projects to achieve what you want:
pieterlange/kube-backup: Kubernetes resource state sync to git
use gitops and problem solved
2021-09-14
2021-09-18
Hey there!
I’m trying to go further with my multi-tenant cluster and want to show my teams only their namespaces. I did not find a way to reduce the number of namespaces shown when I do a k get ns. Any idea how I can get this done?
Author: Adrian Ludwin (Google) Safely hosting large numbers of users on a single Kubernetes cluster has always been a troublesome task. One key reason for this is that different organizations use Kubernetes in different ways, and so no one tenancy model is likely to suit everyone. Instead, Kubernetes offers you building blocks to create your own tenancy solution, such as Role Based Access Control (RBAC) and NetworkPolicies; the better these building blocks, the easier it is to safely build a multitenant cluster.
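One caveat: namespaces are cluster-scoped, so RBAC can’t filter what kubectl get ns returns per user; you can only deny listing namespaces entirely and instead grant access inside each team’s namespace. A hedged sketch using kubectl’s imperative commands (role, namespace, and group names hypothetical):
kubectl create role team-a-dev --verb=get,list,watch \
  --resource=pods,deployments,services -n team-a
kubectl create rolebinding team-a-dev --role=team-a-dev \
  --group=team-a -n team-a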
2021-09-19
2021-09-24
Hello all,
We are using Kubernetes in AWS, deployed using kops. We are using Nginx as our ingress controller; it was working fine for almost 2 years, but recently we started getting 502 Bad Gateway errors in multiple pods at random.
ingress log shows 502
[23/Sep/2021:10:53:43 +0000] "GET /service HTTP/2.0" 502 559 "https://mydomain/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.82 Safari/537.36" 4691 0.040 [default-myservice-80] 100.96.13.157:80, 100.96.13.157:80, 100.96.13.157:80 0, 0, 0 0.000, 0.000, 0.000 502, 502, 502 258a09eaaddef85cae2a0c2f706ce06b
..
[error] 1050#1050: *1352377 connect() failed (111: Connection refused) while connecting to upstream, client: CLIENT_IP_HERE, server: my.domain.com, request: "GET /index.html HTTP/2.0", upstream: "http://POD_IP:8080/index.html", host: "my.domain.com", referrer: "https://my.domain/index.html"
We tried connecting from the ingress pod to the pod IP that returned 502
www-data@nginx-ingress-controller-664f488479-7cp57:/etc/nginx$ curl 100.96.13.157
curl: (7) Failed to connect to 100.96.13.157 port 80: Connection refused
it showed connection refused
We monitored tcpdump traffic from the node where the pod gave 502
root@node-ip:/home/admin# tcpdump -i cbr0 dst 100.96.13.157
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on cbr0, link-type EN10MB (Ethernet), capture size 262144 bytes
17:39:16.779950 ARP, Request who-has 100.96.13.157 tell 100.96.13.22, length 28
17:39:16.780207 IP 100.96.13.22.57610 > 100.96.13.157.http: Flags [S], seq 2263585697, win 26883, options [mss 8961,sackOK,TS val 1581767928 ecr 0,nop,wscale 9], length 0
17:39:21.932839 ARP, Reply 100.96.13.22 is-at 0a:58:64:60:0d:16 (oui Unknown), length 28
root@node-ip:/home/admin# ping 100.96.13.157
PING 100.96.13.157 (100.96.13.157) 56(84) bytes of data.
64 bytes from 100.96.13.157: icmp_seq=1 ttl=64 time=0.309 ms
64 bytes from 100.96.13.157: icmp_seq=2 ttl=64 time=0.042 ms
64 bytes from 100.96.13.157: icmp_seq=3 ttl=64 time=0.044 ms
it looks like pods can reach each other, and ping is working,
root@node-ip:/home/admin# tcpdump -i cbr0 src 100.96.13.157
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on cbr0, link-type EN10MB (Ethernet), capture size 262144 bytes
17:39:16.780076 ARP, Reply 100.96.13.157 is-at 0a:58:64:60:0d:9d (oui Unknown), length 28
17:39:16.780175 ARP, Reply 100.96.13.157 is-at 0a:58:64:60:0d:9d (oui Unknown), length 28
17:39:16.780238 IP 100.96.13.157.http > 100.96.13.22.57610: Flags [R.], seq 0, ack 2263585698, win 0, length 0
17:39:21.932808 ARP, Request who-has 100.96.13.22 tell 100.96.13.157, length 28
Here the ingress is sending the request but it’s being reset (flag [R.] = RST-ACK in the tcpdump), and the HTTP request is lost.
We don’t know where this connection is getting lost. We checked our service and pod labels; everything is configured properly. Also, most of the time my.domain.com is accessible and the issue looks intermittent. Is there any other place we need to check for logs, or has anyone experienced the same issue?
Thanks in advance
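Since the TCP RST says nothing was listening on port 80 at that instant, a few hedged checks worth running (service/pod names hypothetical); stale endpoints, a restarting container, or a readiness probe on the wrong port would all fit the intermittent pattern:
kubectl get endpoints myservice -o wide            # do the endpoint IPs match ready pods?
kubectl get pods -o wide | grep 100.96.13.157      # which pod owns the failing IP?
kubectl describe pod <pod-name> | grep -i -A3 readiness   # probe port vs container port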
Wondering if anyone knows how I can let pods spun up by Jenkins on EKS assume a role at the pod level, so that I can give that role a cross-account trust to another AWS account B (where that role will have access to account B’s ECR to pull its images)
I don’t want to use service accounts, because my setup is such that one Jenkins serves multiple projects, and creating a service account at the node level is not something I want
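Without service accounts (IRSA is tied to them), the usual alternative at the time was kube2iam or kiam, which assign a role per pod via an annotation; note that image pulls themselves are done by the kubelet under the node role, so this helps in-pod AWS calls rather than the pull itself. A hedged pod-spec fragment (role ARN hypothetical, assumes kube2iam/kiam is deployed):
metadata:
  annotations:
    iam.amazonaws.com/role: arn:aws:iam::<ACCOUNT_B_ID>:role/ecr-pull  # hypothetical role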
Hello friends!
I am chasing down how to configure the clusterDNS: <VALUE> setting for a Kubernetes node-group that is deployed using https://github.com/cloudposse/terraform-aws-eks-node-group/
Particularly this blob: https://github.com/weaveworks/eksctl/pull/550#issuecomment-464623865
TL;DR: This is the smallest change to allow enabling node-local DNS cache on eksctl-created nodes. What: Add a new field named clusterDNS that accepts the IP address of the DNS server used for all t…
Anyone who might know this?
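If the end goal is node-local DNS cache, the setting ultimately lands as a kubelet flag; a hedged sketch of the standard EKS bootstrap invocation in node userdata (cluster name assumed; 169.254.20.10 is the conventional node-local-dns address):
/etc/eks/bootstrap.sh my-cluster-name \
  --kubelet-extra-args '--cluster-dns=169.254.20.10'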
2021-09-27
Hi there! We are using the cloudposse module for node-groups on Kubernetes EKS 1.21. We started noticing that a few hours after provisioning node-groups, as well as the corresponding worker IAM roles, these three AWS managed IAM policies start disappearing from the IAM roles:
AmazonEKSWorkerNodePolicy
AmazonEC2ContainerRegistryReadOnly
AmazonEKS_CNI_Policy
Wonder if anyone has noticed similar behavior?
Specifically we are using 0.25.0 version of this module: https://github.com/cloudposse/terraform-aws-eks-node-group/tree/0.25.0
We are using the create_before_destroy flag set to true …
2021-09-28
It turned out to be our ignore_tags configuration of the AWS provider that was triggering some unexpected effects on the node-group resources, including the IAM roles for worker nodes.
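For anyone hitting the same thing, the provider configuration in question looks roughly like this (prefix hypothetical); tags matched here are excluded from Terraform’s view, which can ripple into resources the module manages:
provider "aws" {
  ignore_tags {
    key_prefixes = ["kubernetes.io/"]
  }
}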
2021-09-29
Morning everyone! This is the 2nd part of my K8S blog post: “Implementing Kubernetes: The Hidden Part of the Iceberg”. I hope you enjoy it! https://medium.com/gumgum-tech/implementing-kubernetes-the-hidden-part-of-the-iceberg-part-2-d76d21759de0
Kubernetes Scheduling, Resource Management and Autoscaling.
nice write up!