#kubernetes (2021-09)

kubernetes

Archive: https://archive.sweetops.com/kubernetes/

2021-09-01

2021-09-02

Carmelo avatar
Carmelo

Hi guys, I am looking for a “user friendly” solution to manage multiple clusters for a customer. In the end I’m deciding between Rancher and Kubesphere. Has anyone here used either of these solutions in production? They are using EKS (AWS). Thanks

Andrew Nazarov avatar
Andrew Nazarov

Probably you’ll find this video interesting. It’s Victor Farcic’s review of Kubesphere:

https://www.youtube.com/watch?v=1OOLeCVWTXE

2021-09-03

Gabriel avatar
Gabriel

Hi People, anyone ever had this issue with the AWS ALB Ingress controller:

failed to build LoadBalancer configuration due to failed to resolve 2 qualified subnet with at least 8 free IP Addresses for ALB. Subnets must contains these tags: 'kubernetes.io/cluster/my-cluster-name': ['shared' or 'owned'] and 'kubernetes.io/role/elb': ['' or '1']. See <https://kubernetes-sigs.github.io/aws-alb-ingress-controller/guide/controller/config/#subnet-auto-discovery> for more details.

So there are three subnets with the appropriate tagging and plenty of IPs. I could not yet find the reason why it is complaining about the subnets.

RB avatar

Perhaps there aren’t enough free IPs in those tagged subnets?

Gabriel avatar
Gabriel

There are only a few nodes running, and there are thousands of IPs. I added another tag on the subnets, so they look like this now:

kubernetes.io/cluster/my-cluster-name shared
kubernetes.io/role/internal-elb 1
kubernetes.io/role/elb

Now the AWS ALB Ingress controller starts successfully and registers the targets in the target group, but all my requests to any application in the cluster are timing out.

RB avatar

Sounds like the first problem was solved. Nice job!

New problem seems like a misconfiguration in the pod that utilizes this new controller?

Gabriel avatar
Gabriel

The kubernetes.io/role/elb tag was missing on the public subnets
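
For reference, a minimal sketch of tagging the public subnets for ALB subnet auto-discovery with the AWS CLI; the subnet IDs and cluster name here are placeholders:

# Tag the public subnets so the ALB ingress controller can discover them
aws ec2 create-tags \
  --resources subnet-aaa111 subnet-bbb222 \
  --tags 'Key=kubernetes.io/cluster/my-cluster-name,Value=shared' \
         'Key=kubernetes.io/role/elb,Value=1'

The private subnets would carry kubernetes.io/role/internal-elb instead, as in the listing above.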


2021-09-05

2021-09-07

sheldonh avatar
sheldonh

:helm: New to k8s and Helm.

Need to define multiple pieces of my internal app, some based on public helm charts, others just internal containers.

I started with kompose and converted Docker Compose files to give me a head start on what might be contained in k8s YAML schema, but I’m not clear if I need to create my own Helm charts or not. Since I’m not going to reuse these pieces in other projects, I’m assuming I don’t need Helm charts.

If I draw some similarity to Terraform… would a Helm chart be like a Terraform module, and the k8s schema YAML be similar to a “root module”? If that parallel applies, then I’d only worry about Helm charts when consuming a prebuilt resource or trying to reuse something in different places in the company. If it’s a standalone root application definition, I’m assuming I’ll just do this without Helm.

How far off am I? #k8newbie

sheldonh avatar
sheldonh

Update: I am reading more on this and see that there are benefits for internal use too, since it allows reusing the same deployment with an easier templating approach.

Example

helm install my-release ./my-chart --set env=dev --set replicaCount=1

with fewer templating configs required, as it would allow me to set my templating values dynamically. I’m guessing kubectl has this with an overrides file, but it’s perhaps a bit less flexible or harder to use long term.

sheldonh avatar
sheldonh

Rollbacks also seem to be really smooth, though again I’m guessing kubectl has similar features just by referencing the prior source.

sheldonh avatar
sheldonh

Another pro is that you can go beyond the schema of the app and also handle application-level configuration. My guess is that that’s where k8s operators would be required to better handle application-level configuration actions.

sheldonh avatar
sheldonh

Any quick insight on approaching an internal deployment with Helm? There’s a lot to learn, so I’m making sure I don’t focus on the wrong thing as I try to migrate from Terraform/ECS infra to Kubernetes.

cc @Erik Osterman (Cloud Posse): would welcome any quick insight from you or your team, as this is all new to me.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

yes, a helm chart is a lot like a terraform module in the sense that you bundle up the “complexity” into a package and then expose some inputs as your declarative interface
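
A minimal sketch of that analogy in practice, assuming the bitnami chart repo has been added locally (the chart and values below are only illustrative):

# A chart's values are its "input variables"; inspect them, then override at install time
helm show values bitnami/nginx
helm install web bitnami/nginx --set replicaCount=2 --set service.type=ClusterIP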

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

also, we’ve relatively recently released https://github.com/cloudposse/terraform-aws-helm-release

GitHub - cloudposse/terraform-aws-helm-release: Create helm release and common aws resources like an eks iam role

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

which we’re using to more easily deploy helm releases to EKS

sheldonh avatar
sheldonh

So if I’m newer to this and I’m basically dealing with a root module/application deployment and need just env flexibility, but not a lot of other flexibility or sharing… do I still use Helm, or stick with k8s YAML instead? Where do I spend the effort?

sheldonh avatar
sheldonh

A lot to learn

sheldonh avatar
sheldonh

Gotta narrow the scope

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

there are 2 major camps right now: kustomize and helm

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

I would first master the raw resources, then learn/appreciate the alternative ways to manage them.
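
To make the two camps concrete, a minimal sketch of the same app managed three ways; the paths and release names are illustrative:

# 1. Raw resources: apply the manifests directly
kubectl apply -f k8s/deployment.yaml -f k8s/service.yaml

# 2. Kustomize: same base manifests, patched per environment with overlays
kubectl apply -k overlays/dev

# 3. Helm: the manifests become chart templates, parameterized by values
helm upgrade --install myapp ./charts/myapp -f values-dev.yaml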

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

then look at tools like ArgoCD/Flux - not that you will necessarily use them, but understand how they fit into the picture.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

will bring up on #office-hours today

sheldonh avatar
sheldonh

Thank you. I’ll stick with native k8s schema then, as I really have to dial in the basics first and can dive into the others as I go. The fewer abstractions the better right now as I try to prepare the team for such a big jump. Doing my best to resist using Pulumi too, for now.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

lol, yes, resist the urge until you appreciate the fundamentals and the limitations.

mfridh avatar

All roads lead to jsonnet. Seems that way for me at least…

mfridh avatar

Grafana Tanka… is pretty awesome to be honest.

azec avatar

Reading https://github.com/cloudposse/terraform-aws-eks-node-group/blob/780163dacd9c892b64b988077a994f6675d8f56d/MIGRATION.md to be able to jump to the module’s 0.25.0 version (it had a recent overhaul).

Seems like remote_access_enabled was removed from the module, but not documented in the migration guide to 0.25.0 …

terraform-aws-eks-node-group/MIGRATION.md at 780163dacd9c892b64b988077a994f6675d8f56d · cloudposse/terraform-aws-eks-node-group

Brad McCoy avatar
Brad McCoy

Join us for a hands-on lab to implement Argo CD with ApplicationSets, the new way of bootstrapping your cluster in Kubernetes. Friday 8:30 AEST / Thursday 3:30. https://community.cncf.io/events/details/cncf-cloud-native-dojo-presents-hands-on-lab-getting-started-with-argocd/

Hands on Lab - Getting started with ArgoCD | CNCF

I’m attending CNCF Cloud Native Dojo w/ Hands on Lab - Getting started with ArgoCD on Sep 10, 2021


2021-09-08

2021-09-10

Gabriel avatar
Gabriel

Hi people, wanted to ask about experiences upgrading Kubernetes EKS versions. I recently did an upgrade from 1.19 to 1.20. After the upgrade some of my workloads are experiencing weird high CPU spikes. But correlation does not equal causation, so I wanted to ask if anyone here has experienced something similar.


2021-09-13

Mithra avatar

Hello all, can anyone help with Azure Kubernetes Service please? What if a namespace is accidentally deleted, is there any recovery process (disaster recovery)? Any inputs from the team please. ~ Thanks, much appreciated.

Max Lobur (Cloud Posse) avatar
Max Lobur (Cloud Posse)

AFAIK there’s no way to revert this out of the box. What pipeline did you use to get the YAMLs into that namespace? Usually the expectation is that the pipeline is easily repeatable, which is why there isn’t much talk about recoveries.

Max Lobur (Cloud Posse) avatar
Max Lobur (Cloud Posse)

If you’re still going to approach the problem from the backup/recovery side, there are a couple of cloud-generic projects to achieve what you want:

https://velero.io/

https://github.com/pieterlange/kube-backup

GitHub - pieterlange/kube-backup: Kubernetes resource state sync to git
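
For the Velero option above, a minimal sketch of backing up and restoring a single namespace, assuming Velero is already installed with a backup storage location configured (the backup and namespace names are placeholders):

# Back up just the one namespace
velero backup create my-ns-backup --include-namespaces my-namespace

# If the namespace is later deleted, restore it from that backup
velero restore create --from-backup my-ns-backup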

Emmanuel Gelati avatar
Emmanuel Gelati

use gitops and problem solved

2021-09-14

2021-09-18

zadkiel avatar
zadkiel

Hey there! I’m trying to go further with my multi-tenant cluster and want to show only their own namespaces to my teams. I did not find a way to reduce the number of namespaces shown when I do a k get ns. Any idea how I can get this done?

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)
Introducing Hierarchical Namespaces

Author: Adrian Ludwin (Google) Safely hosting large numbers of users on a single Kubernetes cluster has always been a troublesome task. One key reason for this is that different organizations use Kubernetes in different ways, and so no one tenancy model is likely to suit everyone. Instead, Kubernetes offers you building blocks to create your own tenancy solution, such as Role Based Access Control (RBAC) and NetworkPolicies; the better these building blocks, the easier it is to safely build a multitenant cluster.
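
A minimal sketch of what a hierarchy looks like with the HNC kubectl plugin, assuming HNC and the kubectl-hns plugin are installed (namespace names are illustrative):

# Create child namespaces under a team's root namespace
kubectl hns create team-a-dev -n team-a
kubectl hns create team-a-prod -n team-a

# View the resulting hierarchy
kubectl hns tree team-a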

2021-09-19

2021-09-24

Shreyank Sharma avatar
Shreyank Sharma

Hello all,

We are using Kubernetes on AWS, deployed using kops. We are using Nginx as our ingress controller; it was working fine for almost 2 years, but recently we started getting 502 Bad Gateway issues on multiple pods randomly.

ingress log shows 502

[23/Sep/2021:10:53:43 +0000] "GET /service HTTP/2.0" 502 559 "<https://mydomain/>" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.82 Safari/537.36" 4691 0.040 [default-myservice-80] 100.96.13.157:80, 100.96.13.157:80, 100.96.13.157:80 0, 0, 0 0.000, 0.000, 0.000 502, 502, 502 258a09eaaddef85cae2a0c2f706ce06b
..
[error] 1050#1050: *1352377 connect() failed (111: Connection refused) while connecting to upstream, client: CLIENT_IP_HERE , server: my.domain.com , request: "GET /index.html HTTP/2.0", upstream: "<http://POD_IP:8080/index.html>", host: "my.domain.com", referrer: "<https://my.domain/index.html>"

We tried connecting to the pod IP which gave 502, from the ingress pod:

www-data@nginx-ingress-controller-664f488479-7cp57:/etc/nginx$ curl 100.96.13.157
curl: (7) Failed to connect to 100.96.13.157 port 80: Connection refused

It showed connection refused.

We monitored tcpdump traffic from the node where the pod gave 502

root@node-ip:/home/admin# tcpdump -i cbr0 dst 100.96.13.157
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on cbr0, link-type EN10MB (Ethernet), capture size 262144 bytes
17:39:16.779950 ARP, Request who-has 100.96.13.157 tell 100.96.13.22, length 28
17:39:16.780207 IP 100.96.13.22.57610 > 100.96.13.157.http: Flags [S], seq 2263585697, win 26883, options [mss 8961,sackOK,TS val 1581767928 ecr 0,nop,wscale 9], length 0
17:39:21.932839 ARP, Reply 100.96.13.22 is-at 0a:58:64:60:0d:16 (oui Unknown), length 28


root@node-ip:/home/admin# ping 100.96.13.157
PING 100.96.13.157 (100.96.13.157) 56(84) bytes of data.
64 bytes from 100.96.13.157: icmp_seq=1 ttl=64 time=0.309 ms
64 bytes from 100.96.13.157: icmp_seq=2 ttl=64 time=0.042 ms
64 bytes from 100.96.13.157: icmp_seq=3 ttl=64 time=0.044 ms

It looks like pods can reach each other, and ping is working.

root@node-ip:/home/admin# tcpdump -i cbr0 src 100.96.13.157
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on cbr0, link-type EN10MB (Ethernet), capture size 262144 bytes
17:39:16.780076 ARP, Reply 100.96.13.157 is-at 0a:58:64:60:0d:9d (oui Unknown), length 28
17:39:16.780175 ARP, Reply 100.96.13.157 is-at 0a:58:64:60:0d:9d (oui Unknown), length 28
17:39:16.780238 IP 100.96.13.157.http > 100.96.13.22.57610: Flags [R.], seq 0, ack 2263585698, win 0, length 0
17:39:21.932808 ARP, Request who-has 100.96.13.22 tell 100.96.13.157, length 28

Here the ingress is sending the request but it’s being reset (flag [R.] = RST-ACK in tcpdump), and the HTTP request is lost.

We don’t know where this connection is getting lost. We checked our service and pod labels; everything is configured properly. Also, most of the time my.domain.com is accessible and the issue looks intermittent. Is there any other place we need to check for logs, or has anyone experienced the same issue?

Thanks in advance
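
A minimal sketch of follow-up checks for this kind of intermittent 502, assuming the upstream is a plain Service in the default namespace (the service name, label, and pod name are placeholders):

# Compare the Service's current endpoints with the upstream IP from the 502 log line
kubectl get endpoints myservice -n default -o wide

# Look for restarts or failing readiness probes on the backing pods
kubectl get pods -n default -l app=myservice
kubectl describe pod <pod-name> -n default | grep -A5 Readiness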

Steffan avatar
Steffan

Wondering if anyone knows how I can let pods spun up by Jenkins on EKS assume a role at the pod level, so that I can give that role a cross-account trust to another AWS account B (where that role will have access to ECR in account B to pull its images).

Steffan avatar
Steffan

I don’t want to use service accounts, because my setup is such that one Jenkins serves multiple projects, and creating a service account at the node level is not something I want.

azec avatar

Hello friends!

azec avatar

I am chasing down how to configure the clusterDNS: <VALUE> setting for a Kubernetes node group that is deployed using https://github.com/cloudposse/terraform-aws-eks-node-group/

Particularly this blob: https://github.com/weaveworks/eksctl/pull/550#issuecomment-464623865

GitHub - cloudposse/terraform-aws-eks-node-group: Terraform module to provision an EKS Node Group

feat: Node-local DNS cache support by mumoshu · Pull Request #550 · weaveworks/eksctl

TL;DR; This is the smallest change to allow enabling node-local DNS cache on eksctl-created nodes. What Add a new field named clusterDNS that accepts the IP address to the DNS server used for all t…

azec avatar

Anyone who might know this?
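
For context, on EKS the kubelet's cluster DNS is typically set via the bootstrap script's kubelet extra args in the node user data. A minimal sketch, where the cluster name is a placeholder and 169.254.20.10 is the conventional node-local DNS cache address (how to thread this through the module's launch template/userdata inputs is a separate question):

#!/bin/bash
# Bootstrap the node and point the kubelet at a custom cluster DNS address
/etc/eks/bootstrap.sh my-cluster \
  --kubelet-extra-args '--cluster-dns=169.254.20.10'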

2021-09-27

azec avatar

Hi there! We are using the cloudposse module for node groups on Kubernetes EKS 1.21. We started noticing that a few hours after provisioning node groups, as well as the corresponding worker IAM roles, these three AWS managed IAM policies start disappearing from the IAM roles:

AmazonEKSWorkerNodePolicy
AmazonEC2ContainerRegistryReadOnly
AmazonEKS_CNI_Policy
azec avatar

Wonder if anyone has noticed similar behavior?
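
A quick way to confirm whether the managed policies are still attached to a worker role (the role name is a placeholder):

aws iam list-attached-role-policies --role-name my-eks-workers-role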

azec avatar

Specifically, we are using version 0.25.0 of this module: https://github.com/cloudposse/terraform-aws-eks-node-group/tree/0.25.0

GitHub - cloudposse/terraform-aws-eks-node-group at 0.25.0

azec avatar

We are using the create_before_destroy flag set to true.

2021-09-28

azec avatar

It turned out to be our ignore_tags configuration of the AWS provider that was triggering some unexpected effects on the node-group resources, including the IAM roles for worker nodes.

2021-09-29

Santiago Campuzano avatar
Santiago Campuzano

Morning everyone! This is the 2nd part of my K8s blog post, “Implementing Kubernetes: The Hidden Part of the Iceberg”. I hope you enjoy it! https://medium.com/gumgum-tech/implementing-kubernetes-the-hidden-part-of-the-iceberg-part-2-d76d21759de0

Implementing Kubernetes: The Hidden Part of the Iceberg — Part 2

Kubernetes Scheduling, Resource Management and Autoscaling.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

nice write up!
