#kubernetes (2021-03)
Archive: https://archive.sweetops.com/kubernetes/
2021-03-01
Hi everyone, I have two online events coming up that will be recorded, on CKAD/CKA study and exam tips. Here are the links for those interested:
I’m attending CNCF Manly w/ Cloud Native Dojo - CKA Study & Exam Tips on Mar 30, 2021
I’m attending CNCF Manly w/ Cloud Native Dojo - CKAD Study & Exam Tips on Mar 9, 2021
2021-03-03
Not exactly a Kubernetes question, but I figured folks in this channel would know whether what I’m talking about exists. Does anyone know if there is a network/TCP proxy tool out there that will do a manage-and-forward pattern (my own made-up term for describing this) for long-lived TCP connections?
I have a client running on K8s, and one of their primary microservices holds long-lived TCP socket connections with many thousands of clients through an AWS NLB. The problem is that whenever we do a deployment and update those pods, the TCP connections require a re-connection, which causes problems on the client side. So to provide a better experience for the clients, we’re looking at what we can do to keep those TCP connections alive. My first thought is a proxy layer that manages the socket connections with the clients and then forwards socket connections to the actual service pods. That way, even if the pods are swapped out behind the scenes, the original socket connection is still up and there are no adverse effects on the clients.
Does the backend service maintain any state? In the event the responsible backend terminates could you send any requests to any other backend?
It would be worth looking at the AWS NLB issues in the Kubernetes GitHub. TL;DR is that you want EKS 1.19 and the new AWS Load Balancer Controller, because otherwise:
- Proxy drain results in 500s; there is no graceful draining of LB targets
- Scale-up of nodes results in 500s, as nodes are healthy on startup, then marked unhealthy until registration completes, and only then pass health checks again. So there are dragons.
Service is stateless. Could send any incoming socket messages to any other backend.
I think ultimately you need solid client reconnection behavior on kubernetes due to all the stuff in the network path and what can affect routing.
Alternatively you could run the proxy on EC2 and forward into kubernetes but you would need to solve for service discovery (pod ip) to route direct and minimize hiccups.
What issue are you referring to? The client is set up with the aws-lb-controller for the NLB creation/management, but we’re on 1.18… is 1.19 out yet?
FWIW I looked at something similar for webrtc sessions and the agones project but ultimately went with EC2.
Sorry, I don’t have the issues off hand; NLB termination etc. should come up in a search. There have been some big ones. :)
Hi All,
we have a 4-node Kubernetes cluster in production on AWS, deployed using kops: 3 worker nodes and one master node, c4.2xlarge with 16 GB memory each.
Along with other pods we have Elasticsearch deployed using Helm: 3 elasticsearch-data pods consuming 4000 MB of memory each, 3 elasticsearch-master pods consuming 2600 MB each, and 3 elasticsearch-client pods consuming 2600 MB each.
They are all distributed among the nodes, but one of the elasticsearch-data pods is restarting 2-3 times daily, always on the same node.
I described the restarted pod, which only says:
Last State: Terminated
Reason: OOMKilled
Exit Code: 137
Started: Tue, 02 Mar 2021 20:31:07 +0530
Finished: Wed, 03 Mar 2021 17:46:02 +0530
There are no events.
When I checked the syslog of the node on which the pod restarted, it shows:
C2 CompilerThre invoked oom-killer: gfp_mask=0x24000c0(GFP_KERNEL), nodemask=0, order=0, oom_score_adj=901
C2 CompilerThre cpuset=6126d0823d683f51d04603c4c6464c030464d3748c916c1a46621936846aac01 mems_allowed=0
CPU: 2 PID: 7743 Comm: C2 CompilerThre Not tainted 4.9.0-9-amd64 #1 Debian 4.9.168-1
Hardware name: Amazon EC2 c5.2xlarge/, BIOS 1.0 10/16/2017
...........
..
.
the version of elasticsearch is 6.7.0
Has anyone experienced the same issue? How can I solve this pod restart problem?
OOMKilled / 137 means Out of Memory. Elasticsearch is super memory hungry, so it likely just needs more memory, or you need to tune your Pod limits/requests.
Fun fact: Exit Code 137
(128 + 9) -> it’s getting a signal 9 (SIGKILL)
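For the limits/requests side, a minimal sketch of what the resources stanza for a data pod could look like (names and numbers here are illustrative, not values from this cluster). The key point is that the JVM heap plus off-heap overhead has to fit under the memory limit, or the kernel OOM-kills the container with exit code 137:
# illustrative pod spec fragment, not the chart's actual values
containers:
  - name: elasticsearch-data
    resources:
      requests:
        memory: "4Gi"      # what the scheduler reserves on the node
      limits:
        memory: "6Gi"      # cgroup hard limit; exceeding it triggers the OOM killer (exit 137)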
2021-03-05
Hi All, under what conditions will a pod exceed its memory limit?
In my Kubernetes cluster I have Elasticsearch deployed using Helm (Elasticsearch version 6.7.0), with 3 elasticsearch-data pods, 2 elasticsearch-master pods, and 1 client.
The memory limit for the elasticsearch-data pods is 4 GB, but one of the data pods is restarted about 5-6 times every day (OOM killed). When I checked the pod memory and CPU usage in Grafana, I can see that one of the elasticsearch-data pods is using twice the memory limit (8 GB).
So I wanted to know under what conditions a pod will exceed its memory limit.
Also in the syslog, when the OOM kill happened:
C2 CompilerThre invoked oom-killer: gfp_mask=0x24000c0(GFP_KERNEL), nodemask=0, order=0, oom_score_adj=901
[28621138.637578] C2 CompilerThre cpuset=441fa5603f64f86888937bc911269fca47dfcdb318648cc1ac0832cdfb07134d mems_allowed=0
[28621138.639850] CPU: 5 PID: 7749 Comm: C2 CompilerThre Not tainted 4.9.0-9-amd64 #1 Debian 4.9.168-1
[28621138.641757] Hardware name: Amazon EC2 c5.2xlarge/, BIOS 1.0 10/16/2017
[28621138.643152] 0000000000000000 ffffffff85335284 ffffa53882de7dd8 ffff8dda11dec040
......
..
.
[28621138.662399] [<ffffffff85615f82>] ? schedule+0x32/0x80
[28621138.663485] [<ffffffff8561bc48>] ? async_page_fault+0x28/0x30
[28621138.669097] memory: usage 4096000kB, limit 4096000kB, failcnt 383494862
Here it shows around 4 GB,
but at the end:
[ pid ] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name
[28621138.876368] [24691] 0 24691 256 1 4 2 0 -998 pause
[28621138.878201] [ 7436] 1000 7436 2141564 989996 2342 11 0 901 java
[28621138.879983] Memory cgroup out of memory: Kill process 7436 (java) score 1870 or sacrifice child
[28621138.881978] Killed process 7436 (java) total-vm:8566256kB, anon-rss:3941732kB, file-rss:18252kB, shmem-rss:0kB
but here it shows total-vm as 8 GB.
I am confused why it shows 4 in one place and 8 in another.
It’s not the pod that exceeds the limit but the program you run in the pod - Elasticsearch in this case. Elasticsearch is memory hungry and needs to be configured not to use more memory than it’s given, otherwise either the OS or the kubelet will kill it. Every process gets an OOM score, so the kernel knows what to kill first to preserve memory for prioritized applications. (As for the 4 vs 8 GB: total-vm is the process’s virtual address space, which is not what the limit counts; the cgroup limit applies to resident/charged memory, and the anon-rss value of roughly 3.9 GB is what actually hit the 4 GB limit.)
The Helm 2nd Security Audit is now completed. Check out the blog post by core maintainer @mattfarina and the reports here: https://helm.sh/blog/helm-2nd-security-audit/
2021-03-07
I just released a kubectl plugin I developed at my day job; maybe someone will find it useful: https://github.com/qonto/kubectl-duplicate
A kubectl plugin to duplicate a running pod and automatically exec into it - qonto/kubectl-duplicate
2021-03-08
2021-03-11
What’s the recommendation for pod-level IAM roles? I know the ones used the most initially were kube2iam and kiam, but IIRC one or both of them had rate-limiting issues (which is why I avoided them initially). I know AWS came out with one here and was just curious what people are using nowadays, or if there’s a general consensus on the best one: https://aws.amazon.com/blogs/opensource/introducing-fine-grained-iam-roles-service-accounts/ Personally I’m using kops, so I’m curious if there are issues integrating with AWS IRSA.
Here at AWS we focus first and foremost on customer needs. In the context of access control in Amazon EKS, you asked in issue #23 of our public container roadmap for fine-grained IAM roles in EKS. To address this need, the community came up with a number of open source solutions, such as kube2iam, kiam, […]
we use EKS IAM Roles for Service Accounts and did not see any issues with them; works fine
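For anyone new to IRSA, a minimal sketch of the Kubernetes side of the EKS setup described in that blog post: a ServiceAccount annotated with an IAM role ARN, referenced by the pod. The names and the ARN below are hypothetical, and the role’s trust policy has to allow the cluster’s OIDC provider:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-app                       # hypothetical
  namespace: default
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/my-app-role
---
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  serviceAccountName: my-app         # pods using this SA get a web identity token mounted
  containers:
    - name: app
      image: amazon/aws-cli
      command: ["aws", "sts", "get-caller-identity"]   # should report the assumed role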
2021-03-15
Hi, does anyone have a recommended approach for injecting passwords into kops templates using cloud-init?
Can you describe the use-case?
2021-03-19
Hello everyone, I am configuring federated Prometheus to monitor multiple clusters for the first time. Any tips on how to organize the operators, etc.? Thanks!
I prefer to use thanos
@Fernanda Martins you may have a look at this video; it is well done and covers monitoring Couchbase across multiple Kubernetes clusters. You’ll then be able to figure out more.
https://connectondemand.couchbase.com/watch/UgdEWF5i4cMRkA9VY8Fwck
Prometheus is rapidly becoming the de facto standard for monitoring metrics and is typically combined with Grafana as the front end visualization tool. The Couchbase Autonomous Operator can deploy the Prometheus Exporter along with a cluster. In this session you’ll learn how to deploy the Prometheus Exporter and deploy Prometheus and Grafana services to visualize Couchbase metrics. We’ll show you how to monitor these metrics from multiple Couchbase clusters.
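On the federation question itself, a minimal sketch of a /federate scrape job on the central Prometheus (the target hostnames and the match[] selector are assumptions to adapt). In practice you usually federate only pre-aggregated series rather than everything, or reach for Thanos/remote-write as suggested above:
scrape_configs:
  - job_name: 'federate'
    scrape_interval: 30s
    honor_labels: true               # keep the source cluster's labels instead of overwriting them
    metrics_path: '/federate'
    params:
      'match[]':
        - '{__name__=~"job:.*"}'     # e.g. only recording-rule outputs
    static_configs:
      - targets:
          - 'prometheus.cluster-a.example.com:9090'   # hypothetical per-cluster endpoints
          - 'prometheus.cluster-b.example.com:9090'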
2021-03-20
Anyone using EKS with Fargate profiles? I am on a project where we started using it, and we had to submit a support request to increase the maximum number of profiles in the cluster from 10 to 20. A Fargate profile maps to a Kubernetes namespace, so I’m essentially looking at 20 namespaces in these EKS clusters. That’s in my view a fairly small number, and I expect it to increase as we add more apps onto the cluster. It may be worth mentioning at this point that by default I can launch about 1000 Fargate nodes on an EKS cluster, so EKS was designed to scale. Anyway, the docs list the Fargate profiles quota as raisable through the console, but that’s incorrect, so I raised a support request to do this. The feedback I received was that they would raise it with the Fargate service team as it was a fairly large increase. My thought here: are they serious? We’re talking about 20 namespaces/Fargate profiles. What exactly is large about this request? A Google search didn’t show any relevant posts on the number of Fargate profiles, so I thought of coming here to ask: who here is using Fargate on EKS, and how many namespaces are you using?
we have a module for it https://github.com/cloudposse/terraform-aws-eks-fargate-profile
Terraform module to provision an EKS Fargate Profile - cloudposse/terraform-aws-eks-fargate-profile
and we tested it, but we don’t use it in production (for many reasons, one of them is that Fargate is still very limited)
having said that, 20 namespaces is not a large number, you usually have more than that in a cluster
Yeah, the Fargate selector forcing a namespace is very… annoying. There is a feature request to support * for the namespace and use labels for selection: https://github.com/aws/containers-roadmap/issues/621 . That would make the “many tiny namespaces” use case easier.
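For context on the mapping being discussed, a rough sketch of profile-per-namespace in eksctl config form (names are hypothetical; the Terraform module linked above expresses the same selector idea). Today each namespace you want on Fargate needs its own selector, which is why the wildcard request matters:
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: my-cluster                 # hypothetical
  region: us-east-1
fargateProfiles:
  - name: fp-team-a
    selectors:
      - namespace: team-a          # pods in this namespace (matching the labels, if set) run on Fargate
        labels:
          compute: fargate
  - name: fp-team-b
    selectors:
      - namespace: team-b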
Support does not lie. If they say they need to talk to the Fargate team, they mean it. Sometimes it’s an instant approval from the team; other times, you’ll get into a discussion with the team through Support (which is like playing a bad game of telephone, but I digress). I think they do this because:
1) research: they want to see what use cases people have so they can use that info for future improvements
2) complexity: they don’t want to overload something else (10 namespaces may put a bigger load on the EKS control plane, so they will have to scale that too, but only in certain cases). Maybe they have to scale some fancy internal Fargate component that requires extra work
3) safety: they want to offer some protection. In your case, 20 Fargate namespaces will allow you to run 20k pods, which would be… expensive. You’d be amazed how many cases of “AWS pls gimme 10k EC2s” are followed by “AWS WHY DID YOU LET ME DO THIS. I want my money back, I did not know I would create an infinite loop here and spend $100,000”
the problem Today you have to specify each namespace from which you want to run pods on Fargate in a Fargate profile. the solution EKS with Fargate will support adding wildcard * namespaces to the …
Yeah, they do know what they’re doing. By the way, the number of Fargate pods is by default limited to 1000 (which is already a LOT), and that limit is not per Fargate profile. The number of namespaces won’t generate additional load on the control plane, but a large number of Fargate pods will, since in EKS/Fargate each pod is a k8s node, which needs to be probed by the control plane.
2021-03-23
2021-03-24
Hi. Can anyone recommend articles/videos on configuring k8s on an “airgapped” system? The OS is CentOS 7. Thanks!
Hi @melynda.hunter. Do you mean as far as accessing a k8s cluster that’s in an airgapped environment?
kubernetes on a single system? Or an airgapped network with multiple servers
@roth.andy this is for an airgapped network with multiple VM servers.
Got it, yeah there’s not a lot out there. The federal government does Kubernetes work in restrictive/classified/isolated environments, so looking for information coming out of there would likely bear fruit. Distros I know they are using are OpenShift and D2IQ. Check out the Air Force program called Platform One; they are trailblazing a lot there and have published a lot as open source. https://software.af.mil is a good starting point.
RKE2 from Rancher is also positioning itself in this space, though it isn’t officially released yet
Teleport can also help with access in an airgapped environment: https://github.com/gravitational/teleport
In full disclosure I work there
There also is a CNCF WG around this. It may or may not be relevant for you: https://github.com/cncf/sig-app-delivery/tree/master/air-gapped-wg
CNCF App Delivery SIG. Contribute to cncf/sig-app-delivery development by creating an account on GitHub.
Thanks to everyone I appreciate all of the information!
2021-03-25
Anybody have thoughts on where I should use .Release.Name vs .Chart.Name?
I think the release name is the name you specify in the CLI (helm install <name>), and .Chart.Name is the name of the chart being used.
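The usual convention is roughly: use .Release.Name (or a fullname helper built from it) for resource names so two installs of the same chart don’t collide, and use .Chart.Name for labels identifying the chart. A minimal template sketch (the resource and port are made up):
# templates/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: {{ .Release.Name }}-web                      # unique per release
  labels:
    app.kubernetes.io/name: {{ .Chart.Name }}        # which chart produced this
    app.kubernetes.io/instance: {{ .Release.Name }}  # which install of it
spec:
  selector:
    app.kubernetes.io/instance: {{ .Release.Name }}
  ports:
    - port: 80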
2021-03-26
Hi All, we have an Elasticsearch cluster running in our Kubernetes cluster, deployed using a Helm chart, and fluentd is sending logs from each node.
We have 2 data nodes, 2 master nodes, and a client node. Since yesterday the data nodes have been in a not-ready state, and because of that the client keeps getting restarted, as do fluentd and kibana:
elastichq-7cf55c6bbc-998pq 1/1 Running 0 1y
elasticsearch-client-5dbccbd776-7kpwk 1/1 Running 79 1d
elasticsearch-data-0 0/1 Running 18 1d
elasticsearch-data-1 0/1 Running 21 1d
elasticsearch-master-0 1/1 Running 0 1d
elasticsearch-master-1 1/1 Running 0 1d
fluentd-fluentd-elasticsearch-hhh8v 1/1 Running 147 1y
fluentd-fluentd-elasticsearch-ksfnx 1/1 Running 110 1y
fluentd-fluentd-elasticsearch-lnbll 1/1 Running 94 1y
kibana-b7768db9d-r57st 1/1 Running 347 1y
logstash-0 1/1 Running 6 1y
After describing the fluentd pod I found:
Killing container with id <docker://fluentd-fluentd-elasticsearch>:Container failed liveness probe.. Container will be killed and recreated.
After referring to some links I found: data nodes store data and execute data-related operations such as search and aggregation; master nodes are in charge of cluster-wide management and configuration actions such as adding and removing nodes; client nodes forward cluster requests to the master node and data-related requests to data nodes.
In the kubectl get events output it says the readiness probe failed for the elasticsearch-data pods (we increased the timeout values and recreated all pods again).
So I’m assuming the client is failing because the elasticsearch-data pods are in a not-ready state, and also in one of the data pods I can see:
java.lang.OutOfMemoryError: Java heap space
Dumping heap to data/java_pid1.hprof ...
Unable to create data/java_pid1.hprof: File exists
The data pod’s memory limit is 4 GB and the heap is 1.9 GB, which is fine I think.
Since the master node is responsible for adding and removing nodes, I went inside the master pod and ran:
curl localhost:9200/_cat/nodes
elasticsearch-client-5dbccbd776-7kpwk
* elasticsearch-master-1
elasticsearch-master-0
The data pods are not listed here. After checking the logs of the master node, I can see a lot of:
[INFO ][o.e.c.s.ClusterApplierService] [elasticsearch-master-0] removed {{elasticsearch-data-1}
The master keeps adding and removing the data pods,
and in the master-1 logs I can see:
org.elasticsearch.transport.NodeDisconnectedException:
we did a helm upgrade <chartname> -f custom_valuefile.yaml --recreate-pods
which did not work.
Is there any workaround or solution for this behaviour? Thanks in advance.
If you have a Java heap space error in the data nodes you should probably raise the -Xmx and -Xms Java opts. In the latest elasticsearch chart provided by Elastic, you can use the esJavaOpts value, see https://github.com/elastic/helm-charts/tree/master/elasticsearch (defaults to -Xmx1g -Xms1g).
You know, for Kubernetes. Contribute to elastic/helm-charts development by creating an account on GitHub.
Thank you for that @Antoine Taillefer, we are using the stable version of the elasticsearch chart.
OK, so maybe data.heapSize, see https://github.com/helm/charts/tree/master/stable/elasticsearch
(OBSOLETE) Curated applications for Kubernetes. Contribute to helm/charts development by creating an account on GitHub.
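A minimal values sketch for the heap-vs-limit relationship, assuming the value names from the linked stable chart (verify against the chart version you actually run; numbers are illustrative). The rule of thumb is to keep the heap at no more than about half the container memory limit:
# stable/elasticsearch-style values
data:
  heapSize: "2g"            # becomes -Xms2g -Xmx2g for the data nodes
  resources:
    requests:
      memory: "4Gi"
    limits:
      memory: "4Gi"         # heap + off-heap + filesystem cache must fit under this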
thank you
Argoproj team is proud to announce the first release candidate for Argo CD v2.0! As denoted by the version number, this is a major release…
I wasn’t aware of this, but the ridiculously low allowed pod count on EKS (e.g. 29 pods on an m4.large) is tied specifically to the AWS VPC CNI. Apparently we can skirt around that issue by uninstalling the default CNI and installing a different one. Anyone try doing this? https://docs.projectcalico.org/getting-started/kubernetes/managed-public-cloud/eks
Enable Calico network policy in EKS.
Yes, this is true. It’s because each pod gets an IP from an ENI, and the number of ENIs (and IPs per ENI) is limited by the instance type.
I hear good things about https://docs.cilium.io/en/v1.9/gettingstarted/k8s-install-eks/
It’s apparently what GKE uses, and it works with EKS.
GKE’s new dataplane uses the eBPF-based Cilium project to better integrate Kubernetes and the Linux kernel.
gotta give this a shot (and test out other CNIs + EKS). kinda would finally love to outsource our kube master management to AWS
I successfully used Weavenet for a while, until we introduced Windows worker nodes and had to revert (as weavenet does not support Windows)
Another thing to consider is whether you care about AWS support (enterprise companies would). If you replace the standard AWS CNI with the XYZ CNI, I would suspect AWS would turn down support requests…
even if you can test “everything” during the first install, you’ll have to maintain your custom solution forever… through all the EKS updates…
having said that, the fact that the AWS CNI uses “physical” IPs sucks (particularly if you have not planned it correctly when deploying the subnets)
fwiw, we don’t keep clusters up forever, we generally blue/green clusters for new kubernetes versions so that gives us flexibility to always update our custom solutions to work w/ EKS updates. although losing AWS support for it would be unfortunate. @Andrea did you use weavenet for the purposes of running more pods on a node?
yes, because I was running out of IPs rather than resources on the EC2 worker nodes..
@Andrea i feel the same way about vanilla EKS and their extremely low max pod count, good to know
2021-03-27
2021-03-28
We’re looking for a nice way to orchestrate performance tests in a k8s cluster, any suggestions?
An example scenario: we want to test the performance of using Redis vs using MinIO as an object cache. We would like to be able to easily set up, run the test, and tear down.
Assuming you want to test including cloud-provider backing disks etc., we would create a pipeline in our CD (Spinnaker) with the necessary setup, job, and teardown stages.
We would do this in an existing cluster versus spinning one up from scratch, as it’s easier and we could still do node isolation. If you’re worried, then use a new cluster, which will take a fair investment in tooling. Running in kind in CI isn’t going to be very apples-to-apples with prod.
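For the Redis half of that example scenario, a rough sketch of a one-off benchmark Job the pipeline could apply in its job stage (image tag, Service name, and numbers are assumptions, not a recommendation):
apiVersion: batch/v1
kind: Job
metadata:
  name: redis-perf-test              # hypothetical
spec:
  backoffLimit: 0                    # a failed perf run should not retry
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: bench
          image: redis:6             # the image ships redis-benchmark
          command: ["redis-benchmark"]
          # assumes a Service named redis-cache fronting the cache under test
          args: ["-h", "redis-cache", "-p", "6379", "-n", "100000", "-c", "50"]
          resources:
            requests:
              cpu: "500m"
              memory: "256Mi"
Teardown is then just a kubectl delete of the Job and the cache resources in the final pipeline stage.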
2021-03-29
Hi, to anyone who’s running Windows worker nodes, can you please share/suggest how to collect the pod logs? On the Linux nodes I’ve been fairly happy with fluent-bit (deployed as a Helm chart); fluent-bit collects the logs and sends them to Elasticsearch. I’m not having much luck with the same procedure on Windows though…