#kubernetes (2021-03)

kubernetes

Archive: https://archive.sweetops.com/kubernetes/

2021-03-01

Brad McCoy avatar
Brad McCoy

Hi everyone, I have two online events coming up that will be recorded, covering CKAD/CKA study and exam tips. Here are the links for those interested:

1

2021-03-03

Matt Gowie avatar
Matt Gowie

Not exactly a Kubernetes question, but figured folks in this channel would know what I’m talking about exists — Does anyone know if there is a Network / TCP proxy tool out there that will do a manage-and-forward pattern (my own made up term for describing this) for long lived TCP connections?

I have a client running on K8s and one of their primary microservices holds long-lived TCP socket connections with many thousands of clients through an AWS NLB. The problem is that whenever we do a deployment and update those pods, the TCP connections require a re-connection, which results in problems on the client side. So to provide a better experience for the clients, we're looking at what we can do to keep those TCP connections alive. My first thought is a proxy layer that manages the socket connections with the client and then forwards socket connections to the actual service pods. That way, even if the pods are swapped out behind the scenes, the original socket connection is still up and has no adverse effects on the clients.

kskewes avatar
kskewes

Does the backend service maintain any state? In the event the responsible backend terminates, could you send requests to any other backend?

It'd be worth looking at the AWS NLB issues in the Kubernetes GitHub. TL;DR is that you want EKS 1.19 and the new AWS Load Balancer Controller, because otherwise:

  1. Proxy drain results in 500s; there is no graceful draining of LB targets.
  2. Scale-up of nodes results in 500s, as nodes report healthy on startup, are then marked unhealthy until the join process completes, and only then pass the health check. So there are dragons.
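
For reference, the usual mitigation for the drain problem being discussed is to give pods a preStop pause plus a longer termination grace period, so the load balancer can deregister the target before the process exits. A minimal sketch; the durations and names are illustrative, not from this thread:

    spec:
      terminationGracePeriodSeconds: 120   # leave time for NLB target deregistration
      containers:
        - name: socket-service             # hypothetical container
          image: example/socket-service:latest
          lifecycle:
            preStop:
              exec:
                # keep serving existing connections while the LB stops sending new ones
                command: ["sh", "-c", "sleep 60"]
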
Matt Gowie avatar
Matt Gowie

Service is stateless. Could send any incoming socket messages to any other backend.

1
kskewes avatar
kskewes

I think ultimately you need solid client reconnection behavior on kubernetes due to all the stuff in the network path and what can affect routing.

Alternatively you could run the proxy on EC2 and forward into Kubernetes, but you would need to solve for service discovery (pod IP) to route directly and minimize hiccups.
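
For the service-discovery piece mentioned above, one option (an assumption on my part, not something confirmed in the thread) is a headless Service, which makes cluster DNS return the pod IPs directly so a proxy running on EC2 can route straight to them:

    apiVersion: v1
    kind: Service
    metadata:
      name: socket-service-headless   # hypothetical name
    spec:
      clusterIP: None                 # headless: DNS resolves to the individual pod IPs
      selector:
        app: socket-service
      ports:
        - port: 9000
          targetPort: 9000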

Matt Gowie avatar
Matt Gowie

What issue are you referring to? The client is set up with the aws-lb-controller for NLB creation / management, but we’re on 1.18… is 1.19 out yet?

kskewes avatar
kskewes

FWIW I looked at something similar for webrtc sessions and the agones project but ultimately went with EC2.

kskewes avatar
kskewes

Sorry, I don’t have the issues off hand; NLB termination etc. should come up in a search. There have been some big ones. :)

1
Shreyank Sharma avatar
Shreyank Sharma

Hi All,

We have a 4-node Kubernetes cluster in production, deployed using kops in AWS: 3 worker nodes and one master node, c4.2xlarge with 16 GB memory each.

Along with other pods we have Elasticsearch deployed using Helm: 3 elasticsearch-data pods consuming 4000 MB of memory each, 3 elasticsearch-master pods consuming 2600 MB each, and 3 elasticsearch-client pods consuming 2600 MB each.

All of them are distributed among the nodes, but on one node, one of the elasticsearch-data pods is restarting 2-3 times daily, always on that same node.

I described the restarted pod, which just says:

    Last State: Terminated
    Reason:     OOMKilled
    Exit Code:  137
    Started:    Tue, 02 Mar 2021 20:31:07 +0530
    Finished:   Wed, 03 Mar 2021 17:46:02 +0530

There are no events.

And when I checked the syslogs of the node on which the pod restarted, it shows:

    C2 CompilerThre invoked oom-killer: gfp_mask=0x24000c0(GFP_KERNEL), nodemask=0, order=0, oom_score_adj=901
    C2 CompilerThre cpuset=6126d0823d683f51d04603c4c6464c030464d3748c916c1a46621936846aac01 mems_allowed=0
    CPU: 2 PID: 7743 Comm: C2 CompilerThre Not tainted 4.9.0-9-amd64 #1 Debian 4.9.168-1
    Hardware name: Amazon EC2 c5.2xlarge/, BIOS 1.0 10/16/2017
    ...

the version of elasticsearch is 6.7.0

Has anyone experienced the same issue? How can I solve this pod restart issue?

Matt Gowie avatar
Matt Gowie

OOMKilled / exit code 137 is Out of Memory. Elasticsearch is super memory hungry, so it likely just needs more memory, or you need to play with your pod limits / requests.
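
As a rough illustration of the limits/requests knob, here is a generic container resources block; the sizes are made up, and where this goes in a Helm values file depends on the chart:

    resources:
      requests:
        memory: "6Gi"   # illustrative: enough headroom above the JVM heap
        cpu: "1"
      limits:
        memory: "6Gi"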

Walter Sosa avatar
Walter Sosa

Fun fact: Exit Code 137 (128 + 9) -> it’s getting a signal 9 (SIGKILL)

2021-03-05

Shreyank Sharma avatar
Shreyank Sharma

Hi All, under what conditions will a pod exceed its memory limit?

In my Kubernetes cluster I have Elasticsearch deployed using Helm; the Elasticsearch version is 6.7.0. We have 3 elasticsearch-data pods, 2 elasticsearch-master pods, and 1 client.

The memory limit for the elasticsearch-data pods is 4 GB, but one of the data pods is restarted about 5-6 times every day (OOM kill). When I checked the pod memory and CPU usage in Grafana, I can see that one of the elasticsearch-data pods is using twice the memory limit (8 GB).

So I wanted to know under what conditions a pod will exceed its memory limit.

Also in the syslogs, when the OOM kill happened:

    C2 CompilerThre invoked oom-killer: gfp_mask=0x24000c0(GFP_KERNEL), nodemask=0, order=0, oom_score_adj=901
    [28621138.637578] C2 CompilerThre cpuset=441fa5603f64f86888937bc911269fca47dfcdb318648cc1ac0832cdfb07134d mems_allowed=0
    [28621138.639850] CPU: 5 PID: 7749 Comm: C2 CompilerThre Not tainted 4.9.0-9-amd64 #1 Debian 4.9.168-1
    [28621138.641757] Hardware name: Amazon EC2 c5.2xlarge/, BIOS 1.0 10/16/2017
    [28621138.643152] 0000000000000000 ffffffff85335284 ffffa53882de7dd8 ffff8dda11dec040
    ...
    [28621138.662399] [<ffffffff85615f82>] ? schedule+0x32/0x80
    [28621138.663485] [<ffffffff8561bc48>] ? async_page_fault+0x28/0x30
    [28621138.669097] memory: usage 4096000kB, limit 4096000kB, failcnt 383494862

Here it shows around 4 GB.

But at the end:

    [ pid ]   uid  tgid total_vm    rss nr_ptes nr_pmds swapents oom_score_adj name
    [28621138.876368] [24691]    0 24691     256      1       4       2        0          -998 pause
    [28621138.878201] [ 7436] 1000  7436 2141564 989996    2342      11        0           901 java
    [28621138.879983] Memory cgroup out of memory: Kill process 7436 (java) score 1870 or sacrifice child
    [28621138.881978] Killed process 7436 (java) total-vm:8566256kB, anon-rss:3941732kB, file-rss:18252kB, shmem-rss:0kB

but here it's showing a total-vm of 8 GB.

I am confused why it's showing 4 GB in one place and 8 GB in another.

Marcin Brański avatar
Marcin Brański

It's not the pod that exceeds the limit but the program that you run in the pod, i.e. Elasticsearch. Elasticsearch is memory hungry and needs to be configured not to use more memory than it's given; otherwise either the OS or the kubelet will kill it. Every program gets a score so it's known what to kill first, to preserve memory for prioritized applications.
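
The "score" mentioned here comes from the pod's QoS class: pods whose requests equal their limits are Guaranteed and get a much lower oom_score_adj than Burstable pods (like the java process with oom_score_adj 901 in the log above), so they are killed last. A quick way to check, with a placeholder pod name:

    kubectl get pod elasticsearch-data-0 -o jsonpath='{.status.qosClass}'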

roth.andy avatar
roth.andy

The Helm 2nd Security Audit is now completed. Check out the blog post by core maintainer @mattfarina and the reports here: https://helm.sh/blog/helm-2nd-security-audit/

2021-03-07

Issif avatar

I just released a kubectl plugin I developed at my day job; maybe someone will find it useful: https://github.com/qonto/kubectl-duplicate

qonto/kubectl-duplicate

A kubectl plugin for duplicate a running pod and auto exec into - qonto/kubectl-duplicate

1

2021-03-08

2021-03-11

btai avatar

What's the recommendation for pod-level IAM roles? I know the ones used the most initially were kube2iam and kiam, but IIRC one or both of them had rate-limiting issues (which is why I avoided them initially). I know AWS came out with one here, and I was just curious what people are using nowadays, or if there's a general consensus on the best one: https://aws.amazon.com/blogs/opensource/introducing-fine-grained-iam-roles-service-accounts/ Personally I'm using kops, so I'm curious if there are issues integrating with AWS IRSA.

Introducing fine-grained IAM roles for service accounts | Amazon Web Services

Here at AWS we focus first and foremost on customer needs. In the context of access control in Amazon EKS, you asked in issue #23 of our public container roadmap for fine-grained IAM roles in EKS. To address this need, the community came up with a number of open source solutions, such as kube2iam, kiam, […]

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

We use EKS IAM Roles for Service Accounts and did not see any issues with them; works fine.


Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Yeah, this is the de facto way to do it on EKS.

1
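
For anyone finding this later, IRSA boils down to annotating the pod's ServiceAccount with the IAM role to assume; the names and account ID below are placeholders:

    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: my-app
      namespace: default
      annotations:
        # the role's trust policy must reference the cluster's OIDC provider
        eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/my-app-role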

2021-03-15

Mr.Devops avatar
Mr.Devops

Hi, does anyone have a recommended approach for injecting passwords into kops templates using cloud-init?

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Can you describe the use-case?

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

E.g. You may be better off reading from SSM

this2
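
A hedged sketch of what "reading from SSM" could look like from cloud-init or a bootstrap script; the parameter name is made up, and the instance role needs ssm:GetParameter (plus kms:Decrypt for SecureString values):

    # fetch the secret at boot instead of baking it into the kops template
    DB_PASSWORD=$(aws ssm get-parameter \
      --name /myapp/db_password \
      --with-decryption \
      --query 'Parameter.Value' \
      --output text)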

2021-03-19

Fernanda Martins avatar
Fernanda Martins

Hello everyone, I am configuring federated Prometheus to monitor multiple clusters for the first time. Any tips on how to organize operators, etc.? Thanks!

Emmanuel Gelati avatar
Emmanuel Gelati

I prefer to use Thanos.

Pierre-Yves avatar
Pierre-Yves

@Fernanda Martins you may have a look at this video; it is well done and covers monitoring Couchbase across multiple Kubernetes clusters. You'll then be able to figure out more.

https://connectondemand.couchbase.com/watch/UgdEWF5i4cMRkA9VY8Fwck

Centralized Monitoring Across Multiple Couchbase Clusters Using Prometheus

Prometheus is rapidly becoming the de facto standard for monitoring metrics and is typically combined with Grafana as the front end visualization tool. The Couchbase Autonomous Operator can deploy the Prometheus Exporter along with a cluster. In this session you’ll learn how to deploy the Prometheus Exporter and deploy Prometheus and Grafana services to visualize Couchbase metrics. We’ll show you how to monitor these metrics from multiple Couchbase clusters.

2021-03-20

Azul avatar

Anyone using EKS with Fargate profiles? I am on a project where we started using it, and we had to submit a support request to increase the maximum number of profiles in the cluster from 10 to 20. A Fargate profile maps to a Kubernetes namespace, so I'm essentially looking at 20 namespaces in these EKS clusters. That's in my view a fairly small number, and I expect it to increase as we add more apps onto the cluster. It may be worth mentioning at this point that by default I can launch about 1000 Fargate nodes on an EKS cluster, so EKS was designed to scale. Anyway, the docs list the Fargate profiles quota as raisable through the console, but that's incorrect, so I raised a support request to do this. The feedback I received was that they would raise it with the Fargate service team as it was a fairly large increase. My thought here: is he serious? We're talking about 20 namespaces/Fargate profiles. What exactly is large about this request? A Google search didn't show any relevant posts on the number of Fargate profiles, so I thought of coming here to ask: who here is using Fargate on EKS, and how many namespaces are you using?

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)
cloudposse/terraform-aws-eks-fargate-profile

Terraform module to provision an EKS Fargate Profile - cloudposse/terraform-aws-eks-fargate-profile

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

and we tested it, but we don’t use it in production (for many reasons, one of them is that Fargate is still very limited)

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

having said that, 20 namespaces is not a large number, you usually have more than that in a cluster

Vlad Ionescu (he/him) avatar
Vlad Ionescu (he/him)

Yeah, the Fargate selector forcing a namespace is very… annoying. There is a feature request to support * for the namespace and use labels for selection: https://github.com/aws/containers-roadmap/issues/621 . That would make the “many tiny namespaces” use case easier.

Support does not lie. If they say they need to talk to the Fargate Team, they mean it. Sometimes it's an instant approval from the Team. Other times, you'll get into a discussion with the Team through Support (which is like playing a bad game of telephone, but I digress). I think they do this because:

  1. Research: they want to see what use cases people have so they can use that info for future improvements.
  2. Complexity: they don't want to overload something else (10 namespaces may put a bigger load on the EKS Control Plane and so they will have to scale that too, but only in certain cases). Maybe they have to scale some fancy internal Fargate component that requires some extra work.
  3. Safety: they want to offer some protection. In your case, 20 Fargate namespaces will allow you to run 20k pods, which would be… expensive. You'd be amazed how many cases of "AWS pls gimme 10k EC2s" are followed by "AWS WHY DID YOU LET ME DO THIS. I want my money back, I did not know I would create an infinite loop here and spend $100,000."

[EKS/Fargate] Support for Wildcard * Namespaces · Issue #621 · aws/containers-roadmap

the problem Today you have to specify each namespace from which you want to run pods on Fargate in a Fargate profile. the solution EKS with Fargate will support adding wildcard * namespaces to the …

1
Azul avatar

Yeah, they do know what they're doing. By the way, the number of Fargate pods is by default limited to 1000 (which is already a LOT), and that's not per Fargate profile. The number of namespaces won't generate additional load on the control plane; however, a large number of Fargate pods will, as in EKS/Fargate each pod is a k8s node, which needs to be probed by the control plane.

1
1

2021-03-23

2021-03-24

melynda.hunter avatar
melynda.hunter

Hi. Can anyone recommend articles/videos on configuring k8s on an “airgapped” system? The OS is CentOS 7. Thanks!

Jonathon Canada avatar
Jonathon Canada

Hi @melynda.hunter. Do you mean as far as accessing a k8s cluster that’s in an airgapped environment?

roth.andy avatar
roth.andy

Kubernetes on a single system? Or an airgapped network with multiple servers?

melynda.hunter avatar
melynda.hunter

@roth.andy this is for an airgapped network with multiple VM servers.

roth.andy avatar
roth.andy

Got it, yeah there’s not a lot out there. The federal government does Kubernetes work in restrictive/classified/isolated environments, so looking for information coming out of there would likely bear fruit. Distros I know they are using are OpenShift and D2IQ. Check out the Air Force program called Platform One; they are trailblazing a lot there and have published a lot as open source. https://software.af.mil is a good starting point.

1
roth.andy avatar
roth.andy

RKE2 from Rancher is also positioning itself in this space, though it isn’t officially released yet

1
Jonathon Canada avatar
Jonathon Canada

Teleport can also help with access in an airgapped environment: https://github.com/gravitational/teleport

In full disclosure I work there

1
1
Vlad Ionescu (he/him) avatar
Vlad Ionescu (he/him)

There also is a CNCF WG around this. It may or may not be relevant for you: https://github.com/cncf/sig-app-delivery/tree/master/air-gapped-wg

cncf/sig-app-delivery

CNCF App Delivery SIG. Contribute to cncf/sig-app-delivery development by creating an account on GitHub.

1
melynda.hunter avatar
melynda.hunter

Thanks to everyone I appreciate all of the information!

2021-03-25

Eric Berg avatar
Eric Berg

Anybody have thoughts on where I should use .Release.Name vs .Chart.Name?

Emmanuel Gelati avatar
Emmanuel Gelati

Maybe: the release name is the name you specify in the CLI (helm install <name>), and .Chart.Name is the chart being used.
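
A tiny illustration of the difference, assuming a chart named mychart installed with helm install my-release ./mychart: .Release.Name is unique per install (good for resource names), while .Chart.Name identifies what was installed:

    # templates/example-configmap.yaml (illustrative)
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: {{ .Release.Name }}-config              # renders as "my-release-config"
      labels:
        app.kubernetes.io/name: {{ .Chart.Name }}   # renders as "mychart"
    data:
      example: "value"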

2021-03-26

Shreyank Sharma avatar
Shreyank Sharma

Hi All, we have an Elasticsearch cluster running in our Kubernetes cluster, deployed using a Helm chart, and fluentd is sending logs from each node. We have 2 data nodes, 2 master nodes, and a client node, and since yesterday the data nodes have been in a not-ready state; because of that the client keeps getting restarted, as do fluentd and kibana:

    elastichq-7cf55c6bbc-998pq              1/1   Running   0     1y
    elasticsearch-client-5dbccbd776-7kpwk   1/1   Running   79    1d
    elasticsearch-data-0                    0/1   Running   18    1d
    elasticsearch-data-1                    0/1   Running   21    1d
    elasticsearch-master-0                  1/1   Running   0     1d
    elasticsearch-master-1                  1/1   Running   0     1d
    fluentd-fluentd-elasticsearch-hhh8v     1/1   Running   147   1y
    fluentd-fluentd-elasticsearch-ksfnx     1/1   Running   110   1y
    fluentd-fluentd-elasticsearch-lnbll     1/1   Running   94    1y
    kibana-b7768db9d-r57st                  1/1   Running   347   1y
    logstash-0                              1/1   Running   6     1y

After describing the fluentd pod, I came to know:

    Killing container with id docker://fluentd-fluentd-elasticsearch: Container failed liveness probe. Container will be killed and recreated.

After referring to some links I found:

  - Data nodes: store data and execute data-related operations such as search and aggregation
  - Master nodes: in charge of cluster-wide management and configuration actions such as adding and removing nodes
  - Client nodes: forward cluster requests to the master node and data-related requests to data nodes

In the kubectl get events output it says the readiness probe failed for the elasticsearch-data pods (we increased the timeout values and recreated all pods again).

So I am assuming the client is failing because the elasticsearch-data pods are in a not-ready state. Also, in one of the data pods I can see:

    java.lang.OutOfMemoryError: Java heap space
    Dumping heap to data/java_pid1.hprof ...
    Unable to create data/java_pid1.hprof: File exists

The data pod's memory limit is 4 GB and the heap is 1.9 GB, which is fine I think.

Since the master node is responsible for adding and removing nodes, I went inside the master pod and ran curl localhost:9200/_cat/nodes:

    elasticsearch-client-5dbccbd776-7kpwk
    * elasticsearch-master-1
    elasticsearch-master-0

The data pods are not listed here. After checking the logs of the master node, I can see a lot of:

    [INFO ][o.e.c.s.ClusterApplierService] [elasticsearch-master-0] removed {{elasticsearch-data-1}

The master keeps adding and removing the data pods, and in the master-1 logs I can see org.elasticsearch.transport.NodeDisconnectedException:

We did a helm upgrade <chartname> -f custom_valuefile.yaml --recreate-pods, which did not work.

Is there any workaround or solution for this behaviour? Thanks in advance.

Antoine Taillefer avatar
Antoine Taillefer

If you have a Java heap space error in the data nodes, you should probably raise the -Xmx and -Xms Java opts. In the latest elasticsearch chart provided by Elastic, you can use the esJavaOpts value, see https://github.com/elastic/helm-charts/tree/master/elasticsearch (it defaults to -Xmx1g -Xms1g).

elastic/helm-charts

You know, for Kubernetes. Contribute to elastic/helm-charts development by creating an account on GitHub.
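
If switching to the Elastic chart linked above, the heap and container limits are typically set together in values.yaml; the sizes here are illustrative only, with the heap kept well below the container limit to leave room for off-heap memory:

    esJavaOpts: "-Xms2g -Xmx2g"   # raise from the chart's -Xmx1g -Xms1g default
    resources:
      requests:
        memory: "4Gi"
      limits:
        memory: "4Gi"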

Shreyank Sharma avatar
Shreyank Sharma

Thank you for that @Antoine Taillefer; we are using the stable version of the Elasticsearch chart.

Antoine Taillefer avatar
Antoine Taillefer
helm/charts

(OBSOLETE) Curated applications for Kubernetes. Contribute to helm/charts development by creating an account on GitHub.

Shreyank Sharma avatar
Shreyank Sharma

thank you

Matt Gowie avatar
Matt Gowie
Argo CD v2.0-rc1 is here!

Argoproj team is proud to announce the first release candidate for Argo CD v2.0! As denoted by the version number, this is a major release…

2
btai avatar

I wasn't aware of this, but the ridiculously low allowed pod count on EKS (i.e. 29 pods on an m4.large) is tied specifically to the AWS VPC CNI. Apparently we can skirt around that issue by uninstalling the default CNI and installing a different one. Anyone try doing this? https://docs.projectcalico.org/getting-started/kubernetes/managed-public-cloud/eks

Amazon Elastic Kubernetes Service (EKS)

Enable Calico network policy in EKS.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Yes, this is true. It’s because each pod gets an IP from an ENI, and both ENIs and IPs per ENI are limited by instance type.
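
For reference, the per-node ceiling with the AWS VPC CNI follows from the instance type's ENI and IP limits; the numbers below are just an example of how a figure like 29 comes about:

    max pods = (number of ENIs) x (IPv4 addresses per ENI - 1) + 2
    # e.g. an instance type with 3 ENIs and 10 IPs per ENI:
    # 3 x (10 - 1) + 2 = 29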


Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

It’s apparently what GKE uses, and it works with EKS.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)
New GKE Dataplane V2 increases security and visibility for containers

GKE’s new dataplane uses the eBPF-based Cilium project to better integrate Kubernetes and the Linux kernel.

btai avatar

gotta give this a shot (and test out other CNIs + EKS). kinda would finally love to outsource our kube master management to AWS

1
Andrea avatar

I successfully used Weavenet for a while, until we introduced Windows worker nodes and had to revert (as weavenet does not support Windows)

Andrea avatar

another thing to consider is whether you care about AWS support (enterprise companies would). If you remove the standard AWS CNI for the XYZ CNI, I would suspect AWS would turn down support requests…

Andrea avatar

even if you can test “everything” during the first install, you’ll have to maintain your custom solution forever… through all the EKS updates…

Andrea avatar

having said that, the fact that the AWS CNI uses “physical” IPs sucks (particularly if you have not planned it correctly when deploying the subnets)

btai avatar

FWIW, we don't keep clusters up forever; we generally blue/green clusters for new Kubernetes versions, so that gives us flexibility to always update our custom solutions to work with EKS updates, although losing AWS support for it would be unfortunate. @Andrea did you use Weavenet for the purpose of running more pods on a node?

Andrea avatar

yes, because I was running out of IPs rather than resources on the EC2 worker nodes..

btai avatar

@Andrea i feel the same way about vanilla EKS and their extremely low max pod count, good to know

1

2021-03-27

2021-03-28

Padarn avatar

We’re looking for a nice way to orchestrate performance tests in a k8s cluster, any suggestions?

An example scenario: We want to test the performance of using redis vs using minio as an object cache. Would like to be able to easily setup, run the test, and teardown

kskewes avatar
kskewes

Assuming you want to test including the cloud provider's backing disk etc., we would create a pipeline in our CD tool (Spinnaker) with the necessary setup, job, and teardown stages.

kskewes avatar
kskewes

We would do this in an existing cluster versus spinning one up from scratch, as it's easier and we can still do node isolation. If worried, then a new cluster, which will take a fair investment in tooling. Running in kind in CI isn't going to be very apples-to-apples with prod.
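
A minimal sketch of the "job" stage described above, assuming the benchmark is packaged as a container image; every name and argument here is a placeholder:

    apiVersion: batch/v1
    kind: Job
    metadata:
      name: cache-benchmark
    spec:
      backoffLimit: 0                 # fail fast; the pipeline handles retries and teardown
      template:
        spec:
          restartPolicy: Never
          nodeSelector:
            workload: perf-test       # optional node isolation, as mentioned above
          containers:
            - name: bench
              image: example/cache-bench:latest
              args: ["--target", "redis://redis:6379", "--duration", "300s"]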

2021-03-29

Andrea avatar

Hi, to anyone who's running Windows worker nodes, can you please share/suggest how to collect the pod logs? On the Linux nodes I've been fairly happy with fluent-bit (deployed as a Helm chart); fluent-bit collects the logs and sends them to Elasticsearch. I'm not having much luck with the same procedure on Windows though…

2021-03-30

Christian avatar
Christian

Do people generally use managed node groups now?

5