Hoping someone who ran into this can help. We recently upgraded our EKS cluster from 1.23 to 1.24, and now any new node group we create fails with
"Instances failed to join the kubernetes cluster". Prior to the upgrade it was working fine.
We also tried creating a brand-new cluster straight on 1.24, and that works. Something about the in-place upgrade from 1.23 to 1.24 might be causing this…
AWS support should be able to assist you on this. Also, did you update the AMI used by the new node pools?
i’ve reached out and am waiting for a reply from support, and yes, we’re using the latest Amazon Linux AMI
this one is a very strange issue
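For "Instances failed to join", a sketch of the usual first checks (commands assume kubectl access and SSM/SSH access to a stuck instance): EKS 1.24 AMIs dropped the Docker runtime in favor of containerd, so node-side kubelet/containerd logs plus the aws-auth role mapping are good places to start.

```shell
# Verify the node group's IAM role is mapped; an unmapped role is a
# common cause of "failed to join" after cluster changes
kubectl -n kube-system get configmap aws-auth -o yaml

# On a stuck instance (via SSM or SSH), inspect bootstrap and
# runtime logs; EKS 1.24 AMIs run containerd, not dockerd
journalctl -u kubelet --no-pager | tail -50
journalctl -u containerd --no-pager | tail -50
tail -50 /var/log/cloud-init-output.log
```

If the kubelet log shows authentication or TLS errors, the aws-auth mapping or node role policy is the likely culprit; if cloud-init failed, compare the bootstrap arguments against the new AMI's expectations.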
Hey, anyone have experience with Kubespray? Is it possible to deploy a Kubernetes cluster with Kubespray as a specific user that has sudo permissions, rather than the default root?
It’s not related to Kubespray, but to Ansible itself.
Put this in your ansible.cfg inside the project directory:
```ini
[defaults]
deprecation_warnings = True
timeout = 60
remote_user = <name of the remote user with sudo permissions>
private_key_file = </path/to/ssh/key>
host_key_checking = False
scp_if_ssh = True
force_valid_group_names = ignore
stdout_callback = debug
nocows = 1
callbacks_enabled = profile_tasks

[diff]
always = True

[ssh_connection]
sudo_user = root
pipelining = True
ssh_args = -o ControlMaster=auto -o ControlPersist=30m -o ConnectionAttempts=100 -o UserKnownHostsFile=/dev/null

[galaxy]
ignore_certs = True

[privilege_escalation]
become = True
become_method = sudo
become_user = root
```
BTW, some settings are already there: https://github.com/kubernetes-sigs/kubespray/blob/master/ansible.cfg
```ini
[ssh_connection]
pipelining=True
ansible_ssh_args = -o ControlMaster=auto -o ControlPersist=30m -o ConnectionAttempts=100 -o UserKnownHostsFile=/dev/null
#control_path = ~/.ssh/ansible-%%r@%%h:%%p

[defaults]
# https://github.com/ansible/ansible/issues/56930 (to ignore group names with - and .)
force_valid_group_names = ignore
host_key_checking=False
gathering = smart
fact_caching = jsonfile
fact_caching_connection = /tmp
fact_caching_timeout = 86400
stdout_callback = default
display_skipped_hosts = no
library = ./library
callbacks_enabled = profile_tasks,ara_default
roles_path = roles:$VIRTUAL_ENV/usr/local/share/kubespray/roles:$VIRTUAL_ENV/usr/local/share/ansible/roles:/usr/share/kubespray/roles
deprecation_warnings=False
inventory_ignore_extensions = ~, .orig, .bak, .ini, .cfg, .retry, .pyc, .pyo, .creds, .gpg

[inventory]
ignore_patterns = artifacts, credentials
```
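Alternatively (a sketch — the inventory path, username, and key path are placeholders), you can leave the stock ansible.cfg alone and pass the remote user and privilege escalation on the command line:

```shell
# Run the Kubespray playbook as a non-root sudo user;
# --become escalates to root on the target hosts via sudo
ansible-playbook -i inventory/mycluster/hosts.yaml \
  --user=deploy --become --become-user=root \
  --private-key=~/.ssh/id_ed25519 \
  cluster.yml
```

Command-line flags override ansible.cfg, which is handy when different clusters use different deploy users.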
Hi everyone. Is it possible to run commands in a Kubernetes pod without having to override the Docker entrypoint?
Is the pod running?
If so, you should be able to do a kubectl exec -it <pod-name> -- bash to obtain a bash shell
So I want to be able to run the command using args as part of the pod config
I’ve usually done one or the other but i imagine your approach would work. See if this helps: https://stackoverflow.com/a/49657024
The idea is that I do not want to have to override my entrypoint command from my pod.
@emem hopefully it helps.
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  containers:
  - name: my-container
    image: my-image:latest
    command: ["/bin/sh"]
    args: ["-c", "echo 'Hello, Kubernetes!' && ls /app"]
```
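If the goal is to keep the image's own ENTRYPOINT, a minimal sketch (image name and arguments are placeholders) is to set only `args` and omit `command`: in Kubernetes, `command` overrides the Dockerfile ENTRYPOINT, while `args` only overrides CMD, which is passed to the entrypoint as its arguments.

```yaml
# Setting only `args` keeps the image's ENTRYPOINT intact;
# args replaces the Dockerfile CMD and is handed to the
# entrypoint as its argument list.
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  containers:
  - name: my-container
    image: my-image:latest            # placeholder image
    args: ["--queue", "jobs", "--verbose"]  # hypothetical arguments
```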
What is the use case for this?
could be various uses. i am looking at it to model existing clusters
but yeah, i suppose you could deploy YAML against it for validation
Ah, actually there are some “User Stories” here to explain https://kwok.sigs.k8s.io/docs/design/introduction/
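For anyone wanting to try it, a quick sketch of spinning up a simulated cluster with kwokctl (cluster and node names are arbitrary; exact flags and the generated context name may vary by KWOK version):

```shell
# Create a simulated cluster -- no real kubelets run
kwokctl create cluster --name=demo

# kwokctl registers a kubeconfig context named kwok-<cluster>
kubectl config use-context kwok-demo

# Register a fake node managed by kwok (annotation per the KWOK docs)
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Node
metadata:
  name: fake-node-0
  annotations:
    kwok.x-k8s.io/node: fake
  labels:
    type: kwok
EOF

kubectl get nodes
```

Pods scheduled onto the fake node are simulated by the kwok controller, which is what makes large-scale scheduler and controller testing cheap.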
Do you know of any methods or tooling to assign a Pod a specific IO load (load average 1/5/15) profile?
We have several pods running as queue workers across the cluster; they only need about 300m CPU and 400Mi RAM, but each pod generates a lot of IO and compute load, slowing everything down.
Is there any method to distribute the pods better, other than just setting a high
resources.requests.cpu? We don’t want to waste computing power with almost-empty nodes.
Currently it’s not possible to change the software to be more performant.
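Not a direct load-profile control, but one way to spread the workers out without inflating CPU requests is a topology spread constraint (the labels and image below are hypothetical), which limits how many matching pods can pile up on any one node:

```yaml
# Spread queue-worker pods evenly across nodes so their IO load
# isn't concentrated; maxSkew: 1 allows at most one pod of
# difference between any two nodes.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: queue-worker
spec:
  replicas: 6
  selector:
    matchLabels:
      app: queue-worker          # hypothetical label
  template:
    metadata:
      labels:
        app: queue-worker
    spec:
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: ScheduleAnyway   # use DoNotSchedule to enforce strictly
        labelSelector:
          matchLabels:
            app: queue-worker
      containers:
      - name: worker
        image: my-worker:latest  # placeholder image
        resources:
          requests:
            cpu: 300m
            memory: 400Mi
```

`podAntiAffinity` on `kubernetes.io/hostname` achieves a similar effect; topology spread is usually the lighter-weight choice when you just want even distribution rather than strict one-per-node.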
We’re using velero to snapshot PVs on GKE clusters. We have classic GCP PD, Netapp CVS CSI and GCP Filestore CSI. We’re trying to use the Velero GCP plugin to backup our GCP PDs and the CSI plugin for the others. However, Velero tries to backup the GCP PDs twice, once with the CSI plugin and once with GCP plugin. Do you know a way to tell Velero to skip CSI snapshots if the PV is handled by GCP plugin (native snapshots)?
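One avenue worth checking (hedged — behavior depends on your Velero version and on whether GKE's CSI migration is serving those PDs through the CSI driver): the Velero CSI plugin only snapshots through a VolumeSnapshotClass that carries the `velero.io/csi-volumesnapshot-class` label for the PV's driver. Labeling only the Netapp CVS and Filestore classes, and leaving any PD CSI class unlabeled, may keep CSI snapshots from being attempted on PD-backed PVs. The class names below are illustrative; check `kubectl get volumesnapshotclass` for yours.

```shell
# Label only the snapshot classes whose PVs should get CSI snapshots
kubectl label volumesnapshotclass netapp-cvs-snapclass \
  velero.io/csi-volumesnapshot-class="true"
kubectl label volumesnapshotclass filestore-snapclass \
  velero.io/csi-volumesnapshot-class="true"

# Deliberately leave the GCP PD CSI snapshot class unlabeled so the
# CSI plugin has no class to use for PD-backed PVs; native GCP
# snapshots via velero-plugin-for-gcp still cover them
kubectl get volumesnapshotclass --show-labels
```

Worth verifying in a test backup whether Velero then skips the PDs quietly or logs errors for them; newer Velero releases also have backup resource policies that can skip volumes explicitly.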