Hoping someone who ran into this can help. We recently upgraded our EKS cluster from 1.23 to 1.24, and now any new node group we create fails with
"Instances failed to join the kubernetes cluster". Prior to the upgrade it was working fine.
We also tried creating a brand-new cluster straight on 1.24, and that works. Something about the in-place upgrade from 1.23 to 1.24 might be causing this…
AWS support should be able to assist you on this. Also, did you update the AMI used by the new node pools?
i’ve reached out and am waiting for a reply from support, and yes, we’re using the latest Amazon Linux AMI
this one is a very strange issue
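For "Instances failed to join", a sketch of the usual first checks (commands assume kubectl access and SSM/SSH access to a stuck instance): EKS 1.24 AMIs dropped the Docker runtime in favor of containerd, so node-side kubelet/containerd logs plus the aws-auth role mapping are good places to start.

```shell
# Verify the node group's IAM role is mapped; an unmapped role is a
# common cause of "failed to join" after cluster changes
kubectl -n kube-system get configmap aws-auth -o yaml

# On a stuck instance (via SSM or SSH), inspect bootstrap and
# runtime logs; EKS 1.24 AMIs run containerd, not dockerd
journalctl -u kubelet --no-pager | tail -50
journalctl -u containerd --no-pager | tail -50
tail -50 /var/log/cloud-init-output.log
```

If the kubelet log shows authentication or TLS errors, the aws-auth mapping or node role policy is the likely culprit; if cloud-init failed, compare the bootstrap arguments against the new AMI's expectations.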
Hey, anyone have experience with Kubespray? Is it possible to deploy a Kubernetes cluster with Kubespray as a specific user that has sudo permissions, rather than the default root?
It’s not related to Kubespray, but to Ansible itself.
Put this in your ansible.cfg inside the project directory:
```ini
[defaults]
deprecation_warnings = True
timeout = 60
remote_user = <name of the remote user with sudo permissions>
private_key_file = </path/to/ssh/key>
host_key_checking = False
scp_if_ssh = True
force_valid_group_names = ignore
stdout_callback = debug
nocows = 1
callbacks_enabled = profile_tasks

[diff]
always = True

[ssh_connection]
sudo_user = root
pipelining = True
ssh_args = -o ControlMaster=auto -o ControlPersist=30m -o ConnectionAttempts=100 -o UserKnownHostsFile=/dev/null

[galaxy]
ignore_certs = True

[privilege_escalation]
become = True
become_method = sudo
become_user = root
```
BTW, some settings are already there: https://github.com/kubernetes-sigs/kubespray/blob/master/ansible.cfg
```ini
[ssh_connection]
pipelining=True
ansible_ssh_args = -o ControlMaster=auto -o ControlPersist=30m -o ConnectionAttempts=100 -o UserKnownHostsFile=/dev/null
#control_path = ~/.ssh/ansible-%%r@%%h:%%p

[defaults]
# https://github.com/ansible/ansible/issues/56930 (to ignore group names with - and .)
force_valid_group_names = ignore
host_key_checking=False
gathering = smart
fact_caching = jsonfile
fact_caching_connection = /tmp
fact_caching_timeout = 86400
stdout_callback = default
display_skipped_hosts = no
library = ./library
callbacks_enabled = profile_tasks,ara_default
roles_path = roles:$VIRTUAL_ENV/usr/local/share/kubespray/roles:$VIRTUAL_ENV/usr/local/share/ansible/roles:/usr/share/kubespray/roles
deprecation_warnings=False
inventory_ignore_extensions = ~, .orig, .bak, .ini, .cfg, .retry, .pyc, .pyo, .creds, .gpg

[inventory]
ignore_patterns = artifacts, credentials
```
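Alternatively (a sketch — the inventory path, username, and key path are placeholders), you can leave the stock ansible.cfg alone and pass the remote user and privilege escalation on the command line:

```shell
# Run the Kubespray playbook as a non-root sudo user;
# --become escalates to root on the target hosts via sudo
ansible-playbook -i inventory/mycluster/hosts.yaml \
  --user=deploy --become --become-user=root \
  --private-key=~/.ssh/id_ed25519 \
  cluster.yml
```

Command-line flags override ansible.cfg, which is handy when different clusters use different deploy users.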
Hi everyone. Is it possible to run commands in a Kubernetes pod without having to override the Docker entrypoint?
Is the pod running?
If so, you should be able to do a kubectl exec -it <pod-name> -- bash to obtain a bash shell
So I want to be able to run the command using args as part of the pod config
I’ve usually done one or the other but i imagine your approach would work. See if this helps: https://stackoverflow.com/a/49657024
The idea is that I do not want to have to override my entrypoint command from my pod.
@emem hopefully it helps.
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  containers:
  - name: my-container
    image: my-image:latest
    command: ["/bin/sh"]
    args: ["-c", "echo 'Hello, Kubernetes!' && ls /app"]
```
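If the goal is to keep the image's own ENTRYPOINT, a minimal sketch (image name and arguments are placeholders) is to set only `args` and omit `command`: in Kubernetes, `command` overrides the Dockerfile ENTRYPOINT, while `args` only overrides CMD, which is passed to the entrypoint as its arguments.

```yaml
# Setting only `args` keeps the image's ENTRYPOINT intact;
# args replaces the Dockerfile CMD and is handed to the
# entrypoint as its argument list.
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  containers:
  - name: my-container
    image: my-image:latest            # placeholder image
    args: ["--queue", "jobs", "--verbose"]  # hypothetical arguments
```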
What is the use case for this?
could be various uses. i am looking at it to model existing clusters
but yeah, i suppose you could deploy YAML against it for validation
Ah, actually there are some “User Stories” here to explain https://kwok.sigs.k8s.io/docs/design/introduction/
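For anyone wanting to try it, a quick sketch of spinning up a simulated cluster with kwokctl (cluster and node names are arbitrary; exact flags and the generated context name may vary by KWOK version):

```shell
# Create a simulated cluster -- no real kubelets run
kwokctl create cluster --name=demo

# kwokctl registers a kubeconfig context named kwok-<cluster>
kubectl config use-context kwok-demo

# Register a fake node managed by kwok (annotation per the KWOK docs)
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Node
metadata:
  name: fake-node-0
  annotations:
    kwok.x-k8s.io/node: fake
  labels:
    type: kwok
EOF

kubectl get nodes
```

Pods scheduled onto the fake node are simulated by the kwok controller, which is what makes large-scale scheduler and controller testing cheap.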
Do you know of any methods or tooling to assign a Pod a specific IO load (load average 1/5/15) profile?
We have several pods running as queue workers across the cluster; they only need about 300m CPU and 400Mi RAM, but each pod generates a lot of IO and compute load, slowing everything down.
Is there any method to distribute the pods better, other than just setting a high
resources.requests.cpu? We don’t want to waste computing power with almost-empty nodes.
Currently it’s not possible to change the software to be more performant.
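Not a direct load-profile control, but one way to spread the workers out without inflating CPU requests is a topology spread constraint (the labels and image below are hypothetical), which limits how many matching pods can pile up on any one node:

```yaml
# Spread queue-worker pods evenly across nodes so their IO load
# isn't concentrated; maxSkew: 1 allows at most one pod of
# difference between any two nodes.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: queue-worker
spec:
  replicas: 6
  selector:
    matchLabels:
      app: queue-worker          # hypothetical label
  template:
    metadata:
      labels:
        app: queue-worker
    spec:
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: ScheduleAnyway   # use DoNotSchedule to enforce strictly
        labelSelector:
          matchLabels:
            app: queue-worker
      containers:
      - name: worker
        image: my-worker:latest  # placeholder image
        resources:
          requests:
            cpu: 300m
            memory: 400Mi
```

`podAntiAffinity` on `kubernetes.io/hostname` achieves a similar effect; topology spread is usually the lighter-weight choice when you just want even distribution rather than strict one-per-node.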
We’re using velero to snapshot PVs on GKE clusters. We have classic GCP PD, Netapp CVS CSI and GCP Filestore CSI. We’re trying to use the Velero GCP plugin to backup our GCP PDs and the CSI plugin for the others. However, Velero tries to backup the GCP PDs twice, once with the CSI plugin and once with GCP plugin. Do you know a way to tell Velero to skip CSI snapshots if the PV is handled by GCP plugin (native snapshots)?
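One avenue worth checking (hedged — behavior depends on your Velero version and on whether GKE's CSI migration is serving those PDs through the CSI driver): the Velero CSI plugin only snapshots through a VolumeSnapshotClass that carries the `velero.io/csi-volumesnapshot-class` label for the PV's driver. Labeling only the Netapp CVS and Filestore classes, and leaving any PD CSI class unlabeled, may keep CSI snapshots from being attempted on PD-backed PVs. The class names below are illustrative; check `kubectl get volumesnapshotclass` for yours.

```shell
# Label only the snapshot classes whose PVs should get CSI snapshots
kubectl label volumesnapshotclass netapp-cvs-snapclass \
  velero.io/csi-volumesnapshot-class="true"
kubectl label volumesnapshotclass filestore-snapclass \
  velero.io/csi-volumesnapshot-class="true"

# Deliberately leave the GCP PD CSI snapshot class unlabeled so the
# CSI plugin has no class to use for PD-backed PVs; native GCP
# snapshots via velero-plugin-for-gcp still cover them
kubectl get volumesnapshotclass --show-labels
```

Worth verifying in a test backup whether Velero then skips the PDs quietly or logs errors for them; newer Velero releases also have backup resource policies that can skip volumes explicitly.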