#kubernetes (2022-07)
Archive: https://archive.sweetops.com/kubernetes/
2022-07-11
Has anybody ever experienced response time lag between an nginx and an application pod? I have some strange intermittent issues where the application responds in e.g. 300ms, but the nginx in front of it takes 24s.
2022-07-12
Hello All, we are running ELK in Kubernetes, installed via Helm charts from the stable repo (which is now deprecated, https://github.com/helm/charts/tree/master/stable, and all charts have moved to the elastic repo). Elasticsearch, Logstash and Kibana are all on version 6.7.0, and we want to upgrade to the latest, or at least to 7. The latest Elasticsearch version in the stable repo's charts is 6.8.6, so I'm assuming I can't just go to 7 or 8 with a plain "helm upgrade" command. So my questions are:
- Do we have to recreate the whole ELK cluster to upgrade to 7 or 8, downloading the chart from the elastic repo?
- Is there a way to upgrade to 7 or 8 without changing the repo (stable to elastic)? Thanks in advance
2022-07-18
Hey all, running EKS.
Is there a way to get certain pods to scale spot nodes, and additionally fall back to on demand when there's no capacity?
Alternatively, is there a way to run 10% of a workload on demand, and 90% on spot?
Context:
I'm trying to use tolerations and affinity to make EMR on EKS pods run on spot. When these cause a scale up, they sometimes autoscale the spot nodes and sometimes the on demand nodes. I would ideally like the spot nodes to get autoscaled until that's no longer possible, and then fall back to on demand nodes.
affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
            - key: executor-emr
              operator: In
              values:
                - "true"
tolerations:
  - key: executor-emr
    operator: Equal
    value: "true"
    effect: NoSchedule
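One cluster-autoscaler-native way to express "scale spot first, fall back to on demand" is the priority expander. A rough sketch, assuming the autoscaler runs with --expander=priority and that the node group (ASG) names contain "spot" / "on-demand" (the name patterns here are made up):
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-autoscaler-priority-expander
  namespace: kube-system
data:
  priorities: |-
    # higher number = higher priority; entries are regexes matched against node group names
    50:
      - .*spot.*
    10:
      - .*on-demand.*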
I would highly recommend you look into Karpenter. https://karpenter.sh/ It's an alternative to your traditional autoscaler, but with a LOT more smarts. It can do what you're looking for easily.
Just-in-time Nodes for Any Kubernetes Cluster
Yeah, I actually started a PoC to implement it and got 85-90% of the way there. I'm leaving in two weeks, however, so I really want to get solid autoscaling implemented first. The cluster autoscaler seemed like it was going to work, but now I don't think so, at least not perfectly.
The karpenter devs are active in the kubernetes.slack.com slack on the #karpenter channel if that helps. They may be able to answer any questions you have. They were very helpful for me in the past.
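For anyone finding this later, a minimal sketch of what that looks like with Karpenter's v1alpha5 Provisioner API (current at the time); when both capacity types are allowed, Karpenter generally prefers spot and falls back to on demand. The provisioner name is hypothetical, and the taint mirrors the toleration in the example above:
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: emr-executors
spec:
  requirements:
    # allow both capacity types; Karpenter will generally favour spot while it is available
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["spot", "on-demand"]
  # match the toleration used by the EMR executor pods
  taints:
    - key: executor-emr
      value: "true"
      effect: NoSchedule
  labels:
    executor-emr: "true"
  # scale empty nodes back down quickly
  ttlSecondsAfterEmpty: 30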
2022-07-19
I have a weird issue with my nginx-ingress-controller.
Sometimes it has the following log values from the nginx-ingress-controller:
upstream_duration: 4.560, 0.328
Where did 4 seconds go?
Upstream is another nginx with a duration of 328ms
Did anybody experience something like this before? How could I debug this?
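Not an answer, but one way to narrow this down is to add more timing variables to the controller's access log via its ConfigMap, so you can see whether the time goes into connecting, waiting for the upstream header, or a retried upstream. A sketch, assuming a standard ingress-nginx Helm install (ConfigMap name and namespace may differ):
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
data:
  # the $upstream_* variables become comma-separated lists when a request was retried against several upstreams
  log-format-upstream: >-
    $remote_addr [$time_local] "$request" $status
    request_time=$request_time
    upstream_addr=$upstream_addr
    upstream_connect_time=$upstream_connect_time
    upstream_header_time=$upstream_header_time
    upstream_response_time=$upstream_response_time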
2022-07-20
What's the normal number of namespaces you've seen in clusters?
Curious as I had a discussion about this with someone and realized the types of companies you’ve worked with probably bias this a lot.
Second question, more thread oriented
If you install a Helm chart that does automation with Kubernetes, such as managing secrets and other things… and you want to install it to a namespace…
Would you expect the app to only function within that namespace, even if it was doing Kubernetes automation? My gut is that I'd want a namespaced app to default to only acting on secrets and resources in that namespace explicitly, regardless of RBAC allowing more. However, that's my assumption. Curious if anyone else thinks automation for k8s within a namespace should be opt-in to all-namespace automation or opt-out of all-namespace automation as a rule.
Well secret access is scoped to Pods running in that namespace
But I think I follow either way
If you are using kube deployments and such via helm charts to do broader configuration against the cluster I don’t know that namespace scope matters so much as having the rbac rules of its SA approved in version control somewhere.
Maybe I misunderstand ya though
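To make the namespace-scoping point concrete, this is roughly what "only this namespace's secrets" looks like in RBAC: a namespaced Role plus a RoleBinding for the chart's ServiceAccount (names here are hypothetical), as opposed to a ClusterRole/ClusterRoleBinding, which would grant the same verbs cluster-wide:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: secret-automation
  namespace: my-app              # hypothetical namespace the chart is installed into
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get", "list", "watch", "create", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: secret-automation
  namespace: my-app
subjects:
  - kind: ServiceAccount
    name: my-chart               # hypothetical ServiceAccount created by the chart
    namespace: my-app
roleRef:
  kind: Role
  name: secret-automation
  apiGroup: rbac.authorization.k8s.io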
2022-07-21
2022-07-22
Has anybody reliably used CRON_TZ in CronJobs on Kubernetes 1.22?
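For context, this is the CRON_TZ prefix in question; on 1.22 it is honoured by the underlying cron library but was not officially supported (the dedicated timeZone field only arrived in later releases), so treat this as a sketch rather than a guarantee. The job name and schedule are made up:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-report
spec:
  # run at 03:00 Berlin time rather than 03:00 UTC
  schedule: "CRON_TZ=Europe/Berlin 0 3 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: report
              image: busybox
              command: ["sh", "-c", "date"]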
Hello, is anyone here familiar with Istio? I’m a beginner trying to get started but running into some issues.
what’s your question?
Hey @roth.andy, so I'm a complete noob at this. I'm trying to just get it up and running, to be honest. I've installed 1.14.1 using helm on EKS but the ingressgateway (I've just named it gateway) is not coming up. I keep seeing the errors below and I have no idea what they mean. Do you have any insight?
2022-07-22T21:45:05.292287Z warning envoy config StreamAggregatedResources gRPC config stream to xds-grpc closed since 6457s ago: 14, connection error: desc = "transport: Error while dialing dial tcp: lookup istiod.istio-system.svc: i/o timeout"
2022-07-22T21:45:28.893251Z warning envoy config StreamAggregatedResources gRPC config stream to xds-grpc closed since 6481s ago: 14, connection error: desc = "transport: Error while dialing dial tcp: lookup istiod.istio-system.svc: i/o timeout"
2022-07-22T21:45:34.294627Z warn ca ca request failed, starting attempt 1 in 105.763472ms
2022-07-22T21:45:34.401066Z warn ca ca request failed, starting attempt 2 in 189.64364ms
2022-07-22T21:45:34.591468Z warn ca ca request failed, starting attempt 3 in 377.860141ms
2022-07-22T21:45:34.970056Z warn ca ca request failed, starting attempt 4 in 732.207437ms
2022-07-22T21:45:35.702977Z warn sds failed to warm certificate: failed to generate workload certificate: create certificate: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup istiod.istio-system.svc on 172.20.0.10:53: read udp 10.253.25.91:45906->172.20.0.10:53: i/o timeout"
Thank you for responding as well btw
This is what my resources look like. It’s weird because my deployment is a bit unstable. Sometimes it works and sometimes it doesn’t
NAME                                 READY   STATUS    RESTARTS   AGE
pod/istio-gateway-7f885db475-jl74c   0/1     Running   0          112m
pod/istiod-57c86cdbd7-cgx42          1/1     Running   0          3h43m

NAME                    TYPE           CLUSTER-IP       EXTERNAL-IP       PORT(S)                                      AGE
service/istio-gateway   LoadBalancer   172.20.192.132   <commented_out>   15021:31706/TCP,80:32429/TCP,443:31799/TCP   112m
service/istiod          ClusterIP      172.20.9.46      <none>            15010/TCP,15012/TCP,443/TCP,15014/TCP        3h43m

NAME                            READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/istio-gateway   0/1     1            0           112m
deployment.apps/istiod          1/1     1            1           3h43m

NAME                                       DESIRED   CURRENT   READY   AGE
replicaset.apps/istio-gateway-7f885db475   1         1         0       112m
replicaset.apps/istiod-57c86cdbd7          1         1         1       3h43m

NAME                                                REFERENCE                  TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
horizontalpodautoscaler.autoscaling/istio-gateway   Deployment/istio-gateway   <unknown>/80%   1         5         1          112m
horizontalpodautoscaler.autoscaling/istiod          Deployment/istiod          <unknown>/80%   1         5         1          3h43m
I’m a little confused how the certificate process works
How are you deploying Istio? The operator? Or something else
What happens if you deploy the “demo” configuration profile?
Start from a configuration that works, then modify iteratively
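For reference, if the Istio operator is installed, deploying the demo profile is roughly a one-field IstioOperator resource (the metadata name here is arbitrary); with istioctl the equivalent is istioctl install --set profile=demo:
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  name: demo-install
  namespace: istio-system
spec:
  # the demo profile installs istiod plus ingress/egress gateways with permissive defaults
  profile: demo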
Hey Andrew, thanks for the reply. I was deploying the operator using helm per here actually: https://istio.io/latest/docs/setup/install/helm/
Install and configure Istio for in-depth evaluation.
but over the weekend, I actually realized this might be more of a kubernetes issue so I’m currently looking into that
2022-07-25
Hey, is there an alternative to matchLabels? If, for example, I want to match only at least 2 labels and not all of them.
What do you mean? matchLabels is a map of key value pairs
selector:
  matchLabels:
    app: nginx
    tier: frontend
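There isn't a true "any 2 of N" selector, but the closest alternative to matchLabels is matchExpressions (set-based requirements); note that all expressions are still ANDed together. A small sketch:
selector:
  matchExpressions:
    - key: app
      operator: In
      values:
        - nginx
    - key: tier
      operator: Exists        # matches any value of the tier label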
Is it possible to run pods only on nodes within a specified subnet? If for some reason, nodes cannot be started in that subnet, start the pods on the other nodes?
i guess the solution would be something like this
apiVersion: v1
kind: Pod
metadata:
  name: with-node-affinity
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: In
                values:
                  - antarctica-east1
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 10
          preference:
            matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: In
                values:
                  - eu-central-1a
                  - eu-central-1c
        - weight: 90
          preference:
            matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: In
                values:
                  - eu-central-1b
With this nodeAffinity configuration, Kubernetes will try to schedule:
• 90% of the pods on nodes with the label topology.kubernetes.io/zone=eu-central-1b, and
• 10% of the pods on nodes with the label topology.kubernetes.io/zone=eu-central-1a or topology.kubernetes.io/zone=eu-central-1c
Do I understand this correctly?