Did anybody ever experience response time lagging between an nginx and an application pod? I have some strange intermittent issues where the application responds in e.g. 300ms but the nginx in front of it responds in 24s.
Hello All, we are running ELK in Kubernetes, installed via helm charts. It was installed from the stable helm repo (which is now deprecated, https://github.com/helm/charts/tree/master/stable — all charts have moved to the elastic repo). Elasticsearch, Logstash and Kibana are all on version 6.7.0, and now we want to upgrade to the latest, or at least to 7. The latest version of Elasticsearch in the charts from the stable repo is 6.8.6, so I'm assuming I cannot just upgrade to version 7 or 8 using the “helm upgrade” command. So my questions are:
- Do we have to recreate the whole ELK cluster if we want to upgrade to 7 or 8, downloading the chart from the elastic repo?
- Is there a way to upgrade to version 7 or 8 without changing the repo (stable to elastic) info? Thanks in advance
Hey all, running EKS.
Is there a way to get certain pods to scale on spot nodes, and additionally fall back to on demand when there’s no capacity?
Alternatively, is there a way to run 10% of a workload on demand, and 90% on spot?
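Not from the thread, just a sketch of the spot-first-with-fallback idea: on EKS, managed node group nodes carry the label `eks.amazonaws.com/capacityType` with values `SPOT` or `ON_DEMAND`, so a weighted nodeAffinity can prefer spot while still allowing on-demand as a fallback. The deployment name and image here are made up for illustration:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spot-preferred            # hypothetical workload name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: spot-preferred
  template:
    metadata:
      labels:
        app: spot-preferred
    spec:
      affinity:
        nodeAffinity:
          # "preferred" means the scheduler tries spot first but will
          # still place pods on on-demand nodes when no spot node fits
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            preference:
              matchExpressions:
              - key: eks.amazonaws.com/capacityType
                operator: In
                values:
                - SPOT
      containers:
      - name: app
        image: nginx:1.23         # placeholder image
```

Caveat: a preferred affinity only biases placement onto nodes that already exist; the cluster autoscaler makes its own scale-up decision, which may be why scale-ups land on either group.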
I’m trying to use affinity to make EMR on EKS pods run on spot. When these cause a scale up, they sometimes autoscale the spot nodes, sometimes the on demand nodes. I would ideally like the spot nodes to get autoscaled until that’s not possible, and then fall back on to on demand nodes
affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 1
      preference:
        matchExpressions:
        - key: executor-emr
          operator: In
          values:
          - "true"
tolerations:
- key: executor-emr
  operator: Equal
  value: "true"
  effect: NoSchedule
Yeah, I actually started a poc to implement it and got 85-90% of the way there. I’m leaving in two weeks, however, so I really want to get solid autoscaling implemented first. The cluster autoscaler seemed like it was going to work, but now I don’t think so, at least not perfectly
I have a weird issue with my nginx-ingress-controller.
Sometimes it has the following log values from the nginx-ingress-controller:
upstream_duration: 4.560, 0.328
Where did 4 seconds go?
Upstream is another nginx with a duration of 328ms
Did anybody experience something like this before? How could I debug this?
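Not a confirmed diagnosis, but two comma-separated values in an upstream timing field usually mean nginx tried two upstream peers for the same request (a retry via proxy_next_upstream), so the 4.5s may be a failed first attempt. To narrow it down, nginx can log the per-phase upstream timings separately; `$upstream_connect_time`, `$upstream_header_time`, `$upstream_response_time` and `$request_time` are real nginx variables, while the format name below is illustrative (with ingress-nginx the equivalent knob is the `log-format-upstream` ConfigMap key):

```nginx
http {
    # Log where upstream time is spent:
    #   $upstream_connect_time  - TCP/TLS connect to the upstream
    #   $upstream_header_time   - time until the first response header
    #   $upstream_response_time - full upstream response time
    #   $request_time           - total time nginx spent on the request
    log_format upstream_timing '$remote_addr [$time_local] "$request" '
                               'request_time=$request_time '
                               'upstream_connect=$upstream_connect_time '
                               'upstream_header=$upstream_header_time '
                               'upstream_response=$upstream_response_time';

    access_log /var/log/nginx/access.log upstream_timing;
}
```

If `upstream_connect` accounts for the gap, it points at networking/DNS or a dead endpoint rather than the application itself.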
What’s the normal number of namespaces you’ve seen in clusters?
Second question, more thread oriented
If you install a helm chart that does automation with Kubernetes, such as secrets and other things, and you want to install it to a namespace…
Would you expect the app to only function within that namespace, even if it was doing Kubernetes automation? My gut is that I’d want a namespaced app to default to only running against secrets and resources in its own namespace, regardless of RBAC allowing more. However, that’s my assumption. Curious if anyone else thinks k8s automation installed into a namespace should be opt-in to all-namespace automation, or opt-out, as a rule.
Well secret access is scoped to Pods running in that namespace
But I think I follow either way
If you are using kube deployments and such via helm charts to do broader configuration against the cluster, I don’t know that namespace scope matters so much as having the RBAC rules of its ServiceAccount approved in version control somewhere.
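To make the namespace-scoping point concrete (my sketch, not from the thread): a namespaced Role plus RoleBinding grants a chart’s ServiceAccount access to Secrets only inside its own namespace; reaching other namespaces would require a ClusterRole/ClusterRoleBinding. All names here are hypothetical:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: secret-reader            # hypothetical name
  namespace: my-app              # the chart's install namespace
rules:
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: secret-reader-binding
  namespace: my-app
subjects:
- kind: ServiceAccount
  name: my-app-sa                # the chart's ServiceAccount
  namespace: my-app
roleRef:
  kind: Role
  name: secret-reader
  apiGroup: rbac.authorization.k8s.io
```

The Role/RoleBinding pair is what makes the automation opt-in per namespace: the app can only touch what the binding in each namespace grants.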
Maybe I misunderstand ya though
Anybody reliably used CRON_TZ in CronJobs on version 1.22?
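For reference (my sketch, not a confirmed answer): on 1.22 the `CRON_TZ=` prefix in the schedule string is behavior inherited from the underlying cron library and is not officially supported by Kubernetes, which is presumably why reliability is in question; the supported `spec.timeZone` field only arrived in later releases. A minimal example of the prefix form, with a made-up name and schedule:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: tz-example               # hypothetical name
spec:
  # CRON_TZ prefix: interpret the schedule in the given time zone.
  # Works on 1.22 via the cron library, but unsupported upstream.
  schedule: "CRON_TZ=Europe/Berlin 0 6 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: job
            image: busybox:1.36
            command: ["date"]
```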
Hello, is anyone here familiar with Istio? I’m a beginner trying to get started but running into some issues.
what’s your question?
Hey @roth.andy, so I’m a complete noob at this. I’m trying to just get it up and running to be honest. I’ve installed 1.14.1 using helm on EKS but the ingressgateway (I’ve just named it gateway) is not coming up. I keep seeing the below errors and I have no idea what it means. Do you have any insight?
2022-07-22T21:45:05.292287Z warning envoy config StreamAggregatedResources gRPC config stream to xds-grpc closed since 6457s ago: 14, connection error: desc = "transport: Error while dialing dial tcp: lookup istiod.istio-system.svc: i/o timeout"
2022-07-22T21:45:28.893251Z warning envoy config StreamAggregatedResources gRPC config stream to xds-grpc closed since 6481s ago: 14, connection error: desc = "transport: Error while dialing dial tcp: lookup istiod.istio-system.svc: i/o timeout"
2022-07-22T21:45:34.294627Z warn ca ca request failed, starting attempt 1 in 105.763472ms
2022-07-22T21:45:34.401066Z warn ca ca request failed, starting attempt 2 in 189.64364ms
2022-07-22T21:45:34.591468Z warn ca ca request failed, starting attempt 3 in 377.860141ms
2022-07-22T21:45:34.970056Z warn ca ca request failed, starting attempt 4 in 732.207437ms
2022-07-22T21:45:35.702977Z warn sds failed to warm certificate: failed to generate workload certificate: create certificate: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp: lookup istiod.istio-system.svc on 172.20.0.10:53: read udp 10.253.25.91:45906->172.20.0.10:53: i/o timeout"
Thank you for responding as well btw
This is what my resources look like. It’s weird because my deployment is a bit unstable. Sometimes it works and sometimes it doesn’t
NAME                                 READY   STATUS    RESTARTS   AGE
pod/istio-gateway-7f885db475-jl74c   0/1     Running   0          112m
pod/istiod-57c86cdbd7-cgx42          1/1     Running   0          3h43m

NAME                    TYPE           CLUSTER-IP       EXTERNAL-IP       PORT(S)                                      AGE
service/istio-gateway   LoadBalancer   172.20.192.132   <commented_out>   15021:31706/TCP,80:32429/TCP,443:31799/TCP   112m
service/istiod          ClusterIP      172.20.9.46      <none>            15010/TCP,15012/TCP,443/TCP,15014/TCP        3h43m

NAME                            READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/istio-gateway   0/1     1            0           112m
deployment.apps/istiod          1/1     1            1           3h43m

NAME                                       DESIRED   CURRENT   READY   AGE
replicaset.apps/istio-gateway-7f885db475   1         1         0       112m
replicaset.apps/istiod-57c86cdbd7          1         1         1       3h43m

NAME                                                REFERENCE                  TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
horizontalpodautoscaler.autoscaling/istio-gateway   Deployment/istio-gateway   <unknown>/80%   1         5         1          112m
horizontalpodautoscaler.autoscaling/istiod          Deployment/istiod          <unknown>/80%   1         5         1          3h43m
I’m a little confused how the certificate process works
How are you deploying Istio? The operator? Or something else?
What happens if you deploy the “demo” configuration profile?
Start from a configuration that works, then modify iteratively
but over the weekend, I actually realized this might be more of a kubernetes issue so I’m currently looking into that
Is it possible to run pods only on nodes within a specified subnet? And if, for some reason, nodes cannot be started in that subnet, start the pods on the other nodes?
I guess the solution would be something like this:
apiVersion: v1
kind: Pod
metadata:
  name: with-node-affinity
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
          - key: topology.kubernetes.io/zone
            operator: In
            values:
            - antarctica-east1
  containers:
  - name: with-node-affinity
    image: registry.k8s.io/pause:3.9
nodeAffinity:
  preferredDuringSchedulingIgnoredDuringExecution:
  - weight: 10
    preference:
      matchExpressions:
      - key: topology.kubernetes.io/zone
        operator: In
        values:
        - eu-central-1a
        - eu-central-1c
  - weight: 90
    preference:
      matchExpressions:
      - key: topology.kubernetes.io/zone
        operator: In
        values:
        - eu-central-1b
With this nodeAffinity configuration, Kubernetes will try to schedule:
• 90% of the pods on nodes with the label topology.kubernetes.io/zone=eu-central-1b
• 10% of the pods on nodes with the label topology.kubernetes.io/zone=eu-central-1a or eu-central-1c
Do I understand this correctly?