#kubernetes (2022-02)
Archive: https://archive.sweetops.com/kubernetes/
2022-02-01
Is it possible to trigger a job when another job completes? Or would you need something like argo workflows to orchestrate the jobs?
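As far as I know, plain Kubernetes Jobs can't trigger each other natively, so a workflow engine is the usual route. A minimal Argo Workflows sketch of running one step after another (names and images are illustrative):
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: chained-jobs-
spec:
  entrypoint: main
  templates:
    - name: main
      steps:
        - - name: first-job
            template: first
        # the second step only starts once first-job completes successfully
        - - name: second-job
            template: second
    - name: first
      container:
        image: busybox
        command: [sh, -c, "echo running first job"]
    - name: second
      container:
        image: busybox
        command: [sh, -c, "echo running second job"]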
@Leo Przybylski has joined the channel
@Lucky has joined the channel
2022-02-02
Hi there. Has anyone seen this error after adding Datadog as a sidecar on EKS Fargate?
Your pod's cpu/memory requirements exceed the max Fargate configuration
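For what it's worth, Fargate sizes the pod from the sum of all container requests, and the largest Fargate configuration tops out around 4 vCPU / 30 GB, so the usual fix is trimming the sidecar's requests. A rough sketch (names, image tags and values are illustrative):
apiVersion: v1
kind: Pod
metadata:
  name: app-with-datadog        # illustrative
spec:
  containers:
    - name: app
      image: myorg/app:latest   # illustrative
      resources:
        requests:
          cpu: "1"
          memory: 2Gi
    - name: datadog-agent
      image: gcr.io/datadoghq/agent:7
      resources:
        requests:               # keep the sidecar's requests small so the
          cpu: 200m             # pod total stays within a Fargate profile
          memory: 256Mi
        limits:
          cpu: 500m
          memory: 512Mi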
WTH? you can’t delete messages anymore????????
I will check
try now to delete
now it works
my english is getting worse……
2022-02-03
2022-02-04
Hi folks, I'm currently having issues deploying metrics-server on EKS. I've installed the latest version via Terraform using the helm_release resource; however, I'm getting the following error:
$ kubectl top nodes
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)
$ kubectl describe apiservice v1beta1.metrics.k8s.io
Message: failing or missing response from https://10.0.5.36:4443/apis/metrics.k8s.io/v1beta1: Get "https://10.0.5.36:4443/apis/metrics.k8s.io/v1beta1": context deadline exceeded
Reason: FailedDiscoveryCheck
$ kubectl logs -n kube-system deploy/metrics-server
1 round_trippers.go:454] GET https://10.0.4.76:10250/stats/summary?only_cpu_and_memory=true 200 OK in 6 milliseconds
1 round_trippers.go:454] GET https://10.0.5.211:10250/stats/summary?only_cpu_and_memory=true 200 OK in 6 milliseconds
1 scraper.go:157] "Scrape finished" duration="20.409888ms" nodeCount=3 podCount=9
1 server.go:139] "Storing metrics"
1 server.go:144] "Scraping cycle complete"
1 handler.go:153] metrics-server: GET "/readyz" satisfied by nonGoRestful
EKS Cluster version: 1.21 Metrics Server version: 3.7.0
Does anybody know what could be wrong?
I’ve solved it by adding an additional rule to my security group:
node_security_group_additional_rules = {
  ms_4443_ing = {
    description                   = "Cluster API to metrics server 4443 ingress port"
    protocol                      = "tcp"
    from_port                     = 4443
    to_port                       = 4443
    type                          = "ingress"
    source_cluster_security_group = true
  }
  ...
}
2022-02-09
_Check out our upcoming Webinar:_ Managed Kubernetes Clusters: Avoiding risky defaults, K8s threat modeling and securing EKS clusters
Click here to register. Learn how to navigate the creation of a secure-by-default K8s cluster, avoid risky default settings and permissions, and listen to some live threat modeling of EKS clusters. Join Lightspin CISO Jonathan Rau and Director of Security Research Gafnit Amiga to discuss hot topics and tips for leveling up your Kubernetes security knowledge. Questions and topics covered include:
- Avoiding risky default settings in your Kubernetes clusters
- Creating a secure by default Kubernetes cluster
- Unique supply chain risks for Kubernetes
2022-02-13
Hi there,
I have a rolling update deployment strategy with max unavailable set to 0. Unfortunately, when I deploy, many pods get terminated and I'm not sure why.
Example:
Normal ScalingReplicaSet 30m deployment-controller Scaled down replica set deployment-xyz to 27
Normal ScalingReplicaSet 2m20s deployment-controller Scaled down replica set deployment-xyz to 10
It went from 27 to 10
Strategy:
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 0 max unavailable, 25% max surge
Any ideas where I could look at to maybe understand why this is happening?
Try a PodDisruptionBudget. Or increase your max surge
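For reference, a minimal PodDisruptionBudget sketch (name and selector are illustrative and must match the deployment's pod labels):
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: xyz-pdb          # illustrative
spec:
  minAvailable: "90%"    # keep at least 90% of pods up during voluntary disruptions
  selector:
    matchLabels:
      app: xyz           # must match the deployment's pod labels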
The problem was the replicas property of the deployment, which overrides the HPA every time something is deployed.
I guess I'll skip/remove the replicas property from the deployment if the HPA is enabled.
Oh, yes, you're supposed to do that.
It causes the deployment and HPA controllers to fight over the replica count.
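A minimal sketch of that combination, with replicas left out of the Deployment so the HPA owns the count (names, image, and thresholds are illustrative):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: xyz
spec:
  # no replicas field: the HPA manages the replica count
  selector:
    matchLabels:
      app: xyz
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0
      maxSurge: 25%
  template:
    metadata:
      labels:
        app: xyz
    spec:
      containers:
        - name: app
          image: myorg/app:latest   # illustrative
---
apiVersion: autoscaling/v2   # use autoscaling/v2beta2 on clusters older than 1.23
kind: HorizontalPodAutoscaler
metadata:
  name: xyz
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: xyz
  minReplicas: 10
  maxReplicas: 30
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70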
2022-02-14
2022-02-17
brew for kubernetes –> https://github.com/kbrew-dev/kbrew
kbrew is homebrew for Kubernetes
Not that it would be something I’d personally throw on a pipeline but could be useful for some rapid local testing perhaps
Hi, is anyone using https://github.com/terraform-aws-modules/terraform-aws-eks?
I'm trying to add some additional tags to the ELB that gets created from this module, but I'm having a hard time locating it. If anyone can point me to the right place, thanks!
i don’t believe this module creates a load balancer
in the document it shows all the resources created and none of them are a load balancer
https://github.com/terraform-aws-modules/terraform-aws-eks#resources
Terraform module to create an Elastic Kubernetes (EKS) cluster and associated worker instances on AWS
wouldn’t the ASG perform that?
i don’t believe the ASG would create a load balancer
I mean for instances to be attached to an ELB when the ASG resources are created.
I'm probably confusing things with Kubernetes external-dns (Helm).
external-dns creates the route53 record in the r53 hosted zone
perhaps the LB was created with an ingress resource from one of your helm charts
Ah yes, it looks like external-dns works with the ALB ingress controller to perform this: https://github.com/kubernetes-sigs/external-dns/blob/master/docs/tutorials/alb-ingress.md#ingress-examples
Using ExternalDNS with alb-ingress-controller
This tutorial describes how to use ExternalDNS with the aws-alb-ingress-controller.
Setting up ExternalDNS and aws-alb-ingress-controller
Follow the AWS tutorial to setup ExternalDNS for use in Kubernetes clusters running in AWS. Specify the source=ingress argument so that ExternalDNS will look for hostnames in Ingress objects. In addition, you may wish to limit which Ingress objects are used as an ExternalDNS source via the ingress-class argument, but this is not required.
For help setting up the ALB Ingress Controller, follow the Setup Guide.
Note that the ALB ingress controller uses the same tags for subnet auto-discovery as Kubernetes does with the AWS cloud provider.
In the examples that follow, it is assumed that you configured the ALB Ingress Controller with the ingress-class=alb argument (not to be confused with the same argument to ExternalDNS) so that the controller will only respect Ingress objects with the kubernetes.io/ingress.class annotation set to "alb".
Deploy an example application
Create the following sample "echoserver" application to demonstrate how ExternalDNS works with ALB ingress objects.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: echoserver
spec:
  replicas: 1
  selector:
    matchLabels:
      app: echoserver
  template:
    metadata:
      labels:
        app: echoserver
    spec:
      containers:
      - image: gcr.io/google_containers/echoserver:1.4
        imagePullPolicy: Always
        name: echoserver
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: echoserver
spec:
  ports:
  - port: 80
    targetPort: 8080
    protocol: TCP
  type: NodePort
  selector:
    app: echoserver
Note that the Service object is of type NodePort. We don't need a Service of type LoadBalancer here, since we will be using an Ingress to create an ALB.
Ingress examples
Create the following Ingress to expose the echoserver application to the Internet.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    kubernetes.io/ingress.class: alb
  name: echoserver
spec:
  ingressClassName: alb
  rules:
  - host: echoserver.mycluster.example.org
    http: &echoserver_root
      paths:
      - path: /
        backend:
          service:
            name: echoserver
            port:
              number: 80
        pathType: Prefix
  - host: echoserver.example.org
    http: *echoserver_root
The above should result in the creation of an (ipv4) ALB in AWS which will forward traffic to the echoserver application.
If the source=ingress argument is specified, then ExternalDNS will create DNS records based on the hosts specified in ingress objects. The above example would result in two alias records being created, echoserver.mycluster.example.org and echoserver.example.org, which both alias the ALB that is associated with the Ingress object.
Note that the above example makes use of the YAML anchor feature to avoid having to repeat the http section for multiple hosts that use the exact same paths. If this Ingress object will only be fronting one backend Service, we might instead create the following:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    external-dns.alpha.kubernetes.io/hostname: echoserver.mycluster.example.org, echoserver.example.org
    kubernetes.io/ingress.class: alb
  name: echoserver
spec:
  ingressClassName: alb
  rules:
  - http:
      paths:
      - path: /
        backend:
          service:
            name: echoserver
            port:
              number: 80
        pathType: Prefix
In the above example we create a default path that works for any hostname, and make use of the external-dns.alpha.kubernetes.io/hostname annotation to create multiple aliases for the resulting ALB.
Dualstack ALBs
AWS supports both IPv4 and "dualstack" (both IPv4 and IPv6) interfaces for ALBs. The ALB ingress controller uses the alb.ingress.kubernetes.io/ip-address-type annotation (which defaults to ipv4) to determine this. If this annotation is set to dualstack then ExternalDNS will create two alias records (one A record and one AAAA record) for each hostname associated with the Ingress object.
Example:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/ip-address-type: dualstack
    kubernetes.io/ingress.class: alb
  name: echoserver
spec:
  ingressClassName: alb
  rules:
  - host: echoserver.example.org
    http:
      paths:
      - path: /
        backend:
          service:
            name: echoserver
            port:
              number: 80
        pathType: Prefix
The above Ingress object will result in the creation of an ALB with a dualstack interface. ExternalDNS will create both an A echoserver.example.org record and an AAAA record of the same name, which are both aliases for the same ALB.
Ignore that I brought up the Terraform module, I was looking in the wrong place.
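Since the load balancer here comes from an ingress controller rather than the EKS module, one way to add tags is the controller's tags annotation on the Ingress. A sketch, assuming the AWS ALB ingress / load balancer controller (tag keys and values are illustrative):
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: echoserver
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
    # tags the controller applies to the ALB it creates
    alb.ingress.kubernetes.io/tags: Environment=dev,Team=platform
spec:
  ingressClassName: alb
  rules:
  - host: echoserver.example.org
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: echoserver
            port:
              number: 80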
2022-02-19
Collection of gadgets for debugging and introspecting Kubernetes applications using BPF
Hey, for the platform engineers over here, what do you think about this PR? https://github.com/kubernetes/kube-state-metrics/pull/1689
What this PR does / why we need it:
This PR introduces a new flag, --per-metric-labels-allowlist, whose syntax works like --metric-labels-allowlist, but instead of being a filter that adds labels to kube_X_labels, it is a filter for Kubernetes labels that will be added to each metric time series of a resource.
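For context, the existing flag takes a resource=[label,...] list; a hypothetical kube-state-metrics args snippet, assuming the proposed flag mirrors that syntax (the PR is still a draft, so the exact flag shape may change):
containers:
  - name: kube-state-metrics
    image: k8s.gcr.io/kube-state-metrics/kube-state-metrics:v2.3.0
    args:
      # existing behaviour: the label is exposed on kube_deployment_labels only
      - --metric-labels-allowlist=deployments=[app.kubernetes.io/name]
      # proposed (hypothetical syntax): the label is added to every deployment metric series
      - --per-metric-labels-allowlist=deployments=[app.kubernetes.io/name]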
Motivation
Motivation for this change can be better described from a Platform Team POV, who is responsible for the observability stack and provides it to multiple teams and/or multiple tenants.
The goal is to make it easier to define queries, alerts, and rules without the need for complex joins, lowering the barrier for smaller, less experienced teams, as well as alleviating pressure on the Prometheus server, which is constantly doing joins for each alert rule.
Use Case 1
- A Development Team wants to create alerts for the multiple components of their applications.
- Different Components have different alerts, severities, thresholds. (example, web pods, background consumers, different kinds of jobs) and as components live in the same namespace; filtering with a namespace is not feasible.
- You'll now have to use joins with kube_X_labels to filter for the specific resources. Complex queries become more complex, especially ones that had joins already.
Use Case 2
- Platform Team defining general, default rules for every namespace.
- Platform team expects resources to have a set of standard labels that define something like alerting.xxx/severity and alerting.xxx/slack-channel alongside the app.kubernetes.io/name and app.kubernetes.io/component ones.
- Complex rules are defined to join with these labels and generate alerts with variables based on labels defined by teams. Queries become even more complex as you want to join with more than one label.
Complex Queries Example
Deployment has not matched the expected number of replicas.
• Using only 1 label for joins
(
kube_deployment_spec_replicas{} * on (deployment) group_left(label_app_kubernetes_io_name) kube_deployment_labels{ label_app_kubernetes_io_name="emojiapp", namespace=~"emoji"}
!=
kube_deployment_status_replicas_available{} * on (deployment) group_left(label_app_kubernetes_io_name) kube_deployment_labels{ label_app_kubernetes_io_name="emojiapp", namespace=~"emoji"}
)
and
(
changes(kube_deployment_status_replicas_updated{}[10m] ) * on (deployment) group_left(label_app_kubernetes_io_name) kube_deployment_labels{ label_app_kubernetes_io_name="emojiapp", namespace=~"emoji"}
== 0
)
• Using 2 labels for joins
(
kube_deployment_spec_replicas{} * on (deployment) group_left(label_app_kubernetes_io_name, label_alerting_severity) kube_deployment_labels{ label_app_kubernetes_io_name="emojiapp", label_alerting_severity="critical", namespace=~"emoji"}
!=
kube_deployment_status_replicas_available{} * on (deployment) group_left(label_app_kubernetes_io_name, label_alerting_severity) kube_deployment_labels{ label_app_kubernetes_io_name="emojiapp", label_alerting_severity="critical", namespace=~"emoji"}
)
and
(
changes(kube_deployment_status_replicas_updated{}[10m] ) * on (deployment) group_left(label_app_kubernetes_io_name, label_alerting_severity) kube_deployment_labels{ label_app_kubernetes_io_name="emojiapp", label_alerting_severity="critical", namespace=~"emoji"}
== 0
)
Same query but with labels as part of metric series:
•
(
kube_deployment_spec_replicas{label_app_kubernetes_io_name="emojiapp", namespace=~"emoji"}
!=
kube_deployment_status_replicas_available{label_app_kubernetes_io_name="emojiapp", namespace=~"emoji"}
)
and
(
changes(kube_deployment_status_replicas_updated{label_app_kubernetes_io_name="emojiapp", namespace=~"emoji"}[10m] )
)
•
(
kube_deployment_spec_replicas{label_app_kubernetes_io_name="emojiapp", label_alerting_severity="critical", namespace=~"emoji"}
!=
kube_deployment_status_replicas_available{label_app_kubernetes_io_name="emojiapp", label_alerting_severity="critical", namespace=~"emoji"}
)
and
(
changes(kube_deployment_status_replicas_updated{label_app_kubernetes_io_name="emojiapp", label_alerting_severity="critical", namespace=~"emoji"}[10m] )
)
Goal
• Making it easier for Platform and Development teams to create queries, rules, alerts, and dashboards for resources running in Kubernetes without the need for complex joins to filter resources.
• Alleviate some pressure from the Prometheus Servers that are constantly running joins for each alert rule.
Alternatives we tried
• Run recording rules to pre-compute the joins:
Having to have a recording rule for each metric series generated by KSM is cumbersome, and adds a dependency between teams and platform teams to add any recording rule. Not ideal especially in a multi-tenant environment.
How does this change affect the cardinality of KSM
(increases, decreases, or does not change cardinality)
• Unless explicitly using the new flag, this PR has no change to generated metrics cardinality.
• Using the flag will add new labels for each metric series of each resource that has a label key that's whitelisted.
• Cardinality of the label values depends on how often these labels change for the same resource.
• For the use-case behind this new feature, whitelisted labels are often labels that don't change often. Admins should be cautious of what labels to whitelist.
Performance
• KSM already fetches each resource label during metric collection.
• Prometheus performance shouldn't be affected as long as whitelisted labels are not changing constantly.
Misc
Which issue(s) this PR fixes:
• Fixes #1415
Relevant:
• Dynamic alert routing with Prometheus and Alertmanager
Notes:
• :warning: This is a Draft PR, the implementation is not final, the PR is a working POC for the pods resource, and is yet to be discussed.
2022-02-22
Hi all! Our Managed Kubernetes Clusters: Avoiding risky defaults, K8s threat modeling and securing EKS clusters webinar starts in less than an hour; there's still time to register.
Learn how to navigate the creation of a secure-by-default K8s cluster, avoid risky default settings and permissions, and listen to some live threat modeling of EKS clusters. Join Lightspin CISO Jonathan Rau and Director of Security Research Gafnit Amiga to discuss hot topics and tips for leveling up your Kubernetes security knowledge. Questions and topics covered include:
- Avoiding risky default settings in your Kubernetes clusters
- Creating a secure by default Kubernetes cluster
- Unique supply chain risks for Kubernetes
Bring your questions and notepads to this live webinar!
2022-02-24
Hi, we're using the Kubernetes provider here and I'm wondering how others are using it when they have multiple clusters/contexts in play? Thanks!
Something as simple as the below would only allow a single context to be accessed; what if I have multiple contexts?
provider "kubernetes" {
  config_paths = [
    "~/.kube/config",
    "~/.kube_another_path/config"
  ]
  config_context = "cluster01"
}
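One common pattern is a provider alias per cluster/context and pinning each resource or module to an alias; a sketch (paths and context names are illustrative):
provider "kubernetes" {
  alias          = "cluster01"
  config_path    = "~/.kube/config"
  config_context = "cluster01"
}

provider "kubernetes" {
  alias          = "cluster02"
  config_path    = "~/.kube_another_path/config"
  config_context = "cluster02"
}

# each resource or module picks the cluster it targets
resource "kubernetes_namespace" "example" {
  provider = kubernetes.cluster02

  metadata {
    name = "example"
  }
}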
2022-02-25
Regarding an EKS upgrade from 1.18 to 1.21, one option that was proposed was to leave kube-proxy and the AMIs at the 1.18 versions to reduce the amount of change and expedite the process, following up with these upgrades (and probably more) in the future. I’m not comfortable with that one. Anybody have any thoughts on the matter? Thanks for any input.
@Eric Berg you'd be outside the recommended version skew policy, since the kubelet will be 3 minor versions older. Per https://kubernetes.io/releases/version-skew-policy/: kubelet must not be newer than kube-apiserver, and may be up to two minor versions older.
The maximum version skew supported between various Kubernetes components.
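A quick way to check the actual skew is to compare the API server and kubelet versions with standard kubectl:
$ kubectl version --short    # server version = control plane
$ kubectl get nodes          # the VERSION column shows each node's kubelet version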
Thanks for the feedback, @rafa_d. We are doing a fast-follow round of updates to core-dns and kube-proxy. Not optimal, but the logs are clean and so far so good.