SweetOps #kubernetes for October, 2022

Archive: https://archive.sweetops.com/kubernetes/

2022-10-07

Adnan

10:32:04 AM

Does anybody have experience with AWS EKS using AWS EFS?

I need a place to store/read some data (5-10MB file) very fast and have it available consistently on multiple pods.

Alanis Swanepoel

01:14:42 PM

What's wrong with using S3?

Alanis Swanepoel

01:14:56 PM

Is the data being mutated?

Alanis Swanepoel

01:15:24 PM

How would you handle file locks when using EFS?

Alanis Swanepoel

01:16:25 PM

If you really want to use EFS, you would need the CSI driver for it.

Alanis Swanepoel

01:16:49 PM

Take a look at https://www.eksworkshop.com/beginner/190_efs/efs-csi-driver/

Adnan

05:15:36 AM

Yes. I am not really interested in the implementation; rather, I am interested in the best solution. The data is mutated centrally (from one place) and then read by many pods.

Adnan

05:16:08 AM

I need something that would provide fast reads.

2022-10-08

2022-10-09

2022-10-12

sthapaprabesh2020

11:31:33 PM

Hello everyone, I hope everyone is having a good time. I wanted to reach out to the community and get some feedback on a small project I have been working on. :pray:

If you have used Kubernetes, then you know the pain of having to delete the unnecessary demo services you deploy into the cluster and later forget about. They might be there for playground purposes, for testing and validating, or just to get the baseline behaviour of a service or app. But over time, we (at least I) forget to delete them and clean up the cluster, which results in a large number of unnecessary demo services running on the cluster, taking resources and eventually adding to the operating cost.

Seeing this as a problem and taking inspiration from a similar tool, aws-nuke, I created a small, handy tool to nuke your k8s resources after use. The tool is called kube-cleanupper and, as the name suggests, it cleans up resources in the cluster. It is a helper service that can run however you want (cronjob, CLI, Docker, etc.) and scouts the cluster for objects with a particular cleanup label applied. Once it finds those resources, it checks them against the retention time and nukes them if they are past it. The default retention time is 4 days, and you can supply a custom retention time as well.

Services running on my cluster were getting OOM'd and I had to free up space, hence the motivation behind this service. Once you deploy it to the cluster, you can forget about any dev service you deployed, as it will be deleted after use. Just apply the label auto-clean=enabled and specify the retention, e.g. retention=6d

You can read more about it here: https://github.com/pgaijin66/kube-cleanupper

Note: In order to use this, you should already have a working cluster with ~/.kube present and kubeconfig for that cluster.

2022-10-13

2022-10-15

andre.claro

02:21:56 PM

Hello everyone, I have been reading the documentation about PDBs (PodDisruptionBudget), but I still have one doubt.

I want to guarantee that 50% of the pods are always available (minAvailable: 50%), but they must be healthy (ready = True). Health checks are done using a readiness probe.

Does the PDB consider the readiness (Ready state) of the pod?

Thanks
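
As far as the Kubernetes documentation describes it, a PDB counts a pod as healthy only when its Ready condition is True, and a percentage minAvailable is rounded up to a whole number of pods. The arithmetic can be sketched as below; the function names are hypothetical and this is a simplified model, not the disruption controller's code.

```go
package main

import "fmt"

// desiredHealthy converts a percentage minAvailable into a pod count,
// rounding up (for example 50% of 7 pods becomes 4).
func desiredHealthy(expectedPods, minAvailablePercent int) int {
	return (expectedPods*minAvailablePercent + 99) / 100
}

// disruptionsAllowed is how many voluntary evictions the budget permits,
// given how many pods currently report Ready.
func disruptionsAllowed(expectedPods, readyPods, minAvailablePercent int) int {
	allowed := readyPods - desiredHealthy(expectedPods, minAvailablePercent)
	if allowed < 0 {
		return 0
	}
	return allowed
}

func main() {
	fmt.Println(disruptionsAllowed(10, 10, 50)) // all pods Ready
	fmt.Println(disruptionsAllowed(10, 6, 50))  // four pods failing readiness
	fmt.Println(disruptionsAllowed(10, 4, 50))  // already below the budget
}
```

In this model, pods that fail their readiness probe reduce the number of evictions the budget allows, which is the behaviour the question asks about.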

2022-10-17

Mallikarjuna M

03:31:12 PM

Hello everyone, how can I block the creation of pods when the user does not specify any limits in the YAML file? Only for a particular namespace?

akhan4u

03:41:58 PM

Possibly by running OPA Gatekeeper in the cluster and writing a Rego policy for it.

akhan4u

03:42:37 PM

Ref: https://kubernetes.io/blog/2019/08/06/opa-gatekeeper-policy-and-governance-for-kubernetes/

Mallikarjuna M

03:35:38 PM

Whenever a user does not specify any limits in the YAML file, the cluster should throw an error saying that limits must be specified. How do I configure that?
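
The rule being asked for can be sketched in plain Go to make the intended logic concrete. This is not Rego and not Gatekeeper's API: the Pod and Container types and the validateLimits function are hypothetical stand-ins, and requiring both cpu and memory limits is an assumption.

```go
package main

import "fmt"

// Container and Pod are simplified, hypothetical stand-ins for the real API objects.
type Container struct {
	Name   string
	Limits map[string]string // e.g. {"cpu": "500m", "memory": "256Mi"}
}

type Pod struct {
	Namespace  string
	Containers []Container
}

// validateLimits returns one message per missing limit for pods in the
// enforced namespace; pods in other namespaces are not checked.
func validateLimits(pod Pod, enforcedNamespace string) []string {
	if pod.Namespace != enforcedNamespace {
		return nil
	}
	var violations []string
	for _, c := range pod.Containers {
		for _, res := range []string{"cpu", "memory"} {
			if c.Limits[res] == "" {
				violations = append(violations,
					fmt.Sprintf("container %q must specify a %s limit", c.Name, res))
			}
		}
	}
	return violations
}

func main() {
	pod := Pod{Namespace: "dev", Containers: []Container{{Name: "app"}}}
	for _, v := range validateLimits(pod, "dev") {
		fmt.Println(v)
	}
}
```

An admission policy would reject the pod whenever the returned list is non-empty and surface the messages to the user.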

2022-10-23

idan levi

12:09:05 PM

Hey all, I am adding NFS (EFS, to be honest) to one of my deployments. I created a file system through AWS (https://us-east-1.console.aws.amazon.com/efs?region=us-east-1#/get-started) and created a PersistentVolume and a PersistentVolumeClaim from it; both are in Bound status, so far all looks OK. When I try to mount the PVC in a pod I get this error:

Warning FailedMount    32s          kubelet       Unable to attach or mount volumes: unmounted volumes=[dump-vol], unattached volumes=[config dump-vol default-token-jfxl4]: timed out waiting for the condition

That’s my volume declaration:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: indexer-001-dev
spec:
  capacity:
    storage: 20Gi # Doesn't really matter, as EFS does not enforce it anyway
  volumeMode: Filesystem
  storageClassName: manual 
  accessModes:
    - ReadWriteMany
  mountOptions:
    - hard
    - nfsvers=4.1
    - rsize=1048576
    - wsize=1048576
    - timeo=600
    - retrans=2
    - noresvport
  nfs:
    path: /
    server: 10.X.X.X

I added port 2049 (the NFS port) as an inbound rule to my security group. Also, nfs-utils is installed on the host.

Is anyone familiar with the procedure?

Olivier Kaloudoff

08:37:15 AM

Hi Idan, NFS uses more than port 2049. Usually NFS is advertised through the portmapper service, running on port 111. Also mountd and/or nfsd might be of interest.

Olivier Kaloudoff

08:37:17 AM

https://linuxhint.com/nfs-ports/

Olivier Kaloudoff

08:37:21 AM

Hope this helps

github140

07:11:13 PM

EFS is NFSv4.x compatible. So there’s only port 2049.

2022-10-24

zadkiel

04:31:35 PM

Hey there! cluster-autoscaler removes my nodes before fluentbit is able to send all its logs. Any idea how to get around this?

James

03:46:28 AM

Hello everyone, just a quick one. When do you use kubectl create versus kubectl apply?

zadkiel

05:56:50 PM

Use kubectl create when you want to create a resource that does not exist yet. It fails if the resource already exists.

Use kubectl apply to update an existing resource. If the resource already exists, it retrieves the resource from the API, computes a patch, and pushes it. Otherwise, it does the equivalent of a kubectl create.

2022-10-25

mr.shayv

02:58:51 PM

Hey, is it possible to point a nodeSelector at the master node role instead of a label?

zadkiel

05:57:44 PM

Node selectors are based on labels. You can label your nodes with the role (controlplane, etcd, worker, for example), and then use a nodeSelector on that label.
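
The matching rule is simple equality on every key in the selector, sketched below with a hypothetical matches function. The role shown by kubectl get nodes is itself derived from labels prefixed with node-role.kubernetes.io/, so on many clusters a role label already exists; the label values in the example are assumptions about a typical setup.

```go
package main

import "fmt"

// matches reports whether a node's labels satisfy every key/value pair
// in the nodeSelector.
func matches(nodeLabels, nodeSelector map[string]string) bool {
	for k, want := range nodeSelector {
		if got, ok := nodeLabels[k]; !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	controlPlane := map[string]string{
		"kubernetes.io/hostname":                "cp-1", // hypothetical node name
		"node-role.kubernetes.io/control-plane": "",     // role labels often have an empty value
	}
	worker := map[string]string{"kubernetes.io/hostname": "worker-1"}
	selector := map[string]string{"node-role.kubernetes.io/control-plane": ""}

	fmt.Println(matches(controlPlane, selector))
	fmt.Println(matches(worker, selector))
}
```

Control-plane nodes are often tainted as well, in which case the pod also needs a matching toleration before it can be scheduled there.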

2022-10-27

Mallikarjuna M

09:42:39 AM

Hello everyone, does anyone know the best and easiest way to configure a VPN?

jark99

04:04:41 PM

For this service annotation file, do I have to manually hardcode aws_instance_endpoint or is that automatically read in from Metadata?

apiVersion: v1
kind: Service
metadata:
  name: postgres
  labels:
    tags.datadoghq.com/env: '<ENV>'
    tags.datadoghq.com/service: '<SERVICE>'
  annotations:
    ad.datadoghq.com/service.check_names: '["postgres"]'
    ad.datadoghq.com/service.init_configs: '[{}]'
    ad.datadoghq.com/service.instances: |
      [
        {
          "dbm": true,
          "host": "<AWS_INSTANCE_ENDPOINT>",
          "port": 5432,
          "username": "datadog",
          "password": "<UNIQUEPASSWORD>",
          "tags": "dbinstanceidentifier:<DB_INSTANCE_NAME>"
        }
      ]      
spec:
  ports:
  - port: 5432
    protocol: TCP
    targetPort: 5432
    name: postgres

Vincent Van der Kussen

06:00:41 AM

Annotations are strings afaik so yes, you’ll have to give it the value

2022-10-31

Saichovsky

02:31:04 PM

Hi,

I have a question regarding k8s behaviour when deleting namespaces.

I have cilium installed in my test cluster, and whenever I delete the cilium-system namespace, the hubble-ui pod gets stuck in Terminating state. The pod has a couple of containers, but I notice that one container named backend (a golang application) exits with code 137 when the namespace is deleted, and that's what leaves the namespace stuck in Terminating state. From what I am reading online, containers exit with 137 when they attempt to use more memory than they have been allocated. In my test cluster, no resources have been defined (spec.containers.[*].resources = {}).

I know how to force delete a namespace that is stuck in Terminating state, but unfortunately, that is a workaround and not a solution. This issue is breaking my CI pipeline.

How can I ensure a graceful exit of the backend container? How do I view the default memory quotas? No kind: quota object has been defined, but there must be some defaults, I believe.

package main

import (
	"context"
	"fmt"
	"net"
	"net/http"
	"os"
	"os/signal"
	"strconv"
	"time"

	gops "github.com/google/gops/agent"
	"github.com/improbable-eng/grpc-web/go/grpcweb"
	"github.com/sirupsen/logrus"
	"golang.org/x/sys/unix"
	"google.golang.org/grpc"

	"github.com/cilium/hubble-ui/backend/client"
	"github.com/cilium/hubble-ui/backend/internal/config"
	"github.com/cilium/hubble-ui/backend/internal/msg"
	"github.com/cilium/hubble-ui/backend/pkg/logger"
	"github.com/cilium/hubble-ui/backend/proto/ui"
	"github.com/cilium/hubble-ui/backend/server"
)

var (
	log = logger.New("ui-backend")
)

func runServer(cfg *config.Config) {
	// observerAddr := getObserverAddr()
	srv := server.New(cfg)

	if err := srv.Run(); err != nil {
		log.Errorf(msg.ServerSetupRunError, err)
		os.Exit(1)
	}

	grpcServer := grpc.NewServer()
	ui.RegisterUIServer(grpcServer, srv)

	wrappedGrpc := grpcweb.WrapServer(
		grpcServer,
		grpcweb.WithOriginFunc(func(origin string) bool {
			return true
		}),
		grpcweb.WithCorsForRegisteredEndpointsOnly(false),
	)

	handler := http.NewServeMux()
	handler.HandleFunc("/api/", func(resp http.ResponseWriter, req *http.Request) {
		// NOTE: GRPC server handles requests with URL like "ui.UI/functionName"
		req.URL.Path = req.URL.Path[len("/api/"):]
		wrappedGrpc.ServeHTTP(resp, req)
	})

	ctx, cancel := signal.NotifyContext(context.Background(), unix.SIGINT, unix.SIGTERM)
	defer cancel()

	addr := cfg.UIServerListenAddr()
	httpSrv := &http.Server{
		Addr:    addr,
		Handler: handler,
		BaseContext: func(net.Listener) context.Context {
			return ctx
		},
		ReadHeaderTimeout: 5 * time.Second,
	}
	log.Infof(msg.ServerSetupListeningAt, addr)
	if err := httpSrv.ListenAndServe(); err != nil {
		log.Errorf(msg.ServerSetupRunError, err)
		os.Exit(1)
	}
}

func runClient(cfg *config.Config) {
	addr := cfg.UIServerListenAddr()
	log.Infof("connecting to server: %s\n", addr)

	cl := client.New(addr)
	cl.Run()
}

func getMode() string {
	mode, _ := os.LookupEnv("MODE")
	if mode == "client" {
		return "client"
	}

	return "server"
}

func runGops() {
	if enabled, _ := strconv.ParseBool(os.Getenv("GOPS_ENABLED")); !enabled {
		return
	}

	gopsPort := "0"
	if gopsPortEnv := os.Getenv("GOPS_PORT"); gopsPortEnv != "" {
		gopsPort = gopsPortEnv
	}
	// Open socket for using gops to get stacktraces of the agent.
	addr := fmt.Sprintf("127.0.0.1:%s", gopsPort)
	addrField := logrus.Fields{"address": addr}

	if err := gops.Listen(gops.Options{
		Addr:                   addr,
		ReuseSocketAddrAndPort: true,
	}); err != nil {
		log.WithError(err).WithFields(addrField).Fatal("Cannot start gops server")
	}

	log.WithFields(addrField).Info("Started gops server")
}

func main() {
	runGops()

	cfg, err := config.Init()
	if err != nil {
		log.Errorf(msg.ServerSetupConfigInitError, err.Error())
		os.Exit(1)
	}

	if mode := getMode(); mode == "server" {
		runServer(cfg)
	} else {
		runClient(cfg)
	}
}