#kubernetes

Archive: https://archive.sweetops.com/kubernetes/

2019-02-28

Erik Osterman
linki/cloudformation-operator

A Kubernetes operator for managing CloudFormation stacks via a CustomResource - linki/cloudformation-operator

endofcake
Introducing the Istio Operator for Kubernetes · Banzai Cloud

Bringing cloud native to the enterprise, simplifying the transition to microservices on Kubernetes

2
Erik Osterman

Anyone using AWS Service Mesh?

Erik Osterman

I love Istio, but it’s k8s centric; we have an upcoming use-case to create a mesh across ECS and k8s

amaury.ravanel

I personally dislike the AWS policy of appropriating open source (App Mesh is essentially Istio), so maybe you can find a middle ground using a truly open-source project that runs on both ECS and Kubernetes, like Linkerd for example (I don’t have this use case and don’t use Linkerd myself)

2
James D. Bohrman

I’ve read about it a bit, never used it. It seems interesting.

one day istio will be independent of k8s

2019-02-25

what do you guys use for SSL certs?

amaury.ravanel

@btai which cert? the one facing our apps? or the ones needed by kube to work? (like api server, kubelet, …)

facing your apps

endofcake
grafana/loki

Like Prometheus, but for logs. Contribute to grafana/loki development by creating an account on GitHub.

zadkiel.aharonian

I tried it and it looks great, well integrated with Grafana Explore, and even better now there is a fluentd output plugin to send logs from all fluentd-enabled stacks (https://github.com/grafana/loki/tree/master/fluentd/fluent-plugin-loki). still, it’s in alpha and not prod ready for now

grafana/loki

Like Prometheus, but for logs. Contribute to grafana/loki development by creating an account on GitHub.

1
amaury.ravanel

@endofcake I know that @zadkiel.aharonian gave this a try

amaury.ravanel
jetstack/cert-manager

Automatically provision and manage TLS certificates in Kubernetes - jetstack/cert-manager

amaury.ravanel

this is what you need
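
For reference, a minimal sketch of the objects involved, assuming cert-manager is installed and a Let’s Encrypt HTTP-01 solver; every name and domain below is a placeholder, and the API group varies by release (older versions used certmanager.k8s.io/v1alpha1, later ones cert-manager.io/v1):

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod                  # placeholder name
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com              # placeholder contact address
    privateKeySecretRef:
      name: letsencrypt-prod-account-key  # secret that stores the ACME account key
    solvers:
      - http01:
          ingress:
            class: nginx                  # assumes an nginx ingress controller
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: example-com-tls
spec:
  secretName: example-com-tls             # TLS secret an Ingress can reference
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  dnsNames:
    - app.example.com                     # placeholder domain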

nice, I’m looking into that right now

whats the best way to generate some certs manually in the meantime?

amaury.ravanel

openssl man
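
In the meantime, a throwaway self-signed certificate is a couple of commands; a sketch, with the hostname and secret name as placeholders:

openssl req -x509 -nodes -newkey rsa:2048 -days 365 \
  -keyout tls.key -out tls.crt \
  -subj "/CN=app.example.com"

# load the pair into the cluster as a TLS secret
kubectl create secret tls app-example-tls --cert=tls.crt --key=tls.key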

can i generate some with letsencrypt?

amaury.ravanel
cloudflare/cfssl

CFSSL: Cloudflare’s PKI and TLS toolkit. Contribute to cloudflare/cfssl development by creating an account on GitHub.

amaury.ravanel

yes you can

amaury.ravanel

but man, certmanager is at most a 1-hour setup for basic certificate generation

yeah?

amaury.ravanel

yes !

amaury.ravanel

there is a helm chart for that too, in the GitHub repo I linked you
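
A sketch of the Helm 2-era install (the chart lived in the stable repo at the time and later moved to Jetstack’s own repo; CRD handling also varies by version):

helm install --name cert-manager --namespace kube-system stable/cert-manager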

amaury.ravanel

let me take a look; I have some documentation for this locally

Erik Osterman

anyone going?

2019-02-24

amaury.ravanel

@James D. Bohrman i’m using jaeger with k8s

James D. Bohrman

How do you like it? I’ve been playing with it a bit and am having fun with it.

Erik Osterman

@amaury.ravanel are you using it together with Istio?

amaury.ravanel

@Erik Osterman yes and no

amaury.ravanel

Let’s say not everywhere. I have tracing enabled by istio/envoy, but some components are not injected by istio (for performance reasons, …). So those just use the default Jaeger setup.

amaury.ravanel

@James D. Bohrman it’s very nice and easy to implement if you use it with a service mesh. otherwise you have to implement it in your code, so k8s won’t help you with it

amaury.ravanel

but I need to give the new Elastic APM feature for OpenTracing a shot

Erik Osterman

Has anyone looked into using AWS App Mesh (managed Envoy control plane ~ istio) with non-EKS kubernetes clusters? (e.g. #kops)

Erik Osterman


Is designed to be pluggable and will support bringing your own Envoy images and Istio Mixer in the future.

Erik Osterman


Today, AWS App Mesh is available to use in preview

Erik Osterman
solo-io/supergloo

The Service Mesh Orchestration Platform. Contribute to solo-io/supergloo development by creating an account on GitHub.

Erik Osterman

@mumoshu have you seen this?

Erik Osterman
SuperGloo: The Service Mesh Orchestration Platform – solo.io – Medium

Today we are thrilled to announce the release of SuperGloo, an open-source project to manage and orchestrate service meshes at scale…

mumoshu

yep! i like the cli and their vision.

not yet sure if it’s worth another abstraction at this point in time


Erik Osterman

yea….

Erik Osterman

did you use it with AWS App Mesh?

mumoshu

not yet. just interested in istio + appmesh

2019-02-23

James D. Bohrman

Has anyone seen this yet? I haven’t played with it, but it looks really cool.

Write a Tiltfile script that describes how your services fit together. Share it with your team so that any engineer can hack on any server. See a complete view of your system, from building to deploying to logging to crashing.

https://tilt.dev/

Tilt

Local Kubernetes development with no stress

James D. Bohrman

Anyone using Jaeger with K8s here?

2019-02-22

nutellinoit

@btai You can set a value for the namespace in values.yaml, e.g. “custom_namespace”, and then reference it in the templates as {{ .Values.custom_namespace }}
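
A sketch of that pattern, with placeholder names throughout:

# values.yaml
custom_namespace: staging

# templates/deployment.yaml (fragment referencing the value)
metadata:
  name: my-app
  namespace: {{ .Values.custom_namespace }}

The value can then be overridden per environment, e.g. helm install ./my-chart --set custom_namespace=qa.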

frednotet

Hi everyone! Does somebody know the simplest way to enable HPAs on a fresh new kops cluster? metrics-server cannot connect (401 forbidden) and I can’t find the solution to retrieve metrics… maybe another solution?

amaury.ravanel
frednotet

Thanks @amaury.ravanel but I already saw it and it didn’t help solve it

frednotet

I’m still having the same issue… it works on kube-system but not on the other namespaces

frednotet

In case anyone reads this… It’s very strange: I had to roll out the nodes & masters and now it works everywhere…

amaury.ravanel

Did you do the steps defined in the issue? If so, those require a rolling update to work, because kops installs kubelet on both the nodes and the masters, and kubelet has to be restarted.

amaury.ravanel

Your case seems weird man ^^. Can you elaborate on the issue a bit? Is this a new cluster? What version is it? Did you do an update (if so, which versions)? Did you update your kops binary (if so, which versions)? How do you use kops? (Gitops / tf / cf / nothing and pray)

frednotet

well, I have another problem actually

frednotet

maybe they’re related

frednotet

so I did several tests on a fresh new cluster

frednotet

(I have 3 clusters: “test”, “stg” and “prd”. those 3 are fresh new and are coded with terraform/kops)

frednotet

I now realize that I have 6 masters instead of 3

frednotet

if I force a rolling-update, it creates new instances but they’re not healthy enough to join the cluster

frednotet

I see in their kubeconfig that they’re still configured on 127.0.0.1 instead of the k8s API. If I manually change this (+ restart kubelet), it will join the cluster

frednotet

but I have this error :

frednotet

Unable to perform initial IP allocation check: unable to refresh the service IP block: client: etcd cluster is unavailable or misconfigured; error #0: dial tcp 127.0.0.1:4001: connect: connection refused

frednotet

and the validation failed. I think that’s the reason why it spins up new EC2 instances without releasing the old ones

frednotet

I think I will delete the full cluster and re-init it ‘cause I’m really lost and all my google is purple instead of blue now ^^

frednotet

even if I’d like to understand…

amaury.ravanel

I just finished reading

amaury.ravanel

what cni are you using? if calico, check that your nodes can reach the etcd cluster

amaury.ravanel

it’s weird that you are using port 4001 for etcd

amaury.ravanel

what version of etcd / kubernetes are you using ? are you using etcd-manager (opt-in by default on kops w/ kube >= 1.11) ? if yes can you paste me the /etc/hosts of your masters please ?

amaury.ravanel

can you type this command against your etcd cluster and paste the output => etcdctl cluster-health

frednotet

I was using weave but I changed, reinstalled everything with Calico… and everything works fine

frednotet

1.11.6 if I remember correctly (> 1.11 anyway, since I integrated Spotinst and it needs 1.11)

frednotet

thanks for your help, even if I reset everything…

frednotet

I can reproduce it actually… My cluster was working fine after a fresh installation… I edited the instance group to add more nodes and then I had to rolling-update the cluster

frednotet

the new master comes up; the old one is terminated… but the new ones have a /var/lib/kubelet/kubeconfig set to 127.0.0.1 instead of the API

frednotet
kops rolling-update cluster k8s.stg.**********.io --state=s3://***********-stg-kops-state --yes                  
NAME			STATUS		NEEDUPDATE	READY	MIN	MAX	NODES
master-eu-west-1a	NeedsUpdate	1		0	1	1	1
master-eu-west-1b	NeedsUpdate	1		0	1	1	1
master-eu-west-1c	NeedsUpdate	1		0	1	1	1
nodes			NeedsUpdate	5		0	5	20	5
I0225 23:04:28.528274   63403 instancegroups.go:165] Draining the node: "ip-10-62-103-158.eu-west-1.compute.internal".
node/ip-10-62-103-158.eu-west-1.compute.internal cordoned
node/ip-10-62-103-158.eu-west-1.compute.internal cordoned
WARNING: Ignoring DaemonSet-managed pods: calico-node-4ql85
pod/calico-kube-controllers-77bb8588fc-qcb4h evicted
pod/dns-controller-5dc57b7c99-dtw8j evicted
I0225 23:04:42.275404   63403 instancegroups.go:358] Waiting for 1m30s for pods to stabilize after draining.
I0225 23:06:12.280987   63403 instancegroups.go:185] deleting node "ip-10-62-103-158.eu-west-1.compute.internal" from kubernetes
I0225 23:06:12.340897   63403 instancegroups.go:299] Stopping instance "i-07f15ebb7078aec08", node "ip-10-62-103-158.eu-west-1.compute.internal", in group "master-eu-west-1c.masters.k8s.stg.musimap.io" (this may take a while).
I0225 23:06:15.287836   63403 instancegroups.go:198] waiting for 5m0s after terminating instance
I0225 23:11:15.299756   63403 instancegroups.go:209] Validating the cluster.
I0225 23:11:17.347229   63403 instancegroups.go:273] Cluster did not pass validation, will try again in "30s" until duration "5m0s" expires: machine "i-0567076920fedf435" has not yet joined cluster.
I0225 23:11:48.468847   63403 instancegroups.go:273] Cluster did not pass validation, will try again in "30s" until duration "5m0s" expires: machine "i-0567076920fedf435" has not yet joined cluster.
I0225 23:12:23.592726   63403 instancegroups.go:273] Cluster did not pass validation, will try again in "30s" until duration "5m0s" expires: machine "i-0567076920fedf435" has not yet joined cluster.
I0225 23:12:48.538343   63403 instancegroups.go:273] Cluster did not pass validation, will try again in "30s" until duration "5m0s" expires: machine "i-0567076920fedf435" has not yet joined cluster.
I0225 23:13:18.516763   63403 instancegroups.go:273] Cluster did not pass validation, will try again in "30s" until duration "5m0s" expires: machine "i-0567076920fedf435" has not yet joined cluster.
I0225 23:13:48.512016   63403 instancegroups.go:273] Cluster did not pass validation, will try again in "30s" until duration "5m0s" expires: machine "i-0567076920fedf435" has not yet joined cluster.
I0225 23:14:18.697398   63403 instancegroups.go:273] Cluster did not pass validation, will try again in "30s" until duration "5m0s" expires: machine "i-0567076920fedf435" has not yet joined cluster.
I0225 23:14:48.490544   63403 instancegroups.go:273] Cluster did not pass validation, will try again in "30s" until duration "5m0s" expires: machine "i-0567076920fedf435" has not yet joined cluster.
I0225 23:15:18.539400   63403 instancegroups.go:273] Cluster did not pass validation, will try again in "30s" until duration "5m0s" expires: machine "i-0567076920fedf435" has not yet joined cluster.
I0225 23:15:48.672146   63403 instancegroups.go:273] Cluster did not pass validation, will try again in "30s" until duration "5m0s" expires: master "ip-10-62-103-6.eu-west-1.compute.internal" is not ready.
E0225 23:16:17.352484   63403 instancegroups.go:214] Cluster did not validate within 5m0s

master not healthy after update, stopping rolling-update: "error validating cluster after removing a node: cluster did not validate within a duration of \"5m0s\""
amaury.ravanel

are you saying that you are changing the number of nodes and it brings you new masters ?

2019-02-21

mpogrebnyak

hello, does anyone know how I can limit inbound traffic to AWS EKS nodes?

joshmyers

Limit inbound according to?

1
joshmyers

Close your security groups

for helm, do you guys do multiple helm installs for dependent helm packages or do you nest them in your helm package for the application being deployed?

Erik Osterman

I avoid chart dependencies and use mostly helmfiles; makes it easier to swap out pieces and target individual services for upgrades
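
A sketch of what that looks like in a helmfile.yaml (chart names and values files are placeholders; the stable repo was current at the time):

releases:
  # each release is installed/upgraded independently
  - name: nginx-ingress
    namespace: kube-system
    chart: stable/nginx-ingress
    values:
      - values/nginx-ingress.yaml
  - name: external-dns
    namespace: kube-system
    chart: stable/external-dns

Individual releases can then be targeted with label selectors, e.g. helmfile --selector name=nginx-ingress sync.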

helm/charts

Curated applications for Kubernetes. Contribute to helm/charts development by creating an account on GitHub.

and I’m curious how I should use it, because it sets the namespace to be the namespace of the helm release. but what if I don’t necessarily want to do that? Should I just modify the helm package files after I fetch them, or is that bad practice?

Erik Osterman

are you passing --namespace?

i wanted to avoid passing --namespace

2019-02-19

do you guys have an example using alb-ingress-controller with istio?

Erik Osterman

not together

2019-02-18

maarten

Anyone using Vault instead of Kiam? I’m new to k8s and wondering what the advantages & drawbacks are of using Vault like this.

joshmyers

For AWS authentication? You have to manage Vault for a start

joshmyers

Vault could allow more flexibility than Kiam

maarten

Figured the kiam server needs to be managed as well, was hoping for it to be more elegant like the ecs-agent in that respect.

joshmyers

Yeah, you need to manage that too, agents and server

joshmyers

Has proved interesting in the past but I think mostly OK now

joshmyers

Vault does a lot more than Kiam though

joshmyers

How much do you want those other features?

maarten

I think Vault was chosen for the application secrets, so the logical step here would be adding the iam sessions

joshmyers

kiam is strictly around AWS services

joshmyers

If already using Vault, I’d stick with it over Kiam for IAM stuff

joshmyers

if not, kiam may be lower-hanging fruit

maarten

thanks Josh!

joshmyers

IMO anyway, others will have other views

maarten

for sure, no worries

maarten

( Still liking ECS even more, knowing all this )

joshmyers

Nope ^^ , but if you are already running it and have gone through that pain…

joshmyers

If you are on AWS, SSM and Kiam may get you what you want more easily

maarten

but I guess what Vault can also do is combine GCP with AWS, for the ones thinking about that…

joshmyers

Sure….

joshmyers

but I don’t know of many folks actually doing that

joshmyers

Multi provider is hard.

joshmyers

Vendor lock in is a thing

joshmyers

It’s all a tradeoff

joshmyers

I also don’t really care about being locked into AWS

maarten

me neither, they keep adding new stuff, and it works.

1
Erik Osterman
banzaicloud/bank-vaults

A Vault swiss-army knife: A K8s operator. Go client with automatic token renewal, Kubernetes support, dynamic secrets, multiple unseal options and more. A CLI tool to init, unseal and configure Vau…

1
Erik Osterman

saw that the other day

Erik Osterman

looks interesting and is related

joshmyers

Ah nice

joshmyers
UKHomeOffice/vault-sidekick

Vault sidekick. Contribute to UKHomeOffice/vault-sidekick development by creating an account on GitHub.

joshmyers

bank-vaults looks fuller featured

joshmyers

Certainly more complex than Kiam to manage

Erik Osterman

haha yea

Erik Osterman

seriously

Erik Osterman

what I’d like to see (and it probably exists) is something that implements the AWS IAM metadata proxy pattern of kube2iam/kiam, but uses Vault as the mediator

Erik Osterman

then uses the iam.amazonaws.com/role annotation just like kube2iam and kiam

Erik Osterman

that way the interface is interchangeable
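
For reference, the annotation-driven interface looks like this on a pod; a sketch with placeholder names (kiam additionally requires a namespace-level iam.amazonaws.com/permitted annotation):

apiVersion: v1
kind: Pod
metadata:
  name: my-app
  annotations:
    iam.amazonaws.com/role: my-app-role   # IAM role assumed via the metadata proxy
spec:
  containers:
    - name: my-app
      image: my-app:latest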

joshmyers

Annotations are a super nice way to drive those things in k8s

Erik Osterman
gardener/machine-controller-manager

Declarative way of managing machines for Kubernetes cluster - gardener/machine-controller-manager

Erik Osterman

looks sweet

Erik Osterman

apparently 100% open source

2019-02-15

Erik Osterman
open-policy-agent/gatekeeper

Gatekeeper - Policy Controller for Kubernetes. Contribute to open-policy-agent/gatekeeper development by creating an account on GitHub.

What container registry do you guys use?

johncblandii

Just stood up JFrog. We’re actively moving there.

ECR is the current option we use.

johncblandii

You?

Erik Osterman

Are you also using other parts of Artifactory?

johncblandii

As in Xray? If so, about to. As in other registries, definitely will be using it for npm and potentially some maven/etc packages.

we use quay, but I’m getting very frustrated with their support cause I haven’t been able to upgrade our plan for more private repos

how is ECR @johncblandii

johncblandii

ECR is ok but can be a pain. you get 1 repository per image (you can tag within it), so you don’t have a single “mydockerreg/image:tag” namespace covering multiple images. You create a repository per image and reference the whole thing like: [registryid].dkr.ecr.[region].amazonaws.com/[image]:[tag]. Up to the [tag] part is locked in as the image URI.

I guess you could get fancy with a generic image name and customize per tag for the rest but layers would prob be an issue at that point.
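
The day-to-day workflow for that repository-per-image model looks roughly like this (account ID, region, and image name are placeholders; aws ecr get-login is the v1 CLI form from that era):

aws ecr create-repository --repository-name my-app
$(aws ecr get-login --no-include-email --region us-east-1)   # docker login
docker tag my-app:latest 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest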

johncblandii

but it is decent. it definitely wouldn’t be something I’d recommend for someone with a lot of images

thanks @johncblandii

1

would you guys say if we were to use Istio for traffic management, we could just stay with classic AWS ELBs?

Erik Osterman

Yes

Erik Osterman

I’m still not jazzed on ALBs + k8s

Erik Osterman

current implementation creates one ALB per Ingress

Erik Osterman

also, switching from classic ELBs to NLBs is trivial

Erik Osterman
  annotations:
    # by default the type is elb (classic load balancer).
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
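
Expanded into a full Service, that might look like the following sketch (name, selector, and ports are placeholders):

apiVersion: v1
kind: Service
metadata:
  name: nginx-ingress
  annotations:
    # switch the provisioned load balancer from classic ELB to NLB
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
spec:
  type: LoadBalancer
  selector:
    app: nginx-ingress
  ports:
    - name: https
      port: 443
      targetPort: 443
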
Erik Osterman

the downside with ELB classic is you lose the client IP

Erik Osterman

this can be hacked with Proxy Protocol

Erik Osterman

but nginx-ingress doesn’t report the target port with Proxy Protocol correctly, so you don’t know if the user is using TLS or not

sarkis

do ALBs still take forever to create?

Erik Osterman

Yea, they’re slow to create too

2019-02-14

sarkis

what instance sizes are your master/worker nodes @ramesh.mimit

sarkis

i was reading about some issues with t3, m5, c5, basically the new hypervisor (nitro) instances, having this problem

ramesh.mimit

i am using r5 instances, @sarkis and checked they are supported

@sarkis can you link where you were reading that?

sarkis
Pods stuck in ContainerCreating due to CNI Failing to Assing IP to Container Until aws-node is deleted · Issue #59 · aws/amazon-vpc-cni-k8s

On a node that is only 3 days old all containers scheduled to be created on this node get stuck in ContainerCreating. This is on an m4.large node. The AWS console shows that it has the maximum numb…

sarkis

multiple reports of t3, m5, r5 ^ which are all the new nitro instances

oo thanks, looks like it’s happening as recently as 3 days ago. I guess i will revert to r4 instances

sarkis

nw! curious, were you also seeing these issues? and doubly curious if it fixes the problem

Erik Osterman
05:20:40 AM

@Erik Osterman set the channel purpose: Archive: https://archive.sweetops.com/kubernetes/

2019-02-13

Erik Osterman

@rohit.verma haven’t had to do that

Erik Osterman

though I have had to do other things related to networking in kops and it has always ended with me destroying/recreating =(

Erik Osterman

---

Erik Osterman

@Ryan
Have you run https://github.com/mumoshu/aws-secret-operator ?

Because for the life of me I can’t get it to create secrets. https://github.com/mumoshu/aws-secret-operator/issues/1 is my issue as well .. just curious if you ran into this

Erik Osterman

@mumoshu

Ryan
11:09:04 PM

@Ryan has joined the channel

have you guys used envoy?

thoughts on it?

Erik Osterman

we have a basic example……

Erik Osterman
cloudposse/example-app

Example application for CI/CD demonstrations of Codefresh - cloudposse/example-app

Erik Osterman

with istio (envoy sidecar injection)

Erik Osterman

TL;DR: was impressed how it works and want to do more with it


i dont really need service mesh/service discovery

is it worth it just for proxying/traffic mgmt

Erik Osterman

yea

Erik Osterman

traffic mgmt / shaping is what i like

Erik Osterman

circuit breakers, rate limiting, auth, etc

whats shaping?

Erik Osterman

how the traffic flows across deployments (canary releases)

ahh

sorry im not super familiar with istio, is it recommended to run envoy w/istio?

daveyu

i haven’t used it yet, but i like the promise of standardized request logging also

can i just run envoy as my proxy layer?

Erik Osterman

so istio is a way to manage envoy sidecars

Erik Osterman

linkerd does the same thing

Erik Osterman

and there are other ways too

ah so i deploy istio and it deploys envoy sidecars for me in my pods

Erik Osterman

yup

so i currently use traefik as my reverse proxy

deployed as daemon set (pod on each node)

is envoy considered an optimization?

Erik Osterman

basically istio helps you deploy envoy on k8s

Erik Osterman

i like traefik too, but we haven’t used it in the same context

Erik Osterman

not sure if the feature set overlaps

have you guys used istio with EKS?

not sure if its outdated, but if you look under prereqs it doesn’t mention EKS

Erik Osterman

no

Erik Osterman

@johncblandii might also have done some research into that

Getting Started with Istio on Amazon EKS | Amazon Web Services

Service Meshes enable service-to-service communication in a secure, reliable, and observable way. In this multi-part blog series, Matt Turner, founding engineer at Tetrate, will explain the concept of a Service Mesh, shows how Istio can be installed as a Service Mesh on a Kubernetes cluster running on AWS using Amazon EKS, and then explain some […]

sweet

Erik Osterman

ohhh

Erik Osterman

i misread EKS (!= ECS)

yeah no, eks

after using k8s, no point in using ecs

Erik Osterman

lol

Erik Osterman

yes

johncblandii

I didn’t actually use Istio. I started to mess with it but hadn’t. We are using EKS and ECS (Fargate), though.

ramesh.mimit

Has anyone faced an issue where CoreDNS pods get stuck at “ContainerCreating”?

Erik Osterman

What do you see when you describe the pod?

ramesh.mimit

kubelet, ip-10-225-0-236.ec2.internal Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container “2c2fa70a9231264ea9e67bd058126b67fee7409691c74165590a75bfecf29d1f” network for pod “coredns-7bcbfc4774-kxqmd”: NetworkPlugin cni failed to set up pod “coredns-7bcbfc4774-kxqmd_kube-system” network: add cmd: failed to assign an IP address to container

ramesh.mimit

something like that

ramesh.mimit

cni plugin version is 1.2.1

ramesh.mimit

i have checked, it’s not related to the EC2 instance, networking, or IP addresses in the subnet

Erik Osterman

Haven’t had that, but that error looks to be a pretty good hint

ramesh.mimit

my subnet has a lot of free IPs and the instance has only 3 ENIs used, and it can attach up to 10

2019-02-12

rohit.verma

hi all, wondering how we can retain the NAT IP when recreating a cluster using kops.

rohit.verma

there is an open issue https://github.com/kubernetes/kops/issues/3182 but I couldn’t find a better solution

Re-using existing elastic IPs for NAT gateways created by kops · Issue #3182 · kubernetes/kops

We currently have a kops cluster with a private topology. If we need to re-create this cluster, the elastic IPs associated with the NAT gateways are deleted, and new EIPs are allocated when the rep…

rohit.verma

all solutions are more about deleting the cluster manually

2019-02-10

dryack
09:06:50 AM

@dryack has joined the channel

2019-02-08

nutellinoit

Hi everyone, is there a project that manages EKS worker scale-in using lifecycle hooks and Lambda?

Erik Osterman

That is what the cluster autoscaler is used for

Erik Osterman

In other words, using a lambda to scale the cluster node pools could work, but it’s not the prescribed way in Kubernetes

Erik Osterman
kubernetes/autoscaler

Autoscaling components for Kubernetes. Contribute to kubernetes/autoscaler development by creating an account on GitHub.
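
For the scale-in side, the autoscaler drains and removes underutilized nodes itself once it can discover the ASG. A sketch of the relevant container flags (the cluster name is a placeholder, and the ASG must carry matching k8s.io/cluster-autoscaler/enabled and k8s.io/cluster-autoscaler/<cluster-name> tags):

command:
  - ./cluster-autoscaler
  - --cloud-provider=aws
  - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/my-eks-cluster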

nutellinoit

Thank you Erik

nutellinoit

but i only need to manage the scale-in, when a node is removed by the ASG

nutellinoit

i’m writing a new lambda that does kubectl drain on the node, triggered via an SNS topic

joshmyers

Doesn’t the autoscaler do scale in too?

1
nutellinoit

i’m using plain asg with eks

joshmyers

plain asg’s as opposed to?

joshmyers
Cluster Autoscaler in Amazon EKS – Alejandro Millan Frias – Medium

Cluster Autoscaler automatically adjusts the number of nodes in a Kubernetes cluster when there are insufficient capacity errors to launch…

2019-02-07

joshmyers
Sysdig | Enable Kubernetes Pod Security Policy with kube-psp-advisor

How to enable Kubernetes Pod Security policy using kube-psp-advisor to address the practical challenges of building a security policy on Kubernetes.

Erik Osterman

maybe a good learning tool

2

@Erik Osterman are you guys catching nodes that are going to have issues ahead of time?

i had a k8s node yesterday that spiked to 100% CPU randomly and had to be cordoned & drained

Erik Osterman
kubernetes/node-problem-detector

This is a place for various problem detectors running on the Kubernetes nodes. - kubernetes/node-problem-detector

Erik Osterman

@btai this look good?

interesting

i will try it out

the daemon.log was showing some interesting stuff

on that node that started having issues

Erik Osterman
kubernetes/node-problem-detector

This is a place for various problem detectors running on the Kubernetes nodes. - kubernetes/node-problem-detector

Erik Osterman

If you can generate a check, you can do a custom plugin like this

whats a custom plugin?

Erik Osterman

See example

Erik Osterman

Basically as simple as writing a script that exits non-zero
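
For example, something along these lines (a sketch; the mount point and threshold are arbitrary):

#!/bin/bash
# Custom plugin check: flag the node when the root filesystem crosses a
# usage threshold. node-problem-detector treats exit 0 as healthy and
# exit 1 as a problem, and uses stdout as the condition message.
readonly THRESHOLD=90
usage=$(df --output=pcent / | tail -1 | tr -dc '0-9')
if [ "${usage}" -ge "${THRESHOLD}" ]; then
  echo "root filesystem is ${usage}% full"
  exit 1
fi
echo "root filesystem usage OK (${usage}%)"
exit 0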

ah i see

Erik Osterman
danisla/terraform-operator

Kubernetes custom controller for operating terraform - danisla/terraform-operator


Erik Osterman
Why the fuck are we templating yaml?

I was at cfgmgmtcamp 2019 in Ghent, and did a talk which I think was well received about the need for some Kubernetes configuration management as well as the…

2019-02-06

do you guys blue/green your k8s clusters when you want to upgrade or do you utilize rolling updates?

aknysh

with kops we usually do rolling updates https://docs.cloudposse.com/geodesic/kops/upgrade-cluster/

pecigonzalo

You don’t manage the cluster with Terraform, right?

aknysh

no

1
aknysh

with TF we create other resources like kops backend etc.

pecigonzalo

yeah, but I was curious if you also did kops > terraform > atlantis or similar

aknysh
cloudposse/terraform-root-modules

Example Terraform service catalog of “root module” invocations for provisioning reference architectures - cloudposse/terraform-root-modules


aknysh

no, we just provision the resources above with TF, but provision the cluster using kops commands from a template https://github.com/cloudposse/geodesic/blob/master/rootfs/templates/kops/default.yaml

cloudposse/geodesic

Geodesic is the fastest way to get up and running with a rock solid, production grade cloud platform built on top of strictly Open Source tools. ★ this repo! https://slack.cloudposse.com/ - clou…

pecigonzalo

thanks

pecigonzalo

I guess you run kops commands out of band? not in CI

aknysh

yea, from geodesic

1

slow, isn’t it?

aknysh

yea, takes some time

this is more of a terraform question, but say I had my k8s cluster deployed in its own VPC and the database in a separate VPC (they are provisioned separately because I blue/green my k8s clusters when I want to upgrade). If I were to VPC peer, is it possible to not have to update the security group of the database?

basically allow full access to the db if there is a vpc peering connection?

aknysh

when you upgrade the cluster, is it still the same VPC?

nope

new k8s cluster, new vpc

aknysh

can you make two of them in advance and just add the two SGs to the database’s SG?

yes

i can do that

that would require an extra step but i think that’s the best approach

1. spin up new k8s cluster/VPC
2. update database terraform with new SG
3. cutover
4. spin down old k8s cluster
5. update database terraform to remove old SG 

actually @aknysh, if i provide the db security group to my cluster terraform I could use this

resource "aws_security_group_rule" "allow_all" {
  type            = "ingress"
  from_port       = 0
  to_port         = 65535
  protocol        = "tcp"
  cidr_blocks     = ["0.0.0.0/0"]
  prefix_list_ids = ["pl-12c4e678"]

  security_group_id = "sg-123456"
}

that would automatically do steps 2 & 5 for me during cluster spin-up and spin-down

aknysh

hmm… what about ingress rules for the db SG? (you need to update them as well)

aknysh

when you create a new VPC and VPC peering, you can update the db SG with new ingress rules (unless you always have just the two VPCs and they never change, in which case you can add the SGs to the db ingress just once)

aknysh

or, if you create the two VPCs with the same CIDRs and they never change, you can add the CIDRs to the db SG (after peering, the db will see those CIDRs)

I can’t create two VPCs with the same CIDR because they’re in the same account

that aws_security_group_rule will update the db SG with the new VPC’s CIDR to allow ingress

aknysh

by the same I meant they could be different for the two VPCs, but they never change so you know the CIDRs in advance

ah yeah

that could work, but it risks the chance that someone spins up a different service using the same unused CIDR

(theres only 2 of us at my company that works on this stuff so very unlikely)

aknysh

yes

aknysh

so it’s better to just update the db SG with the new rule after you spin up a new VPC

yep

Erik Osterman
Open Sourcing our Kubernetes Tools

At Tumblr, we are avid fans of Kubernetes. We have been using Kubernetes for all manner of workloads, like critical-path web requests handling for tumblr.com, background task executions like sending…

how are you guys monitoring your kubernetes nodes?

Erik Osterman

Prometheus & grafana

2019-02-01

nutellinoit

Hi everyone, what is the best way to manage kubernetes deployments using terraform? We are using atlantis for infrastructure CI/CD

nutellinoit

There is the terraform kubernetes provider, but i don’t know if it is good for production use

Erik Osterman

Personal opinion is that terraform is not a tool well suited for deployments on top of Kubernetes, because it is only really good at creating and destroying resources; it is less good at updating them.

3
nutellinoit

fyi, I took the road with helm charts + terraform helm provider
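
A minimal sketch of that combination in the provider’s 0.x form (chart path, values file, and names are placeholders):

provider "helm" {
  kubernetes {
    config_path = "~/.kube/config"
  }
}

resource "helm_release" "my_app" {
  name      = "my-app"
  namespace = "qa"
  chart     = "./charts/my-app"

  # per-stage values file
  values = [
    "${file("values/qa.yaml")}",
  ]

  # inline override
  set {
    name  = "image.tag"
    value = "1.2.3"
  }
}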

Erik Osterman

the helm provider is okay

Erik Osterman

in our experience, we couldn’t do half of what we do with helmfiles

Erik Osterman

terraform template files don’t support conditionals

Erik Osterman

so writing flexible values via terraform is difficult

Erik Osterman

our use-case is slightly different since we need to support multiple companies/organizations, which leads to more conditionals

nutellinoit

atm I’m using helm charts to differentiate between the prod, qa, and dev stages

nutellinoit

it’s so good applying changes with the helm provider; I was afraid it would have a lot of bugs, still being at version 0.x
