#office-hours (2020-07)
“Office Hours” are every Wednesday at 11:30 PST via Zoom. It’s open to everyone. Ask questions related to DevOps & Cloud and get answers! https://cloudposse.com/office-hours
https://cpco.io/slack-office-hours
Meeting password: sweetops
2020-07-01
This question might be specific to kubespray.
I want to set the --kubelet-certificate-authority flag in kube-apiserver.yaml. I chose /etc/kubernetes/ssl/ca.crt, but that is probably wrong. Let me tell my story.
After adding the flag, I tried to get logs from a pod. The following message was displayed:
Get https://10.250.205.173:10250/.../bash-shell-d8bd1: x509: cannot validate certificate ... because it doesn't contain any IP SANs
Then I changed the --kubelet-preferred-address-types parameter to InternalDNS. This changed the message to:
Error from server: no preferred addresses found; known addresses: [{InternalIP 10.250.205.173} {Hostname ip-10-250-205-173.ec2.internal}]
Since it seems like Hostname was known, I changed to using InternalDNS,Hostname. This changed the message to:
Error from server: Get https://ip-10-250-205-173.ec2.internal:10250/containerLogs/kube-system/nodelocaldns-s8mfk/node-cache: x509: certificate signed by unknown authority
Am I using the wrong CA file?
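The first error above ("doesn't contain any IP SANs") can be reproduced offline. The following is a minimal sketch, assuming a certificate dict shaped like the one Python's ssl.SSLSocket.getpeercert() returns; the sample certificate and the helper name san_covers are hypothetical and only illustrate why a connection by IP fails while a connection by hostname passes the name check. It says nothing about the later "unknown authority" error, which is about CA trust rather than SANs.

```python
import ipaddress


def san_covers(cert: dict, address: str) -> bool:
    """Return True if the certificate's subjectAltName list covers `address`.

    `cert` is a dict shaped like the result of ssl.SSLSocket.getpeercert().
    Wildcard DNS names are deliberately not handled in this sketch.
    """
    try:
        wanted_ip = ipaddress.ip_address(address)
    except ValueError:
        wanted_ip = None  # not an IP literal, so treat it as a hostname

    for kind, value in cert.get("subjectAltName", ()):
        if wanted_ip is not None and kind == "IP Address":
            try:
                if ipaddress.ip_address(value) == wanted_ip:
                    return True
            except ValueError:
                continue  # skip SAN entries that do not parse as an IP
        elif wanted_ip is None and kind == "DNS":
            if value.lower() == address.lower():
                return True
    return False


# Hypothetical kubelet serving certificate that carries a DNS SAN only.
kubelet_cert = {"subjectAltName": (("DNS", "ip-10-250-205-173.ec2.internal"),)}

print(san_covers(kubelet_cert, "10.250.205.173"))                  # False
print(san_covers(kubelet_cert, "ip-10-250-205-173.ec2.internal"))  # True
```

If the real kubelet certificate looks like the hypothetical one, switching the preferred address type to Hostname fixes the name check, and the remaining failure is then purely a question of which CA signed the kubelet serving certificate.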
is there a packer linter ?
#office-hours starting in 25 minutes. Make sure you post your questions here!
1. What is the recommended approach to stream the output of a background job running on a server to the web application? CloudWatch, WebSocket, fluentd, logstash, or any other solution?
The question is too vague for an exact solution that will match everyone’s expectations. This is one to order beers and have a long chat over.
For me most important questions are: Should it be realtime? Does cost matter?
Check this https://aws.amazon.com/elasticsearch-service/the-elk-stack/ if the cost is reasonable for your case
ELK is the popular, open-source framework for log analytics. Try Amazon Elasticsearch Service to deploy and manage ELK without any operational overhead.
@here public #office-hours starting now! join us to talk shop https://cloudposse.zoom.us/j/508587304
what is the password for the meeting ?
I’m getting the same prompt, unexpectedly.
set the channel topic: Meeting password: sweetops
Public “Office Hours” are held every Wednesday at 11:30 PST via Zoom. It’s open to everyone. Ask questions related to DevOps & Cloud and get answers! https://cpco.io/slack-office-hours
I changed the channel topic to show the Zoom password first. It seems it’s quite inconvenient otherwise, and people ask for it every week (me included :D)
At Clever, we’ve embraced microservices. They promote modularity, which leads to simpler code bases and lets our engineers move quickly and independently. They are easier to deploy, which helps us…
Command line utility to send messages with attachments to Slack channels via Incoming Webhooks - cloudposse/slack-notifier
Cloud Posse installer and distribution of native apps, binaries and alpine packages - cloudposse/packages
A CLI tool to make git changes across many repos, especially useful with Microservices. - Clever/microplane
Here are 5 automated PRs:
- https://github.com/cloudposse/terraform-aws-alb-target-group-cloudwatch-sns-alarms/pull/19
- https://github.com/cloudposse/terraform-aws-acm-request-certificate/pull/23
- https://github.com/cloudposse/terraform-aws-alb/pull/43
- https://github.com/cloudposse/terraform-aws-backup/pull/4
- https://github.com/cloudposse/terraform-aws-ses/pull/4
What Adds chatops commands '/test all' '/test bats' '/test readme' '/test terratest' Drops codefresh Drops slash-command-dispatch Removes codefresh badge Rebuild…
Q: found a terraform-aws-kops-vault-backend repo and was wondering if you guys have an infrastructure where vault is running on a k8s cluster, with other k8s clusters authenticating and pulling secrets from that singular vault using a mutating webhook secrets injector
been using bank-vaults and have been unsuccessful communicating due to kops internal certs since we use AWS ACM to handle ssl
from vault:
login unauthorized due to: Post https://CLUSTER/apis/authentication.k8s.io/v1/tokenreviews: x509: certificate signed by unknown authority
trying to wrap my head around what certs are required and where, or how to debug since we’re terminating through ACM
ElasticSearch plus Kibana
Node.js stream-based access to CloudWatch Logs. Contribute to mapbox/cwlogs development by creating an account on GitHub.
Thanks, everyone. I need to signoff to get ready for a 4pm meeting.
I am curious to understand how others manage their secret and sensitive info in conjunction with Terraform. Most of my use-cases with terraform are provisioning Infra (Usually AWS) and then Application resources that depend on the infra. Examples of Secrets: single-line strings passwords api-keys tokens multi-line strings ascii-armored pem files ascii license data binary license data I’ll explain the requirements I’m trying to fulfill and then currently how I achieve the success criter…
thanks for the help!
Great video on multi-cloud. Basically, you don’t choose multi-cloud. Multi-cloud chooses you.
New Zoom Recording from our Office Hours session on 2020-07-01 is now available.
2020-07-02
Any kubernetes ready opensource alternative to healthchecks.io ( i am aware of selfhosted version ) ?
this doesn’t directly answer your question, but if you happen to use opsgenie, they have this functionality built-in
(if you’re not using opsgenie, what are you using for escalations?)
pagerduty, it has integration to deadmanssnitch ( but of course you have to buy it )
2020-07-03
2020-07-08
Office hours starting in 15 minutes! please post your questions
probably a dumb question but what are the cons of running a fargate container as root user instead of a non root user?
whats a good way to compare ecs ec2 to ecs fargate cost ?
Working with multiple pull request templates in .github/PULL_REQUEST_TEMPLATE/ with 2 files, general.md and kms_secrets.md. When I create a new PR, I expected to see a button to select a template, like we see with issue templates. What could the issue be?
Question on how people are managing cross-account IAM in CI with OIDC. I’m fighting with having to assume a role in the target account before running a command (e.g. terraform not supporting web tokens in AWS), and tools like chamber default to the account they are running in. Managing lots of accounts in a CI process is starting to seem like a hassle
@Erik Osterman (Cloud Posse) “Waiting for the host to start this meeting”
@here our devops #office-hours starting now! join us to talk shop https://cloudposse.zoom.us/j/508587304
Command line utility for updating GitHub commit statuses and enabling required status checks for pull requests - cloudposse/github-status-updater
Command line utility for creating GitHub comments on Commits, Pull Request Reviews or Issues - cloudposse/github-commenter
Monitoring synthetic metrics can optimize the user experience on your application. Here’s how Grafana makes that easier
AWS Fargate is a technology that you can use with Amazon ECS to run containers without having to manage servers or clusters of Amazon EC2 instances. With AWS Fargate, you no longer have to provision, configure, or scale clusters of virtual machines to run containers. This removes the need to choose server types, decide when to scale your clusters, or optimize cluster packing.
When you add a pull request template to your repository, project contributors will automatically see the template’s contents in the pull request body.
And that’s for GitHub Actions
https://docs.github.com/en/actions/configuring-and-managing-workflows/sharing-workflow-templates-within-your-organization
You can create a standardized set of workflow templates specifically for your organization. Organization members can then use the templates when creating new workflows in the organizations repositories.
Community health files for the @GitHub organization - github/.github
Meta-GitHub repository for all terraform-aws-modules repositories - terraform-aws-modules/.github
github-actions-exporter for prometheus. Contribute to Spendesk/github-actions-exporter development by creating an account on GitHub.
11:39:50 From Sheldon Hull : #6 awesome. Exactly something I wanted more info on, very little documentation on it
11:46:57 From Sheldon Hull : It's not cheap :-)
11:53:02 From Sheldon Hull : aws released support for synthetic checks built into cloud watch
11:53:26 From Adam Blackwell : We looked at exporting prometheus things into New Relic
11:54:04 From Omer Sen : good old times we were using Nagios ;)
11:54:06 From Sheldon Hull : This is a perfect use case for lambda/serverless
11:54:14 From Andrew Roth : https://grafana.com/blog/2019/06/18/grafana-tutorial-simple-synthetic-monitoring-for-applications/
11:54:33 From Sheldon Hull : Deploy to any region and run these commands periodically. Pretty sure that's what AWS cloud watch synthetic checks supports now.
11:54:42 From Marc Tamsky : https://github.com/prometheus/blackbox_exporter
11:54:51 From Sheldon Hull : https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch_Synthetics_Canaries.html
11:55:25 From Sheldon Hull : ngrok solves all woes :-)
11:58:44 From Andrew Roth : https://docs.aws.amazon.com/AmazonECS/latest/developerguide/AWS_Fargate.html#fargate-task-defs
12:10:07 From Adam Blackwell : Have to drop off, but thanks for all the new ideas! :wave:
12:20:33 From Andrew Roth : https://github.com/github/.github
12:22:19 From Sheldon Hull : I have question about GitHub actions when ready
12:24:30 From Sheldon Hull : Until you get the bill 😂
12:24:58 From Sheldon Hull : I would use RDS except 100 database limit is freaking crazy impacting to cost for us
12:27:31 From Marc Tamsky : is the 100 database limit a soft or hard quota?
12:29:33 From Eric Berg : I'm working on getting datadog set up for my containerized java and elixir apps, running on k8s, with an EKS backplane, running in AWS. I'm having trouble getting a handle on what metrics are available, what metrics are associated with the various levels (i.e., cluster, node, pod, etc.).
12:30:10 From Eric Berg : So, my question is whether anyone has any good references for helping to sort through all of this.
12:33:08 From Sheldon Hull : If we have any time at end, anyone who has implemented chatops with Microsoft Teams?
12:33:40 From Sheldon Hull : thanks!
New Zoom Recording from our Office Hours session on 2020-07-08 is now available.
@Erik Osterman (Cloud Posse) looks like the podcast feed didn’t get updated
ah crap
Yep, thanks for the heads up - will see where the automation broke down! Looks like I need monitoring for this stuff - SRE for podcasts.
No worries
Yes, PodOps?!
I saw something generic for monitoring feed changes somewhere the other day. Will try to remember where I saw it
It was actually https://healthchecks.io/ Probably doesn’t fit here.
Healthchecks.io alerts you when your cron jobs fail to run on time. Quick setup (no coding required), clean dashboard, affordable pricing.
Aha, so I’m going to take a bit of my own advice! we have that with opsgenie, but I didn’t think of using it with our Zapier configuration.
Yep! Just set it up.
Also fixed the podcast.
Thanks!
2020-07-09
2020-07-10
Just listened to the last Office Hours podcast. We have working pull request templates at https://github.com/swapagarwal/swag-for-dev but unfortunately these are not automatically listed when you create a new PR; you need to link directly to one like this: https://github.com/swapagarwal/swag-for-dev/compare/master...aslafy-z:add-hasura?expand=1&template=new-swag-opportunity.md which is not easy to use. If the user changes the branch, the template in the URL is reset and the default one is applied instead. Hopefully they will be implemented some day!
swag opportunities for developers. Contribute to swapagarwal/swag-for-dev development by creating an account on GitHub.
2020-07-11
2020-07-15
Is there a page somewhere with the talking points for each previous episode of Office Hours?
Or links to Erik’s Google Sheet presentations?
hey @Andy, we have @Andy Miguel working on updating our show notes for this
Right now, our slides are not yet published anywhere… but rest assured we are working on it! (only 50 hours of video to go - haha)
Remember to post your questions for today’s office hours starting in 10 minutes
@here our devops #office-hours starting now! join us to talk shop https://cloudposse.zoom.us/j/508587304
I’ve set up TF to spin up our entire stack, from the VPC on up to the helm charts. This same code will be used to spin up each client environment. My question is about whether to use workspaces or another approach to minimize code duplication and facilitate management of each installation.
We build and host a wide variety of web applications and I’m working on getting our CI processes up to speed with some standardized default coding standards enforcement. Ideally I can centralize these and other project-agnostic configuration files, but still incorporate them into projects that don’t have them in place at build/testing time to ensure the latest configuration is always being used.
I’m trying to identify the best strategy and tool for this.
I am new to terraform - I have written a terraform module to create AWS CodePipelines that my team and I can use to create multiple pipelines. All of the pipelines’ module definitions are in a single main.tf file, though I am passing multiple tfvars files. When I run terraform plan, I see that terraform is planning to modify the existing resource rather than create a new one. I store the tf state in an S3 bucket.
If you’re a company on AWS trying to improve your infrastructure setup, are there recommendations among these options:
• k8s via EKS
• k8s via kops
• Docker via ECS
• Nomad??
Team of 2 SREs: 1 experienced with k8s. Other things we use: github.com. Also looking for recommendations for CI tools
for an existing kops installation, are there benefits to switching to EKS?
sorry @Alex Siegman! i just saw this now
We’re a DevOps accelerator. That means we help companies own their infrastructure in record time by building it with you and then showing you the ropes. If t…
@Erik Osterman (Cloud Posse) just want to mention that your speaking/communication skills are really great! And this is really crucial and often overlooked in this so-called devops transformation.
Thanks @Andrew Nazarov - it means a lot to hear that! appreciate it.
New Zoom Recording from our Office Hours session on 2020-07-15 is now available.
2020-07-16
Guys? I am looking for a tool that can handle installing/updating binary packages on Linux (a lot of binaries like helm, helmfile, kustomize, ytt do not have any package maintainer: OS, flatpak, snap, nixos). Not all packages are available as downloadable assets on GitHub (helm), so some logic for “curling” a new version would be fine. Any ideas?
can https://github.com/variantdev/mod help with this ?
Missing package manager for any task runners and build tools e.g. make and variant - variantdev/mod
Asdf
Extendable version manager with support for Ruby, Node.js, Elixir, Erlang & more - asdf-vm/asdf
Why not create a make file for pulling in all binaries and building a package for your specific flavor of Linux e.g. rpm/deb
Thanks
+1 for asdf but you can also go the docker approach and wrap your toolkit in a Docker image that you update periodically. You then need to execute everything through Docker but it’s very portable and easy to spin up others on.
AFAIU, this is what CP’s geodesic tool is.
2020-07-22
@here Remember to post your questions for today’s office hours starting in 25 minutes
Pulumi guys say you can just pack everything into npm package or whatever and reuse it this way;)
which is fine and all if you’re starting from ground zero
but not if your org has a significant investment into terraform modules
Announcements
EKS 1.17 Released - AWS speeding up releases https://aws.amazon.com/about-aws/whats-new/2020/07/amazon-eks-supports-kubernetes-version-1-17/
GitHub Actions AWS Terraform Module (with spot instances!) https://github.com/philips-labs/terraform-aws-github-runner
AWS CDK Now Supports Terraform! https://www.hashicorp.com/blog/cdk-for-terraform-enabling-python-and-typescript-support/
Terraform 0.13 RC1! https://github.com/hashicorp/terraform/releases/tag/v0.13.0-rc1
Check out our YouTube Channel for past episodes! https://www.youtube.com/channel/UCvTIgk77GZg7hEs6dNIGs7A
Terraform module for scalable GitHub action runners on AWS - philips-labs/terraform-aws-github-runner
Cloud Development Kit for Terraform, a collaboration with AWS Cloud Development Kit (CDK) team. CDK for Terraform allows users to define infrastructure using TypeScript and Python while leveraging the hundreds of providers and thousands of module definitions provided by Terraform and the Terraform ecosystem.
0.13.0-rc1 (July 22, 2020) BUG FIXES: command/init: Fix confusing error message for locally-installed providers with invalid package structure (#25504) core: Prevent outputs from being evaluated d…
We’re a DevOps accelerator. That means we help companies own their infrastructure in record time by building it with you and then showing you the ropes. If t…
Hey guys, my company is planning on moving from a monolith progressively to a microservices architecture. We’re planning on using Docker and Kubernetes via EKS to manage the packaging and deployment. There’s a whole bunch of considerations but one question I have:
• What is the interaction between terraform (which we’re already using) and these build/deploy tools
Sorry I missed this. We’ll answer this next week =)
Identify image vulnerabilities in Kubernetes pods. Contribute to quay/container-security-operator development by creating an account on GitHub.
Hi, I was using EKS worker nodes in the past for our staging env and now I would like to switch to terraform-aws-eks-node-group. My questions are:
- if I use terraform-aws-eks-node-group, is there a way to encrypt the disk and also set a scaling policy (CPU limit)?
- if I use EKS worker nodes, is there a way to automatically drain nodes before removing them? At the moment I’m using
termination_policies = ["OldestInstance", "OldestLaunchConfiguration", "Default"]
Terraform module to provision an EKS Node Group. Contribute to cloudposse/terraform-aws-eks-node-group development by creating an account on GitHub.
Sorry I missed this. We’ll answer this next week
thanks i joined at the end as i did not see my calendar notification, will be in the next week meeting as well
https://github.com/jantman/awslimitchecker
A script and python module to check your AWS service limits and usage, and warn when usage approaches limits
Much more robust than Trusted Advisor (which supports limits in paid plan).
A script and python package to check your AWS service limits and usage via boto3. - jantman/awslimitchecker
Use tags to categorize and track your AWS costs with your monthly and hourly cost allocation reports.
New Zoom Recording from our Office Hours session on 2020-07-22 is now available.
2020-07-23
does anyone use some kind of self hosted code searching tool? looking at opengrok but also see others like hound (4.4k stars) and google code search (2.4k stars)
for web-based search? e.g. something like the (paid) Algolia
ohhh code search
never mind.
haven’t looked around lately. hound looks nice.
ya. we did a hack week this week and i wish i had taken a step back and found hound sooner
opengrok is such a PITA to setup and hound even notes that in their blog post about hound in 2015 lol
would be rad if we had search.cloudposse.com to find stuff faster.
ohhhh maaaaan
github search is “good enough” sometimes but i do like regex searches like opengrok / hound
ya….
@btai AWS updated the issue today regarding pod density on EKS. Not sure if it’s a coincidence or not, since I escalated this to AWS yesterday via our rep.
We are working on the next version of the Kubernetes networking plugin for AWS. We've gotten a lot of feedback around the need for adding Kubenet and support for other CNI plugins in EKS. This …
This will allow for all worker nodes to support at least the Kubernetes recommended pods per node thresholds (min(110, 10*#cores))
for an r4.2xlarge that’s still only 80 pods.
(our max pod count is 200 — we’ve not had issues running w/ this setting for years)
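The threshold quoted above, min(110, 10*#cores), is simple enough to check by hand. A minimal sketch follows; the function name is made up for illustration, and the 8-vCPU figure for an r4.2xlarge is an assumption that matches the "only 80 pods" remark.

```python
def recommended_max_pods(cores: int) -> int:
    """Kubernetes-recommended pods-per-node threshold: min(110, 10 * cores)."""
    if cores < 1:
        raise ValueError("cores must be a positive integer")
    return min(110, 10 * cores)


# An r4.2xlarge is assumed to expose 8 vCPUs.
print(recommended_max_pods(8))   # 80
print(recommended_max_pods(16))  # 110, the cap applies from 11 cores upward
```

Against a configured maximum of 200 pods per node, even the capped value of 110 is still a reduction, which is the concern raised in the thread.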
anyone create a custom github homebrew tap in a private repo ? getting authorization errors
Hey, all, what was the tool that was mentioned on the last office-hours, that wipes out all of the resources in an aws account? Thanks!
Could have been AWS Nuke
Nuke a whole AWS account and delete all its resources. - rebuy-de/aws-nuke
That’s the one we are using
And here is our config https://github.com/cloudposse/testing.cloudposse.co/blob/master/.github/aws-nuke.yaml
Example Terraform Reference Architecture that implements a Geodesic Module for an Automated Testing Organization in AWS - cloudposse/testing.cloudposse.co
Cool! Thanks, @Erik Osterman (Cloud Posse)! Excellent example. You’re brave.
It’s our testing account - designed to be nuked
It’s even inside of a totally separate AWS organization that shares nothing
I ran it on the new account i was working on yesterday. VEEEEEERY sharp! Super powerful, but i killed too much IAM stuff and had to just trash the account and start over. Good to know about this though.
Lol, exactly - very easy to blow your leg off
2020-07-24
2020-07-27
anyone use a module to create scheduled ecs tasks ? looking at this module, but open to other modules too.
https://github.com/turnerlabs/terraform-ecs-fargate-scheduled-task
we have created schedule tasks in ecs
it was so little code we did not create a module
ahhh this uses a cloudwatch event, that is very different from what we did
yea, and we use cloudwatch event too. im looking at this module now.
module "ecs_scheduled_task" {
  source                = "git::https://github.com/tmknom/terraform-aws-ecs-scheduled-task.git?ref=tags/2.0.0"
  name                  = "example"
  schedule_expression   = "rate(3 minutes)"
  container_definitions = var.container_definitions
  cluster_arn           = var.cluster_arn
  subnets               = var.subnets
}
super simple
you can also pass in an iam role instead of the module creating one for you
that is cool, we use ec2+ecs and we use a cron sidecar
interesting setup. any reason to not use cloudwatch cron ?
or is it to save money ? or convenience?
no reason, I did not know you could do it that way
the cron works very well for us because it ingests data once it is created
and that data is on a s3 bucket on a schedule too
ah i see. yea the cw method is convenient. i havent done it the other way.
How does everyone here create golden amis with toggles ? such as if you want instance X to use AMI1 with datadog and instance Y to use AMI1 without datadog, you wouldn’t build a whole new AMI, you’d have some kind of flag or feature toggle, right?
Would love to hear thoughts on this. I’m wondering if we can do something with SSM or tagging on instances to use as toggles.
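One way to picture the tagging idea is a boot-time step that reads the instance's tags and decides which pre-baked agents to start. This is only a sketch under assumptions: the "feature/" tag prefix, the accepted truthy values, and the function name are all hypothetical, and fetching the tags (from the EC2 API or from SSM) is left out so the logic stays testable on its own.

```python
TRUTHY = {"1", "true", "yes", "on", "enabled"}


def enabled_features(tags: dict, known=("datadog",), prefix="feature/") -> set:
    """Return the optional agents to start, based on instance tags.

    The AMI is assumed to ship every agent installed but disabled; a boot
    script would call this and enable only what the tags ask for.
    """
    return {
        name
        for name in known
        if str(tags.get(prefix + name, "")).strip().lower() in TRUTHY
    }


# Hypothetical tags for two instances launched from the same AMI.
instance_x = {"Name": "api", "feature/datadog": "true"}
instance_y = {"Name": "batch"}

print(enabled_features(instance_x))  # {'datadog'}
print(enabled_features(instance_y))  # set()
```

The design choice here is that an absent or unrecognised tag value means "off", so a mistyped tag never silently enables an agent.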
2020-07-28
I’m here! What’d I miss?
haha! a lot…. but you’re in luck, we have it all recorded.
We’re a DevOps accelerator. That means we help companies own their infrastructure in record time by building it with you and then showing you the ropes. If t…
2020-07-29
wondering if you guys have any tips around kubernetes dns benchmarking and debugging. dealing with intermittent hostname resolution failures to external hostnames with coredns. networking isn’t a strong point of mine and would love to hear if you guys made any dns optimisations on k8s and have any advice on how to gain visibility to start troubleshooting this
got it! we’ll bring it up today
@Erik Osterman (Cloud Posse) brian here, enjoyed this week’s office hours. should i post again next week to hear your thoughts on this topic?
Ahk! I started adding it to my slides and got pulled aside. Yes, let’s repost for next week. Sorry!
thanks!
@Brian where are you running this? I’ve faced this in minikube but not elsewhere (knock on wood!).
@OliverS aws via kops.
.:53 {
    errors
    health {
        lameduck 5s
    }
    kubernetes cluster.local. in-addr.arpa ip6.arpa {
        pods insecure
        upstream
        fallthrough in-addr.arpa ip6.arpa
    }
    prometheus :9153
    forward . /etc/resolv.conf
    loop
    cache 30
    loadbalance
    reload
}
currently have datadog to monitor SERVFAIL and trying to benchmark using https://github.com/kubernetes/perf-tests/tree/master/dns
Performance tests and benchmarks. Contribute to kubernetes/perf-tests development by creating an account on GitHub.
and being alerted via sentry
OperationalError: could not translate host name "[REDACTED].us-east-1.rds.amazonaws.com" to address: Try again
what’s your timeout for a DNS response?
can you elaborate a bit? networking isn’t my strong suit. if it’s a configurable on the coredns or kubernetes side, i believe it’s left at the default value. if you’re asking how long til it times out, unsure how to check
@Erik Osterman (Cloud Posse) curious about your /quiz link. is that strictly for business, or is it also an open forum for discussion topics?
haha, it’s “top of the funnel” - suppose you could ask a question in one of the free form fields.
so client libraries that perform DNS lookups will typically have a DNS timeout. Additionally, DNS uses UDP by default, so timeouts play a big role. If your timeout is 25ms for a DNS lookup, it could look like a DNS failure, but really it was just an aggressive timeout
I would first extend the timeouts and see if it alleviates any of the problems. If not, then restore it and keep digging into it.
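The "extend the timeouts first" advice can be tried from the client side. The sketch below is an assumption-laden illustration rather than what any particular library does: it wraps a lookup in a per-attempt time budget and retries on temporary failures such as the "Try again" error quoted earlier. The function name, the default values, and the injectable lookup parameter are all made up for the example.

```python
import socket
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as LookupTimeout


def resolve_with_retry(host, timeout_s=2.0, attempts=3, backoff_s=0.2,
                       lookup=socket.gethostbyname):
    """Resolve `host`, giving each attempt `timeout_s` seconds before retrying."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        # A fresh single-thread pool per attempt lets us abandon a slow lookup.
        pool = ThreadPoolExecutor(max_workers=1)
        try:
            return pool.submit(lookup, host).result(timeout=timeout_s)
        except LookupTimeout:
            last_error = TimeoutError(f"lookup of {host!r} exceeded {timeout_s}s")
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            last_error = exc
        finally:
            pool.shutdown(wait=False)  # do not block on an abandoned lookup
        if attempt < attempts - 1:
            time.sleep(backoff_s * (attempt + 1))  # simple linear backoff
    raise last_error
```

If a generous timeout_s makes the intermittent failures disappear, that points at aggressive client timeouts rather than CoreDNS itself; if failures persist, the timeout can be restored and the investigation continues on the server side.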
are you operating on EKS?
we’re using kops on aws with coredns, no eks. we have about 4000 pods running on ~60 nodes autoscaling in and out.
if my metrics are correct, we have a max of 0.8ms coredns request latency at the time we started getting several SERVFAILs with 1.5k queries per second
going to look into our dns timeouts and see what’s that at
btw thanks for taking your time to assist
Also investigate the load on your masters. K8s hates loaded masters because etcd relies on Raft consensus. Make sure you’re running appropriately sized masters for your cluster size. There are best-practice guides for this out there, but I don’t have one off the top of my head.
DNS benchmarking and optimizations (EKS focused, but not tied to it): https://www.vladionescu.me/posts/eks-dns.html
NodeLocalDNS is the usual way to handle it.
Also, move from TCP to UDP as many configs do default to TCP
Amazon VPC DNS is limited to 1024 packets per second per network interface, so the EKS-default 2 CoreDNS pods can be quickly overloaded. Each CoreDNS pod has to serve DNS traffic from the host through its own elastic network interface, which is limited to 1024 packets per second.
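As a back-of-the-envelope check on that limit, the sketch below estimates how many separate network interfaces would be needed to forward a given upstream query rate. Only the 1024 packets-per-second figure comes from the text above. `min_resolver_enis`, `packets_per_query`, and `headroom` are hypothetical names and assumptions, and the real upstream rate is lower than the total query rate whenever the CoreDNS cache answers a query.

```python
import math

# Per-network-interface limit quoted above for Amazon VPC DNS.
VPC_DNS_PPS_PER_ENI = 1024


def min_resolver_enis(upstream_qps, packets_per_query=1, headroom=0.5):
    """Estimate how many network interfaces are needed for an upstream query rate.

    headroom is the fraction of the limit we are willing to use (an assumption).
    """
    usable_pps = VPC_DNS_PPS_PER_ENI * headroom
    return math.ceil(upstream_qps * packets_per_query / usable_pps)
```

Treating the 1.5k queries per second mentioned earlier as a worst case where every query goes upstream, `min_resolver_enis(1500, headroom=1.0)` gives 2 and `min_resolver_enis(1500)` gives 3, so two CoreDNS pods would have little margin in that scenario.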
Control DNS support for your VPC.
arg. meetings made me miss this week. @Erik Osterman (Cloud Posse) did you address this topic? looking forward to checking out the recap
we did! a video recording will be shared shortly
Hi, i was using EKS worker nodes in the past for our staging env and now i would like to switch to terraform-aws-eks-node-group. My questions are:
- if i use terraform-aws-eks-node-group, is there a way to encrypt the disk and also set a scaling policy (CPU limit)?
- if i use EKS worker nodes, is there a way to automatically drain nodes before removing them? At the moment i’m using
termination_policies = ["OldestInstance", "OldestLaunchConfiguration", "Default"]
Terraform module to provision an EKS Node Group. Contribute to cloudposse/terraform-aws-eks-node-group development by creating an account on GitHub.
GitHub public roadmap. Contribute to github/roadmap development by creating an account on GitHub.
Guidance for changing the default branch name for GitHub repositories - github/renaming
Checks whether Kubernetes is deployed according to security best practices as defined in the CIS Kubernetes Benchmark - aquasecurity/kube-bench
An implementation of Netflix’s Chaos Monkey for Kubernetes clusters - asobti/kube-monkey
A Kubernetes Daemonset to gracefully handle EC2 instance shutdown - aws/aws-node-termination-handler
Autoscaling components for Kubernetes. Contribute to kubernetes/autoscaler development by creating an account on GitHub.
Terraform Module for integration DataDog with AWS. Contribute to cloudposse/terraform-aws-datadog-integration development by creating an account on GitHub.
Amazon Timestream is a fast, scalable, fully managed time series database service for IoT and operational applications that makes it easy to store and analyze trillions of events per day at 1/10th the cost of relational databases.
Comprehensive Distribution of Helmfiles for Kubernetes - cloudposse/helmfiles
this should work for EKS - https://www.eksworkshop.com/beginner/080_scaling/deploy_ca/
Amazon EKS Workshop
wondering if you guys have any tips around kubernetes dns benchmarking and debugging. dealing with intermittent hostname resolution failures to external hostnames with coredns. networking isn’t a strong point of mine and would love to hear if you guys made any dns optimisations on k8s and have any advice on how to gain visibility to start troubleshooting this
We’re at the point where we need to set up something like PagerDuty. I’ve heard OpsGenie mentioned here and we are an Atlassian Cloud shop, but i’ve used PD in the past. We’re a small shop at this point (< 20 devs/ops people) and we’ll start with just one or two rotas.
Sorry if this has been discussed before, but any input or suggestions to help make the choice would be appreciated.
New Zoom Recording from our Office Hours session on 2020-07-29 is now available.
for next office hours
https://sweetops.slack.com/archives/CUGPEKG9H/p1595537102029100
anyone create a custom github homebrew tap in a private repo ? getting authorization errors