#office-hours (2021-11)
Public “Office Hours” are held every Wednesday at 11:30 PST via Zoom. It’s open to everyone. Ask questions related to DevOps & Cloud and get answers!
Meeting password: sweetops

My Q&A Questions
Suppose you have a K8s cluster per team: would you do a VPC per cluster, or a VPC per stage (prod/staging/testing)?
How to pass information between Terraform and the Helm chart (deployed via Argo)?
a. For testing/QA we have DB snapshot volumes that we clone with Terraform and pass to the Helm chart.
b. The Helm chart then uses them to create a PV and PVC for the DB pod.
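
One possible way to hand Terraform results to a Helm chart is to turn the JSON printed by `terraform output -json` into a values file. The sketch below is only an illustration: the output names `db_snapshot_volume_id` and `db_volume_size_gib` and the `database.persistence` values layout are hypothetical and were not mentioned in the session.

```python
import json

# Example of the shape `terraform output -json` prints: every output is an
# object carrying "value", "type" and "sensitive". The output names here are
# hypothetical.
SAMPLE_TF_OUTPUT = """
{
  "db_snapshot_volume_id": {"sensitive": false, "type": "string", "value": "vol-0123456789abcdef0"},
  "db_volume_size_gib": {"sensitive": false, "type": "number", "value": 50}
}
"""


def helm_values_from_tf_outputs(tf_output_json: str) -> dict:
    """Map Terraform outputs to a nested dict usable as Helm values.

    The values layout (database.persistence.*) is an assumption about the
    chart, not something a real chart is known to expect.
    """
    outputs = json.loads(tf_output_json)
    volume_id = outputs["db_snapshot_volume_id"]["value"]
    size_gib = outputs["db_volume_size_gib"]["value"]
    return {
        "database": {
            "persistence": {
                "volumeId": volume_id,
                "size": f"{size_gib}Gi",
            }
        }
    }


values = helm_values_from_tf_outputs(SAMPLE_TF_OUTPUT)
# JSON is a subset of YAML, so this text can be saved as a values file.
print(json.dumps(values, indent=2))
```

The resulting file could be committed to the repository that Argo watches, or written somewhere the chart's templates for the PV and PVC can read it. Which of those fits depends on how the Argo application is set up.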

links from today’s session:
• https://github.com/doitintl/kube-no-trouble
• https://aws.amazon.com/about-aws/whats-new/2021/10/amazon-eks-nodes-groups-bottlerocket/
• https://reinvent.awsevents.com/justify-your-trip/?trk=www.google.com
• https://www.youtube.com/watch?v=-MmRf27UEWM
• https://aws.amazon.com/about-aws/whats-new/2021/10/aws-fargate-amazon-ecs-windows-containers/

From @Vlad Ionescu (he/him) in zoom
https://github.com/bottlerocket-os/bottlerocket is interesting and more in-depth than the usual blogs
An operating system designed for hosting containers

This is a cool thing for those who are on GCP https://cloud.google.com/blog/products/containers-kubernetes/introducing-container-image-streaming-in-gke

New container image streaming in Google Kubernetes Engine slashes the time it takes to boot your applications.

A cluster is a set of nodes (physical or virtual machines) running Kubernetes agents, managed by the control plane. Kubernetes v1.22 supports clusters with up to 5000 nodes. More specifically, Kubernetes is designed to accommodate configurations that meet all of the following criteria:
• No more than 110 pods per node
• No more than 5000 nodes
• No more than 150000 total pods
• No more than 300000 total containers
You can scale your cluster by adding or removing nodes.
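
As a rough illustration of the "all of the following criteria" wording, here is a minimal sketch that checks a planned cluster against the four limits quoted above. The function and parameter names are made up for this example; the numbers are the ones from the quoted text.

```python
from typing import List

# Limits quoted above for Kubernetes v1.22.
MAX_PODS_PER_NODE = 110
MAX_NODES = 5000
MAX_TOTAL_PODS = 150_000
MAX_TOTAL_CONTAINERS = 300_000


def limit_violations(
    nodes: int,
    total_pods: int,
    total_containers: int,
    max_pods_on_any_node: int,
) -> List[str]:
    """Return a message for every criterion the planned cluster exceeds."""
    checks = [
        ("pods per node", max_pods_on_any_node, MAX_PODS_PER_NODE),
        ("nodes", nodes, MAX_NODES),
        ("total pods", total_pods, MAX_TOTAL_PODS),
        ("total containers", total_containers, MAX_TOTAL_CONTAINERS),
    ]
    return [
        f"{name}: {actual} exceeds {limit}"
        for name, actual, limit in checks
        if actual > limit
    ]


# A plan that sits exactly on every limit is still within the criteria.
print(limit_violations(5000, 150_000, 300_000, 110))
# A plan with too many nodes fails one criterion.
print(limit_violations(6000, 150_000, 300_000, 110))
```

Note that the limits interact: 5000 nodes at 110 pods each would be 550000 pods, far above the total pod limit, so all four criteria have to be checked together.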

[What’s Next for Infrastructure as Code? | Cloud Posse Explains](https://www.youtube.com/watch?v=TqYFi4WxICQ)

[Ephemeral Environments Strategies | Cloud Posse Explains](https://www.youtube.com/watch?v=0h2yn-uk5ZE)

links from today’s session:
• https://www.sec.gov/Archives/edgar/data/1720671/000119312521319849/d205906ds1.htm
• https://news.ycombinator.com/item?id=29110444
• https://github.com/hashicorp/terraform-provider-aws/releases/tag/v3.64.0
• https://aws.amazon.com/blogs/aws/goodbye-microsoft-sql-server-hello-babelfish/
• https://ghuntley.com/sudo-rm-rf/
• https://grafana.com/blog/2021/11/09/announcing-grafana-oncall/
• https://github.com/gruntwork-io/terratest/releases/tag/v0.38.1
• https://aws.amazon.com/about-aws/whats-new/2021/10/amazon-ec2-amazon-machine-images-organizations/
• https://discuss.hashicorp.com/t/request-for-feedback-config-driven-refactoring/30730
• https://github.com/hashicorp/terraform/releases/tag/v1.1.0-beta1
• https://github.com/cloudposse/terraform-aws-eks-node-group/pull/93

Does the 15 min runtime limit start from beginning or end of downloading the image?

lol. Your time is over, please try “FROM scratch”

How do you suggest doing DB Snapshot Dump and Restore from Production to Dev/Staging/QA envs ?

Yeah, the way we do it for preview environments is that we dump the DB from prod to staging, do the tokenization, and then take an EBS snapshot of that. Then we create volumes from it on each preview environment deployment and create its PV and PVC.

It’s quick, takes about 2-3 mins to restore, and the dump is done nightly.

Was just asking if there are other ways of doing that

Nope, it sounds correct. The only thing that I would change here is doing the tokenization in prod to decrease the threat radius.
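
The thread does not say how the tokenization is done. One common approach, shown here purely as an assumed sketch, is to replace sensitive column values with a keyed hash before the data leaves prod, so the same input always maps to the same token and joins between tables keep working. The field names, the `tok_` prefix, and the key handling are all hypothetical.

```python
import hashlib
import hmac


def tokenize(value: str, key: bytes) -> str:
    """Replace a sensitive value with a deterministic, keyed token."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    # Truncated for readability; keep the full digest if collisions matter.
    return f"tok_{digest[:16]}"


def tokenize_row(row: dict, sensitive_fields: set, key: bytes) -> dict:
    """Return a copy of the row with sensitive, non-null fields tokenized."""
    return {
        name: tokenize(str(value), key)
        if name in sensitive_fields and value is not None
        else value
        for name, value in row.items()
    }


# Hypothetical key and row; in practice the key would stay inside prod.
key = b"example-key-kept-in-prod-only"
row = {"id": 42, "email": "jane@example.com", "plan": "pro", "phone": None}
masked = tokenize_row(row, {"email", "phone"}, key)
print(masked)
```

Running this step inside prod, as suggested above, means only tokenized data ever reaches staging or the preview environments.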


[How to Architect for VPC IP Limits | Cloud Posse Explains](https://www.youtube.com/watch?v=mxvohZWDOuI)

[Maintenance Pages Dos and Don’ts | Cloud Posse Explains](https://www.youtube.com/watch?v=aM91Np8O4E4)

Question: are there any SQL database (postgres compat) solutions which run in AWS (EC2 or EKS), and outperform AWS Aurora in terms of write speeds, replication speeds, and overall performance?

pricing is not on the radar

• Also any experience from anyone who migrated either side?

candidates: CockroachDB, Percona

• Also does anything 3rd party implement query priority in postgres?

it will be hard to beat the I/O performance of the Aurora storage solution

which at the end of the day is, what, 70% of the performance of any database?

I have heard of people using proxies that do some other stuff in front of Aurora

That would be another interesting topic to hear

• Any proxies that provide query priority, better replication etc? Are there any other good examples of proxies?

Also any experience from anyone who migrated either side?
@Max Lobur (Cloud Posse) can you elaborate a bit? do you mean to/from Aurora?

From 3rd party to Aurora or vice versa. What were the benefits, losses, pain points?

the version changes are a pain if someone is using version-specific features

getting the sizing right is another problem

Migration of data is another problem: you can’t just copy onto the Aurora drive since it is not available to you

and S3 import only works for MySQL, so pg_dump needs to be used for Postgres

Question: On moving away from Terragrunt, and completely into native Terraform, what are good resources to learn about how to split Terraform workspaces for infrastructure? We are using a centralized infrastructure-live repository but would like to reduce both the blast radius and time to plan/apply (e.g. separate repositories for tf-networking, tf-messaging). ~Also, how to organize infra repositories taking multi-region/multi-account into consideration~ (replied in thread)

how to organize infra repositories taking multi-region/multi-account into consideration
related question a few months ago:

Thanks a lot, Andy! That really answers that whole question. I think a follow-up question is if CloudPosse plans to keep updating https://github.com/cloudposse/reference-architectures (as it’s archived now), or is documenting best practices somewhere else

Universal Tool for DevOps and Cloud Automation (works with terraform, helm, helmfile, istioctl, etc)

I know this thread is a little old but I have previously used reference-architecture and understand Atmos is now replacing it. But I cannot seem to find anything about a cold start. In particular, is there anything to replace the provisioning of the master and member accounts?

Curious how many dev teams are using conventional commits at all –> https://www.conventionalcommits.org/en/v1.0.0/
A specification for adding human and machine readable meaning to commit messages

A couple of teams I’ve been on have tried, but it ended up being a burden and a point of friction. In theory it’s great for autogenerating the changelog but it’s annoying to enforce

That is exactly the kind of answer I was looking for

the idea seems great

practically, man it seems a bit rough to keep up with

There are various ways to enforce the format, so that’s not much of an issue. My issue is the added burden on developers who are forced into making sure they write commit messages in specific ways

I could see that causing dissent
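
For anyone weighing the enforcement side, this is a minimal sketch of what a header check could look like, for example inside a `commit-msg` hook or a CI step. The spec itself only defines `feat` and `fix`; the rest of the type list below is an assumption borrowed from common conventions, and the function name is made up.

```python
import re

# Only "feat" and "fix" come from the spec; the other types are an assumed,
# commonly used list that a team would adjust.
ALLOWED_TYPES = (
    "feat", "fix", "docs", "style", "refactor",
    "perf", "test", "build", "ci", "chore", "revert",
)

HEADER_PATTERN = re.compile(
    r"^(?P<type>" + "|".join(ALLOWED_TYPES) + r")"
    r"(?:\((?P<scope>[^()\r\n]+)\))?"  # optional (scope)
    r"(?P<breaking>!)?"                # optional breaking-change marker
    r": (?P<description>\S.*)$"        # colon, one space, description
)


def is_conventional(message: str) -> bool:
    """Check only the first line (the header) of a commit message."""
    lines = message.splitlines()
    return bool(lines) and HEADER_PATTERN.match(lines[0]) is not None


for msg in [
    "feat(api)!: drop v1 endpoints",
    "fix: handle empty input",
    "Updated stuff",
    "feat:missing space",
]:
    print(f"{msg!r}: {is_conventional(msg)}")
```

A check like this covers the mechanics, but it does nothing about the friction described above: developers still have to decide on a type and scope for every commit.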


[Securely Perform RDS Backups and Restores from Prod | Cloud Posse Explains](https://www.youtube.com/watch?v=aQWQB5YXWSY)

[Terraform Similar Alerts with Different Thresholds | Cloud Posse Explains](https://www.youtube.com/watch?v=eT65kZkxu4w)

Question: How do you bootstrap IAM/service/machine roles for CICD and allow the repository to self manage? example: A repository of terraform files deployed with github actions, with n number of environments. A single IAM role is assumed by the action, how can I bootstrap this role and allow it to be updated by a workflow in the same repo? ( I asked this question in #terraform and received great responses, just for discussion )

I’m going to Re:Invent! CloudPosse meetup?

@Matt Gowie has a module for it https://github.com/masterpointio/terraform-aws-amplify-app (following cloudposse conventions)
A Terraform module for building simple Amplify apps.

Gitpod streamlines developer workflows by providing prebuilt, collaborative developer environments in your browser - powered by VS Code.

Codespaces has the full power of Visual Studio Code, including the editor, terminal, debugger, settings sync, and any extension.

@Erik Osterman (Cloud Posse) qq - in the above session you mentioned you folks have a way to create schemas for MySQL.
Could you please share if you are using https://github.com/hashicorp/terraform-provider-mysql ?
If so, how do you deal with the situation where your DB instance is in a private subnet (which it should be in a prod env)?
Sadly the above provider doesn’t work with an SSH tunnel, SOCKS, etc. Also the fact that credentials must be provided ….

The only way to handle this is either using a VPN or with GitOps and private runners (we use the latter)

Fair, that is what I have today: GHA self-hosted runners running as Docker containers on EC2 (account A), with peering to all the other accounts I need to connect to.

Ya we use a transit gateway to connect all the accounts, then have an “automation” account where we run things like GHA runners and Spacelift workers