#office-hours (2021-04)
"Office Hours" are every Wednesday at 11:30 PST via Zoom. It's open to everyone. Ask questions related to DevOps & Cloud and get answers!
[thread] CTO.ai slackops first class approach?

Code-hosting service GitHub is actively investigating a series of attacks against its cloud infrastructure that allowed cybercriminals to implant and abuse the company’s servers for illicit crypto-mining operations, a spokesperson told The Record today.

We’ve been hit

what Describe high-level what changed as a result of these commits (i.e. in plain-english, what do these changes mean?) Use bullet points to be concise and to the point. why Provide the justific…

Lame we can’t mark as spam

Submitting a spam report requires ~6 clicks and an explanation of why I’m submitting it as spam.

This is itself abusive.

Github is retarded. So crypto miners can open up as many of these spam PRs as they want.

I report them, and I am the one rate limited.


When adding custom metrics to our apps, I’m interested in how people structure any given metric. For example, we could post a single metric with tags to indicate status, but that makes calculations more complex and prone to error, whereas if we post the total and the error count, we can more easily get the ratio of errors. What are best practices in this area?

Have you checked out the docs?
As a rule of thumb, either the sum() or the avg() over all dimensions of a given metric should be meaningful (though not necessarily useful). - @ https://prometheus.io/docs/practices/naming
Having tags indicate status is not a good pattern.

Yes, But my question is about what related metrics to publish, not simply how to name metrics but that is

For example, we could post a single metric with tags to indicate status
What kind of status are we talking about?
Generally the pattern is to use an APP_UP metric to indicate whether the service is up or down (0 down, 1 up).
On top of that you add additional metrics (golden signals) and then anything specific to your app usage that makes sense.
Check out this book, I think it will answer many of your questions

if we post the total and the error count, we can more easily get the ratio of errors. What are best practices in this area?
You are exactly right here @Eric Berg: I recommend posting exactly those types of counts, categorized by _total and by type of error: _failed, etc.
And, in your binaries, avoid doing any math on those counts – for instance, don’t calculate the error ratio in your binary and expose that as a separate metric.
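A minimal sketch of the counting approach described above, written in plain Python rather than any particular metrics client. The metric names and the `record_request` and `error_ratio` helpers are hypothetical. The point is only that the app increments raw counters, while the ratio is derived later by whatever consumes them.

```python
from collections import Counter

# Hypothetical metric names, following the _total / _failed suffix idea above.
REQUESTS_TOTAL = "app_requests_total"
REQUESTS_FAILED = "app_requests_failed_total"

counters = Counter()


def record_request(failed: bool) -> None:
    """Increment raw counters only; the app never computes a ratio itself."""
    counters[REQUESTS_TOTAL] += 1
    if failed:
        counters[REQUESTS_FAILED] += 1


def error_ratio(samples: dict) -> float:
    """What the monitoring side could derive from the two exported counters."""
    total = samples.get(REQUESTS_TOTAL, 0)
    return samples.get(REQUESTS_FAILED, 0) / total if total else 0.0


# Simulate four requests, one of which fails.
for outcome in (False, False, True, False):
    record_request(failed=outcome)

print(dict(counters))
print(error_ratio(counters))
```

Keeping the division out of the app means the same two counters can be aggregated across instances or time windows before the ratio is taken.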

@here office hours is starting in 30 minutes! Remember to post your questions here.

I’m curious how people manage auto-rotation of IAM user access keys within Terraform, if time permits.

Basically I’m manually bumping a user creation module now. I had thought of creating a pipeline schedule that taints the resource and reapplies, but I’m curious what other people do.

what kind of key material or keypairs are these? (access key ID + secret access key) ?

Anyone use any of the CNCF tools that work on top of Envoy? If so, did it ease your workflow or make it more complicated? I see Contour, Curiefense, and Open Service Mesh as examples.

Is anyone using any type of IPAM software dynamically in Terraform, or do you have a way to define and slice from superblocks/supernets?

netblox, nipap, ryo aws service etc
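As a rough illustration of slicing blocks from a supernet without a dedicated IPAM tool, here is a sketch using Python's standard `ipaddress` module. The supernet, the /20 block size, and the environment names are assumptions made up for the example. Terraform's built-in `cidrsubnet` function does comparable arithmetic inside a configuration.

```python
import ipaddress

# Hypothetical supernet and prefix length, chosen only for illustration.
supernet = ipaddress.ip_network("10.0.0.0/16")

# Carve the supernet into /20 blocks; subnets() yields them in address order.
blocks = list(supernet.subnets(new_prefix=20))

# Hand out the first three blocks to hypothetical environments.
allocations = dict(zip(["dev", "staging", "prod"], blocks))

for name, block in allocations.items():
    print(f"{name}: {block} ({block.num_addresses} addresses)")
```

This only computes non-overlapping ranges. It does not record which blocks are in use, which is the part a source-of-truth tool provides.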

@Steven Hopkins I have been wondering the same thing about what companies are doing for an IPAM source of truth. For datacenter/on-prem we used PHPIPAM, but I have been looking at netbox since its Terraform provider is more full-featured, and it has come recommended. I have not heard of or seen nipap and ryo; I am going to give these a look this week.

nice, let me know what you think of ‘em all

Do you have a link to the aws ryo service? Is this an abbreviation for some obscure AWS service? My google-fu is only coming up with people named Ryo, especially “AWS Networking with Ryo Koyama” lol

I can tell you Device42 is great for Datacenters but horrible for AWS - I wrote a custom python script that fed Device42 info from AWS api requests that had all our datacenter and cloud stuff in 1 place. It wasn’t a great solution but worked decently well. If I had to do it over again I would just push harder for netbox. (I haven’t seen nipap or ryo either)

We had a demo from them about 6-8 months ago, most of our cloudops staff were a solid no, while our dc, network engineers and compliance team were a yes. We didn’t move forward with it.

Looks like AWS now has a native solution
Learn how to allocate a CIDR to a pool.


It looks like Google created an open-source repo to help track “the four keys”… https://github.com/GoogleCloudPlatform/fourkeys

coworker posted this, I think it’s actually not DORA though, it’s mostly Puppet

Yeah, it’s similar, but not the same thing.

@ikrnic @puppetize @jezhumble @jessfraz @alannapb @stahnma @nigelkersten We collab’d and I was principal investigator 2014-2017. DORA split and did our own report 2018-2019 (I was PI).
In 2020, DORA is led by @jezhumble (coauthors & Dustin Smith joined 2019); they’ve released some fab work like Quick Check and deep dives http://cloud.google.com/devops

question about the cloudposse terraform-aws-elastic-beanstalk-environment
…what I’m trying to do is configure the beanstalk-env loadbalancer to redirect HTTP requests to HTTPS
It doesn’t seem to be a config option of the module itself, so wondering what’s the best way to configure this?
One thought I had would be to find the arn for the load balancer from outputs and then modify it within a resource block, e.g.
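resource "aws_lb_listener" "front_end" {
  # placeholder from the original message: the load balancer ARN taken from the module outputs
  load_balancer_arn = "${find.arn.from.ebsEnv.output}"
  port              = "80"
  protocol          = "HTTP"
  # redirect every request on port 80 to HTTPS on port 443
  default_action {
    type = "redirect"
    redirect {
      port        = "443"
      protocol    = "HTTPS"
      status_code = "HTTP_301"
    }
  }
}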
…but then the above might stomp out other settings of the load balancer?
Another option could be to fork https://github.com/cloudposse/terraform-aws-elastic-beanstalk-environment and then PR a config option for redirectHttpToHttps
…any help/advice greatly appreciated
thx for the question @larry kirschner! might get a quicker response in #terraform on this one
ok thx for getting back…I found something I’m going to try which is this:
resource "aws_lb_listener_rule" "redirect_http_to_https" {
  # placeholder from the original message; this argument expects the listener's ARN
  listener_arn = "{lb arn}"
  action {
    type = "redirect"
    redirect {
      port        = "443"
      protocol    = "HTTPS"
      status_code = "HTTP_301"
    }
  }
  condition {
    http_header {
      http_header_name = "X-Forwarded-Port"
      values           = ["80"]
    }
  }
}
…if that doesn’t work will try that terraform channel. Thanks again for responding to my q!

…digging around I also found the lb_listener_rule resource, which looks promising:
…they have an example for HTTP => HTTPS…but it looks weird to me, because the condition isn’t PORT==80?
resource "aws_lb_listener_rule" "redirect_http_to_https" {
  action {
    type = "redirect"
    redirect {
      port        = "443"
      protocol    = "HTTPS"
      status_code = "HTTP_301"
    }
  }
  condition {
    # shouldn't condition be PORT==80 somehow?
    http_header {
      http_header_name = "X-Forwarded-For"
      values           = ["192.168.1.*"]
    }
  }
}



Hi all! Been a long-time listener to CloudPosse office hours and glad to be joining you on Slack

@Erik Osterman (Cloud Posse) If I can add a question to today’s discussion, how do people approach migrating existing AWS infrastructure into Terraform for large-scale projects with many resources?

@here office hours is starting in 30 minutes! Remember to post your questions here.

Question for discussion: Does anyone have a solid process for terraform state migrations in larger teams?
My largest client had an issue today where a newer infrastructure engineer did a bunch of terraform state mv migrations for work of his that hadn’t been merged upstream yet and it caused us to roll back a bunch of his state changes. I’d like to propose a better solution for them to do state migrations going forward and I believe I know how I would do it, but I’d like to see if anyone in this group has strong opinions or has already gone through the trenches with this type of problem before.

Thanks for the talk tonight! first intro to terraformer! and 1password automation given at a great time!


For the next office hours can we get a tutorial/demo on how the Cloud Posse README.md files are generated? I see the structure of the README.yml and the associated docs/terraform.md (which I assume is generated by terraform-docs markdown). I’d be thrilled to see how it all comes together in CI/CD.

If it’s not worth the office hours time, and there are docs/demos already available, just point me at those and I’ll proceed with due diligence.

In certain scenarios, we have had to bootstrap containers to handle variation in configuration files for different environments (e.g. staging / production). We have done so mostly with Docker entrypoints and confd or shell scripting, but only for simple scenarios.
Is there a better solution or anything you could recommend that would help avoid adding too many abstraction layers to container config management? (e.g. ansible pull)

gomplate documentation
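To make the entrypoint-time templating idea concrete, here is a small sketch using only Python's standard library. It stands in for the kind of rendering that gomplate or confd would do and does not reproduce either tool's syntax. The template contents, the variable names, and the `render_config` helper are hypothetical.

```python
import os
from string import Template

# Hypothetical config template; ${VAR} placeholders are filled from the environment.
CONFIG_TEMPLATE = Template(
    "log_level = ${LOG_LEVEL}\n"
    "database_url = ${DATABASE_URL}\n"
)


def render_config(env: dict) -> str:
    """Render the template; substitute() raises KeyError if a variable is missing."""
    return CONFIG_TEMPLATE.substitute(env)


if __name__ == "__main__":
    # Illustrative defaults only; real environment variables override them.
    env = {
        "LOG_LEVEL": "info",
        "DATABASE_URL": "postgres://localhost/app",
        **os.environ,
    }
    print(render_config(env), end="")
```

The same image can then run in staging or production with only its environment variables changed, and a missing variable fails loudly at startup instead of producing a half-filled config.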

@here office hours is starting in 30 minutes! Remember to post your questions here.


