#announcements (2018-07)
Cloud Posse Open Source Community #geodesic #terraform #release-engineering #random #releases #docs
This channel is for workspace-wide communication and announcements. All members are in this channel.
Archive: https://archive.sweetops.com
2018-07-01

Second training repo. No step-by-step instructions yet. If you have ideas or suggestions, please submit issues. https://github.com/firmstep-public/trainingdaytwo
trainingdaytwo - Day Two: Creating Terraform modules

The example module in demo_one is a working module for a web server in an asg, which I figured was a common desire for the first thing to template for a company.
2018-07-02

@rohit.verma I was able to reproduce that s3 path issue you were experiencing when mounting the s3 bucket: https://github.com/cloudposse/geodesic/pull/167 - once we have it CR’ed and tested will have a new release with this fix in it. Will keep you posted.
what: Add the absolute path to the s3fs script for mounting the s3 bucket via goofys. why: Fix a bug where a pathing issue causes any mount -a invocation to throw an error: /bin/sh: s3fs: not found

@mcrowe (and possibly @tamsky) also ran into it

just wondering why the mount command is not honoring the PATH

Hi Erik, we had a quick chat on Friday regarding ECS! I would love to know how you do it. The end goal for me is to be able to use 3rd-party deployment software, which most likely pushes a docker image based on a GIT-SHA; meanwhile, Terraform should be able to run and take the last active task definition (this part is simple). The harder part is when I want Terraform to be able to modify the environment variables while keeping the image of the last task definition.
The idea I now have is to use the length of the image variable to decide if it’s bootstrapping or keep_image:
# Initial bootstrapping with ecr/repo:tag
module "ecs_app" { image = "ecr/repo:tag" }
# after that, a change to keep_image
module "ecs_app" { image = "" (default) }
when image == "", I create data sources to get the current image definition and use that as input for the updated task definition resource.
Not sure if that would all work, and happy to hear your input on that.
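
A minimal sketch of that data-source approach (var.family and var.container_name here are hypothetical inputs):

data "aws_ecs_task_definition" "current" {
  task_definition = "${var.family}"
}

data "aws_ecs_container_definition" "current" {
  task_definition = "${var.family}:${data.aws_ecs_task_definition.current.revision}"
  container_name  = "${var.container_name}"
}

# when var.image == "", fall back to data.aws_ecs_container_definition.current.image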

roughly it breaks down like this:
1) deploy ecs service task with container definition. default the container to our default-backend (~ a 404 page).
2) use codebuild/pipeline to CI/CD all image deployments.
3) define one public ALB per cluster
4) use ingress rules (targets) to route traffic based on host/paths to backend services
the inspiration for the architecture comes from our experience with kubernetes and trying to make ECS feel more like it.

what I describe above is captured in this module:

terraform-aws-ecs-web-app - Terraform module that implements a web app on ECS and supporting AWS resources.

the CI/CD is here: https://github.com/cloudposse/terraform-aws-ecs-codepipeline (which swaps out the image & tag)
terraform-aws-ecs-codepipeline - Terraform Module for CI/CD with AWS Code Pipeline and Code Build for ECS https://cloudposse.com/

we programmatically generate the container definition JSON in this module: https://github.com/cloudposse/terraform-aws-ecs-container-definition
terraform-aws-ecs-container-definition - A Terraform module to generate well-formed JSON documents (container definitions) that are passed to the aws_ecs_task_definition Terraform resource

and our ALB modules are in: https://github.com/cloudposse/terraform-aws-alb https://github.com/cloudposse/terraform-aws-alb-ingress
terraform-aws-alb - Terraform module to provision a standard ALB for HTTP/HTTPS traffic
terraform-aws-alb-ingress - Terraform module to provision an HTTP style ingress rule based on hostname and path for an ALB using target groups

oh, and our “default-backend”

default-backend - Default Backend for ECS that serves a pretty 404 page

Does Codebuild allow cross-account ECR repos by now?

we have 1:1 codebuild/codepipeline/ecr/web app

so a different build image for your staging than for prod?

aha, for now, we have not orchestrated that.

in our case, we would promote an image to the prod repo

but we have not done that yet

(we’re pretty early in our ECS journey as most of what we use is k8s)

let me correct that

we currently would rebuild it

but the way I would want to solve it eventually is to promote the image between ECR repos

still reading through it..

I see this working when having an image name like repo:latest, but not for repo:unique_id - or am I missing something?

@sarkis where is an example of our build spec?

we do set a tag

so for example, it’s possible to only deploy tagged releases

then it would be repo:tag

Contribute to docker-php-poc development by creating an account on GitHub.

We should add this example to the web app repo docs

- I’ll get to this by EOD … added to my tasks for the day

(we never pin to latest)

printf '[{"name":"%s","imageUri":"%s"}]' $CONTAINER_NAME $REPO_URI:$IMAGE_TAG > imagedefinitions.json

@maarten I thought you were doing something similar to this

so it all comes down to how IMAGE_TAG is computed

yes, but after the building, how is terraform aware of the new IMAGE_TAG, I’m not seeing that.

oh, the lifecycle of the image:tag is not the job of terraform

this is our concession

terraform is strictly responsible for deploying the infrastructure that powers the service

monitoring

autoscaling

iam permissions, etc

so i think we ignore changes, right @sarkis?
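
Sketched out, that concession looks roughly like this (illustrative names, not necessarily the module’s actual code):

resource "aws_ecs_service" "default" {
  name            = "${var.name}"
  cluster         = "${var.ecs_cluster_arn}"
  desired_count   = 1
  task_definition = "${var.task_definition_arn}"

  # let the CI/CD pipeline own the task definition revision after the first apply
  lifecycle {
    ignore_changes = ["task_definition"]
  }
}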

I agree there, but for me ENV VARS are a sort of grey area

SSM

+chamber

Yeah at Blinkist we’re using SSM

(it doesn’t resolve how to “rollback” envs, but we’ve also conceded that we won’t solve for that)

But another customer doesn’t want SSM or doesn’t want a wrapper inside his Docker for that..

So I thought, maybe I can find a way to deal with that using container def. datasources

i agree that I don’t like the wrapper inside the container as the entrypoint, but it’s become the necessary evil to reduce complexity with terraform.

wrapper = chamber

would be nice to have ENV VARS defined for the ecs service instead, problem solved

something else, which is beautiful, happens if you use SSM though…

call out to @jamie for introducing us to this pattern


terraform-aws-ssm-parameter-store - Terraform module to populate AWS Systems Manager (SSM) Parameter Store with values from Terraform. Works great with Chamber.

you can provision those SSM parameters from outputs of your other modules

users, passwords, hosts, etc
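
For instance, a minimal sketch (the module output name is hypothetical) that writes a database endpoint where chamber can read it:

resource "aws_ssm_parameter" "db_host" {
  name      = "/app/db_host" # chamber reads keys under /<service>/
  type      = "String"
  value     = "${module.rds_cluster.endpoint}"
  overwrite = true
}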

yeah I’ve seen that, super cool and will use it

what’s the customer’s counter argument?

…we’ve even started using chamber with kubernetes in place of configmaps and secrets. it makes it much, much easier to manage ACLs+IAM using IAM roles

Another thing in the chain they don’t know.. I was probably not convincing enough.. first customer after my current employer ..


haha, yea, understood - in the end, if you overwhelm them with all the pros, I think the cons are very minimal.

the ECS envs are also not encrypted

ok, outside of environment variables you still have CPU and MEMORY definitions

that is an actual terraform argument I think

maybe not for Fargate

that is an actual terraform argument I think
can you elaborate

inside the task definition you define the cpu and memory for a task

Of course you can set these vars during deployment

but is this something any developer should do in some conditions, or rather have memory/cpu centrally orchestrated

aha, yes, but i don’t think this solution precludes that

we set some defaults

but the buildspec.yaml can also override them

printf '[{"name":"%s","imageUri":"%s"}]' $CONTAINER_NAME $REPO_URI:$IMAGE_TAG > imagedefinitions.json

just add memory to that

no?

you’re right. I’m thinking from a perspective where the CI is circleci and not managed by terraform..

aha

yes, i can see that some stuff gets more complicated that way

But still, circleci can invoke codebuild

i guess

or a lambda which does nothing but deployment, and terraform manages the parameters for the lambda.

yea, lambda is the ultimate escape hatch. can probably accomplish it that way.

That also saves me distributing access keys which can stop services.

Do you have a tool for remote command execution, fixing a failed migration etc?

Thanks anyway, I didn’t find the answer to my question, but it made it clear that when I control the deployment of ecs I can still control the container definition, be it codebuild or not.

yep! no problem. i think we went down a similar path until we resolved that it wasn’t feasible at this point in time with terraform.

hrmm… the only migrations we’ve done so far happen in k8s and then we’re able to exec into the container to perform manual remediations.
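
e.g. something along the lines of kubectl exec -it <pod-name> -n <namespace> -- /bin/sh to get a shell in the running container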

@jamie might have some tips

ok because the new guy replacing me is working on my old tool which does this:
- Takes current running task, properties, security groups
- Creates keypair
- Starts EC2 ECS Instance with keypair
- Starts task [same iam etc]
- SSH into EC2 instance, creates a socket forward for /var/run/docker.sock (this is so cool)
- docker exec into task

that sounds pretty cool

on-demand ssh

i know this is a pattern promoted by Teleport SSH (gravitational)

but haven’t seen it in practice yet with ECS

https://github.com/blinkist/skipper it’s dormant now, also because my golang skills are .. but I’m sure he’ll be able to make something nice of it
skipper - Maintenance tool for moving docker containers on ECS.

Anyway, I’ll keep you updated on it; for now its focus is on regular EC2, with SSH access already in place.
For conditions without VPN we can maybe also add a network load balancer to allow outside access to internal ssh

yea, what I think could be neat is to have something like this:

bastion - Secure Bastion implemented as Docker Container running Alpine Linux with Google Authenticator & DUO MFA support

that is deployed on demand into the cluster for triaging

e.g. fixing failed migrations

Well, it would be nice to be able to completely log it to an S3 bucket, with per-user keypair generation for the EC2 instance (for which it needs MFA). The extra MFA is probably overkill.

but is ec2 instance access necessary?

what i like about using containers is that it’s still isolated

I like that too, but when things break someone wants to have access I suppose..

One question, how quick is codebuild now with regard to booting up?

It’s pretty quick now, but as you can see from the code for the “test app” we are using - it’s really basic.

@sarkis can maybe answer this

The idea I have now is this
- Have CircleCI test & build , a lot of startups here are using Circleci
- After build and push to ECR, push textfile to S3 in dev/uat/prod environment with image:tag
- Codebuild just pushes to ECS
- CircleCI loops&polls codebuild result, finishes

Yep, that sounds like a good solution to a common use case

@Erik Osterman (Cloud Posse) https://github.com/cloudposse/geodesic/pull/168
What it is: Previously this used an old spec that caused newer installations of this chart to fail. storageSpec has since been updated to just storage. See: coreos/prometheus-operator#860 (commen…

One question regarding terraform-aws-ecs-web-app: have you ever had issues with the ecs_service being created before the listener_rule was added to the target_group? I don’t see this dependency being enforced in terraform-aws-ecs-web-app; I ran into this quite a lot and it forced me to work around it.

@sarkis

@maarten we do have that issue - since it’s a one time problem (cold boot) - we are just running terraform apply twice for the time being. I’d like to at some point dig into the provider and see if there is something to be done there, before trying to hack this with depends_on statements

Clarify what you mean by twice

I think what you mean is two phases

Not twice as retrying after failure :)

well it fails then

depends on definition of fail.

But the hack to mitigate it is .. kind of ugly..

How did you get it working?

terraform-aws-airship-ecs-service - Terraform module which creates an ECS Service, IAM roles, Scaling, ALB listener rules.. Fargate & AWSVPC compatible

so the listener rules of the alb module are output

then passed in as input to the ecs service module

then a null resource .. doing nothing

and aws_ecs_service with a depends_on
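
Roughly, a minimal sketch of that trick (names hypothetical):

resource "null_resource" "alb_listener_rule" {
  # exists only to carry the dependency on the listener rule
  triggers {
    listener_rule = "${var.alb_listener_rule_arn}"
  }
}

resource "aws_ecs_service" "default" {
  # ... service arguments ...
  depends_on = ["null_resource.alb_listener_rule"]
}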

clever

but maybe the null_resource could be removed - create a local, and add the local to the count = inside the ecs_service, I think now

might be a computed count issue

well, we don’t have to count it, just evaluate it in someway

hm

so @maarten looking over your ecs_service/ in more depth - it looks like you need 2 tf runs as well right?

nope

where ?

i guess i’m not seeing where lb_attached gets changed

oh nvm i see how you make it wait with null_resource

lb_attached is just input for if it is a worker or a web service


The ugliest hack you will find here: https://github.com/blinkist/terraform-aws-airship-ecs-service/blob/master/modules/ecs_task_definition/main.tf - search for my_random. The module has as input key-value pairs which afterwards are turned into Name:/Value: pairs for the environment variables of the task definition.. I found out that when “true” runs through a null_resource it will be cast to a 1 ..
terraform-aws-airship-ecs-service - Terraform module which creates an ECS Service, IAM roles, Scaling, ALB listener rules.. Fargate & AWSVPC compatible

I have no idea how I could fix this otherwise, input very much welcome

yea typecast hell is real in TF

we did something similar for converting TF -> json

i mean not similar but a similar typecast issue

let me find it - it might make you feel better about your hack

I think the ecs_service can be done without null_resource: add another empty item to the list, grab that item from the list, call it empty_string, and use that empty_string somewhere in ecs_service

terraform-aws-ecs-container-definition - A Terraform module to generate well-formed JSON documents (container definitions) that are passed to the aws_ecs_task_definition Terraform resource

lol

i call that “cost of doing business with terraform”

price we have to pay

sadly - i was reading through HCL2 didn’t see anything specifically about the typecasting

there was a vague mention that types work better or something

Do you know if your modules will break yes/no ?

that’s also very vague right now - i think they are waiting for the community to do the dirty work

Wonder what is smart then, just create a new one calling it module-hcl2 and go from there

i was thinking a new branch to start out

for example in the ecs_service part i now have 4 x ecs_service with conditionals.. with hcl2 this can be compacted to just one

but that might be too optimistic

ah yea i hear you though - it’s going to assume you can use the new tf version everywhere too

can prob fix this with tags though… go to a new major release for HCL2 (in your modules)

and then a note in the readme

sounds like they are going to support both languages initially, no?

so that provides an upgrade path

so modules can lay claim to the legacy provider until upgraded

hmm how would that work? oh just depends on what provider version you lock?

but i’m definitely nervously biting my nails right now hoping that it won’t be too painful

we have something like 70 modules

yea and i’m certain we do some interesting workarounds / hacks that are going to be fixed/deprecated in the future

Night ttyl


goodnight!
2018-07-03

Good morning… new to the channel and not sure where I should / if I should be asking questions here… A while back I saw you guys do a presentation alongside Codefresh… I remember being very impressed with the deployment pipeline you guys had set up, and I am trying to get something of my own set up. I am curious how you guys connect git tags and pull them through to your dockerhub registry

I just found this page …. seems like I am on the right track https://docs.cloudposse.com/release-engineering/cicd-process/semantic-versioning/

@cbravo I can share more details a little later today

Currently afk

Yes that’s a good place to start

@Erik Osterman (Cloud Posse) Thank you very much…. I am also scoping out this repo (https://github.com/cloudposse-demo/demo-catalogue) and the build harness repo
Contribute to demo-catalogue development by creating an account on GitHub.

we’ve also iterated a lot since that demo

would love to get you to try out the new stuff

all of our new stuff uses helmfile

are you familiar with that?

I am not

it simplifies a lot of stuff around working with helm

we are just getting our feet wet with helm but the particular project I am currently focused on is just a deployment image (it has aws cli in it and some other tools we use…) and I am trying to come up with a way to keep the git tag inline with the tags in docker hub without having to do a bunch of automated steps

codefresh gives you the git hub short revision but no access to the git tags

aha, ok - that is a simpler use-case

we are slowly ramping up our knowledge of helm and kubernetes but we aren’t there yet

terraform-root-modules - Collection of Terraform root module invocations for provisioning reference architectures

here’s a more simple codefresh pipeline

this builds a docker image and tags it with the semantic version generated in a previous step

it’s also an example of pushing the image to multiple registries

Any Datadog users here ?

i think we all have a bit of experience (cloudposse org)

I’ve not seen a good timeboard module yet, and with locals the population of multiple graphs can be fixed .. at least it looks like it

i personally have less than others here with datadog - my use case was limited to monitoring kafka

@Erik Osterman (Cloud Posse) so is the build harness something that the public should be using? like should I consider using your build harness? or is it something that could change without warning

or really something I should just consider taking pieces of and rolling my own solution based off it

we recommend to all of our customers to use the build-harness

it’s well maintained and we tag all releases. so if it suits your needs, by all means, leverage it.

especially with codefresh, the build-harness makes a lot of sense since every step runs in a container

@cbravo let me know if it would be helpful to take a look at what you have

and if we are not currently customers?

only share what’s not subject to an NDA


you can DM me

I am 200 messages behind this chat. Sorry I’ve been afk guys.

you snooze you lose


Totally!


If anyone wants to take a look at the basic structure of a datadog timeboard module, feel free to comment.
https://github.com/maartenvanderhoef/terraform-datadog-timeboard/blob/master/examples/main.tf
https://www.terraform.io/docs/providers/datadog/r/timeboard.html
Problem of the datadog_timeboard was that it’s set up like the cloudfront resource.. many blocks one after another inside one resource.. but now with locals it can be modularized a bit. I wanted to be able to create graphs separately from creating the actual timeboard by creating 2 modules, and this seems to be working.
Contribute to terraform-datadog-timeboard development by creating an account on GitHub.
Provides a Datadog timeboard resource. This can be used to create and manage timeboards.

@dave.yu @Daren maybe something interesting for you guys

cool will take a look


anyone know how to get the version of a terraform module programmatically? e.g. ${module.version}

use-case: I want to download artifacts from github release corresponding to the version of the terraform module

(e.g. a lambda zip file)

this appears to work:

variable "git_ref" {
default = "tag"
}
data "external" "example" {
count = "${var.git_ref == "tag" ? 1 : 0}"
program = ["git", "-C", "${path.module}", "tag", "--points-at", "HEAD", "--format={\"ref\": \"%(refname:lstrip=2)\"}"]
query = {
}
}
output "ref" {
value = "${join("", data.external.example.*.result.ref)}"
}

minor nit: even with a data.external.example input or output, we can’t use either to instantiate the module: Terraform does not support interpolations in the source parameter of a module

outputs:

ref = test-0.1.1
2018-07-04

Hey @Erik Osterman (Cloud Posse) – been looking at aws-cloudfront-s3-cdn. Do you guys have a strategy for a javascript bundled webapp deployed to dev/test/prod via codepipeline?

I’m wondering:
1) Github -> hook -> CodePipeline (builds app via webpack) and pushes to dev
2) Q/A approves, time to move to test/uat
3) (?????) push dev artifacts to uat
4) Customer approves live
5) (?????) pushes uat artifacts to prod

we usually use tags for gate control

so branches = dev

merge to master = pre-production

tags like release- go to prod

we have implemented this with codefresh

not yet with codepipeline

Maybe I’m over-thinking it. Maybe each one is a codepipeline task off a branch

There’s a seriously small cap on the number of pipelines you can make. But because you can pass a lot of data to them you can do a lot. So I suggest making your pipeline as generic per task as you can and making it work for you.

So what you describe is a pretty common pattern. A few of our customers do exactly that. We haven’t packaged that up as a terraform module. Most of them use Codefresh for CI/CD, and then use

terraform-aws-cloudfront-cdn - Terraform Module that implements a CloudFront Distribution (CDN) for a custom origin.

terraform-aws-s3-website - Terraform Module for Creating S3 backed Websites and Route53 DNS

Here’s a reference implementation https://github.com/cloudposse/terraform-root-modules/tree/master/aws/docs
terraform-root-modules - Collection of Terraform root module invocations for provisioning reference architectures
2018-07-05

hey folks, quick question on the prometheus-cloudwatch exporter. When i try to follow the kubernetes instructions (https://github.com/cloudposse/prometheus-to-cloudwatch) the prometheus-to-cloudwatch pod fails to deploy, saying the image is unavailable. I’m running helm install . from within the charts subdirectory so I’m not sure what to do next.
prometheus-to-cloudwatch - Utility for scraping Prometheus metrics from a Prometheus client endpoint and publishing them to CloudWatch

@tom thanks for the report. I will check into this a bit later today.

Hello @tom - do you have some more info you can share, namely helm/tiller version while we look into reproducing the issue?

@Andriy Knysh (Cloud Posse) are you are around to take a look?

@sarkis probably just wrong image tag

It should correspond to the latest release in the repo

ah yea

i see an old version here: https://github.com/cloudposse/prometheus-to-cloudwatch/blob/master/chart/values.yaml#L21
prometheus-to-cloudwatch - Utility for scraping Prometheus metrics from a Prometheus client endpoint and publishing them to CloudWatch

i’ll check in after i push through the few tasks i have today - if you haven’t looked yet I can dig into it..

@tom

Ah yeah that was it. Thanks for your help

great!

For our use case, it’s so we can download a zip artifact for a lambda function from the module GitHub release page.
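
A sketch of how that ref could be consumed (module name and URL layout are hypothetical):

locals {
  artifact_url = "https://github.com/cloudposse/example-module/releases/download/${module.example.ref}/lambda.zip"
}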

what: Do not set Name and Namespace. why: When we try to set the tags “Name” and “Namespace” the second deployment fails (the first one is ok). example: * module.front-elastic-…

@jamie any idea how we can change this to only apply Name and Namespace on first apply?

apparently, all applys after the first one fail if these are set

Yo

just a moment ill look

thx

well, yeah

lifecycle { ignore_changes = ["tags"] }

might do it

ah, yea, I think that’s a better fix

I’m not set up to test that it will fix it though

are you able to?

i’ll ask that they try it out

if not… then it’s possible to do their fix… or strip “Name” and “Namespace” tags back out

but i think their solution is .. tidier than removing two map items from a map


by having a data lookup the existing tags on the elb

and only apply the extra tags

ok, my response: https://github.com/cloudposse/terraform-aws-elastic-beanstalk-environment/pull/37#pullrequestreview-134777928
what: Do not set Name and Namespace on Elastic Beanstalk Environment. why: When we try to set the tags “Name” and “Namespace” the second deployment fails (the first one is ok)….

since you can reference resources directly

so you could do { if lookup(this.resource.tags, "Name", "NAME_NOT_EXISTS") != "NAME_NOT_EXISTS" } add full list of tags { else } add subset of tags { end }

^pseudocode

Thanks for thinking of me


yea, that will simplify so many things in complex terraform modules

thanks!

As part of the lead up to the release of Terraform 0.12 later this summer, we are publishing a blog post each week highlighting a new feature. The post this week is on first-class …

i should start constructing a sed expression to replace “${var…}” with var… and post to the cloudposse blog

no need i think

they will have a conversion method

oh like a HCL 1->2?

so all the standard stuff will convert as easily as something like terraform fmt .

yea

nice

thats true this should be a fmt change

(also, I’m not sure if fmt is the action they will use, but I read that they will provide a means of automatically upgrading code)
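
For illustration, the kind of mechanical rewrite being discussed:

# Terraform 0.11 (HCL1) interpolation
name = "${var.name}"

# Terraform 0.12 (HCL2) first-class expression
name = var.name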
2018-07-06

https://github.com/kubernetes/kops/issues/2537 (via @Daren)
I noticed a few instances where if a pod is hung in ContainerCreating state, or some other state and won’t go into Evicted state Kops hangs forever waiting for it during a rolling-update.

@Max Moon @dave.yu heads up

there’s an issue with kops 1.9 where it has trouble detecting failed pod evictions

Good looking out, thank you!

We hit it on every node with nginx-ingress-controller

Dang, I upgraded worker node size yesterday and fortunately it went smoothly
2018-07-09

@dave.yu @Max Moon heads up: https://github.com/cloudposse/geodesic/pull/172/
what: Replace sets with inline values. Rewrite values files with inline values (except files with comments that used to override values). why: #169

we’re planning on merging this soon.

it uses inline values.yaml to make it easier to maintain

Thanks @Erik Osterman (Cloud Posse)
2018-07-10

Hey i have a tip for you guys when dealing with ecs

in many cases the service, the containers, and the metrics all want the cluster name, not the arn

but the resource doesn’t provide a name, just the arn

but you can do this: locals { cluster_name = "${basename(aws_ecs_cluster.default.arn)}" }

to get the name

that’s a good one! like the use of basename(...) for this

I was doing
locals { cluster_name = "${element(split("/", aws_ecs_cluster.default.arn), 1)}" }
Before.

really nice tip - thanks @jamie i was trying to figure out last week how to get the name with just the ARN (was looking at data source, but some sources don’t have name available :()

I thought it would be up your alley

I have grown to dislike “terraform_remote_state” data provider

hmmm - not good in practice?

how come?

It requires a lot of details to collect details from a remote state

It has meant that when writing code that depends on other modules

you have to pass in “workspace”, “s3 bucket name”, “state path”, and use shit like:
### For looking up info from the other Terraform States
variable "state_bucket" {
  description = "The bucket name where the shared Terraform state is kept"
}

variable "state_region" {
  description = "The region for the Terraform state bucket"
}

variable "env" {
  description = "The terraform workspace name."
}

locals {
  state_path = "${var.env == "default" ? "" : "env:/${var.env}/"}"
}

### Look up remote state info
data "terraform_remote_state" "vpc" {
  backend = "s3"

  config {
    bucket = "${var.state_bucket}"
    key    = "${local.state_path}vpc/state.tfstate"
    region = "${var.state_region}"
  }
}

That locals hack to get around the non-conformance that it has with workspaces

hrmm… yes… fwiw, here’s what we’ve recently done

data "terraform_remote_state" "backing_services" {
backend = "s3"
config {
bucket = "${module.identity.namespace}-${module.identity.stage}-terraform-state"
key = "backing-services/terraform.tfstate"
}
}

we have a module that keeps track of a lot of constants

that can be reused
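
i.e. roughly something like this hypothetical “identity” module:

variable "namespace" {
  default = "cp"
}

variable "stage" {
  default = "prod"
}

output "namespace" {
  value = "${var.namespace}"
}

output "stage" {
  value = "${var.stage}"
}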

i think just stay clear of workspaces

we don’t love it

Workspaces were so good when I discovered them

yea because you’d think it would help DRY

With one command I could switch from the ecs container task that has jenkins deploying, to the nginx container, to the node container

all the same code, but dif vars

Yeah, but when it comes to state management and workspaces

there is a lot to be desired

If only terraform_remote_state wasn’t the only data source with a default

I’d be using parameter_store, or an api call.

/whinge

One last gripe….!

I would prefer that blocks that only have one entry in them were put all in one line when you formatted.

haha devops therapy sessions

how does that make you feel?

resource "aws_thing" "default" {
tags = { managedby = "Terraform"}
}

output "thing" { value = "the thing i mentioned" }

IMO, I don’t like that. I agree it’s concise, but in most language frameworks I’ve dealt with, they are strict about how braces are used. Typically, they enforce one of:
if (...) {
....
}

or

if (...)
{
...
}

but never

if (...) { ... }


Where’s the Terraform yaml option!

Okay… gripe done ;-D

heh, would be interesting to see what TF would look like in YAML

data:
  template_file:
    example:
      template: '${hello} ${world}!'
      vars:
        hello: goodnight
        world: moon
output:
  rendered:
    value: '${data.template_file.example.rendered}'

variable:
count:
default: 2
hostnames:
default:
'0': example1.org
'1': example2.net
data:
template_file:
web_init:
count: '${var.count}'
template: '${file("templates/web_init.tpl")}'
vars:
hostname: '${lookup(var.hostnames, count.index)}'
resource:
aws_instance:
web:
count: '${var.count}'
user_data: '${element(data.template_file.web_init.*.rendered, count.index)}'

AKA

data "template_file" "example" {
template = "${hello} ${world}!"
vars {
hello = "goodnight"
world = "moon"
}
}
output "rendered" {
value = "${data.template_file.example.rendered}"
}

and

variable "count" {
default = 2
}
variable "hostnames" {
default = {
"0" = "example1.org"
"1" = "example2.net"
}
}
data "template_file" "web_init" {
// here we expand multiple template_files - the same number as we have instances
count = "${var.count}"
template = "${file("templates/web_init.tpl")}"
vars {
// that gives us access to use count.index to do the lookup
hostname = "${lookup(var.hostnames, count.index)}"
}
}
resource "aws_instance" "web" {
// ...
count = "${var.count}"
// here we link each web instance to the proper template_file
user_data = "${element(data.template_file.web_init.*.rendered, count.index)}"
}

can’t help but feel like it’s “cloudformation” when seeing it in YAML format

it feels like they just “borrowed” the golang approach to { }

now thats a nice positive rant

i’ve noticed he does these multiple tweets a lot lol

that’s cool
2018-07-11

I’m mid module writing for handling ecs instance draining for spot instances, but I want it to handle ASG standard instances as well…

so now it looks like im making another module for my module

how are you implementing the draining?

Modifies the status of an Amazon ECS container instance.

Good question

boto3

on asg instance termination sns alert

i get the instance id

And currently i grab the list of all ecs clusters, then in each cluster, i list all container_instances, and then i compare my instance id to the container instance id.

Once I have that i change state to draining

I don’t know how heavy those requests are when there are 100 clusters with 10000 instances on each

but it works for mine

I expect a faster lookup would be: tag each instance on creation with its cluster name

then, use the tags to get the cluster name from the id.

But then it involves an extra step on creation.

Whereas this way, i don’t need access to the ASG resource, just the asg name
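
One way the termination notice could be wired up - a hedged sketch that assumes the SNS topic and hook IAM role already exist:

resource "aws_autoscaling_lifecycle_hook" "drain" {
  name                    = "ecs-drain"
  autoscaling_group_name  = "${var.asg_name}"
  lifecycle_transition    = "autoscaling:EC2_INSTANCE_TERMINATING"
  heartbeat_timeout       = 900
  notification_target_arn = "${var.sns_topic_arn}"
  role_arn                = "${var.hook_role_arn}"
}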

I am still in POC stage with it really

I can see many ways to smooth it out, break it in to smaller reusable chunks and so on

but … maybe once I get the functions and permission sets all working

I can split the modularisation of it with Sarkis or something

Well, it works

terraform-aws-ecs-launch-template - Terraform module for generating an AWS Launch Template for ECS that handles draining on Spot Termination Requests

It now also handles draining on scale in

Next it likely needs the lambda function rewritten to trim all of the junk off it, and set up a step function, so that a 5 or 10 second delay can be added between checking if the tasks have drained.

at the moment it just loops with no delay, between sns -> lambda -> sns ->lambda etc

until it finishes draining the tasks

@jamie the upstream tf PR was merged to support the launch templates?
2018-07-12

Not for spot fleets.

So while waiting for that, I made the template module handle both spot fleets, asg on demand, and asg spot.

That’s just the launch template module

The other module which I’m making with you in mind is the auto scaling spot fleet one.

And that requires the merge first


Your typical cloud monitoring service integrates with dozens of service and provides you a pretty dashboard and some automation to help you keep tabs on how your applications are doing. Datadog has long done that but today, it is adding a new service called Watchdog, which uses machine learning to …

I like writing modules because it makes me do less work in the long term.

Question for the ones who worked with K8S before, I have not really. How does it compare to ECS for you? I see small startups offering jobs to devops for their to-be-configured k8s cluster (on AWS) and I really think, why not just ecs? It’s so much hype.

feature velocity of k8s is insane. huge ecosystem of tools, much larger than for ECS. IMO, it’s easier to deploy complex apps on kubernetes than on ECS. ECS needs something like Helm. Not sure if “terraform” is the answer for that. ECS has evolved leaps and bounds from when I first looked at it (e.g. DNS service discovery), but it still feels rather primitive coming from a kubernetes background. I think one big thing ECS has going for it is that it is simpler, but in being simpler it lacks a lot of the cool features of kubernetes.

charts - Curated applications for Kubernetes

There’s no significant library of apps for ECS like there is for kubernetes. That to me is a little bit of a red flag.

(that I’m aware of)

True, but tbh, how many of the cool features do you really need. And will some cool features like persistent storage let people make decisions which are contrary to the reason for moving to AWS, like, having AWS taking care of your stateful services.

The idea is to represent as much of the infrastructure as possible so that it can be easily deployed without involving ops

with ECS, that’s not really possible.

with k8s, you can almost do everything but IAM roles/policies

volumes, load balancers, ingress, namespaces, secrets, configmaps, replicasets, deployments, pods, etc.

automatic TLS

automatic DNS

so basically a single helm chart is capable of provisioning a web app with a dynamic TLS certificate, automatic public DNS registration, pulling secrets and exposing them as envs, mounting configmaps as files on the filesystem, provisioning EBS volumes for scratch storage, and more…

we use these features all the time.

CI/CD of this is very easy with Kubernetes

few orgs actually do CI/CD of terraform

ok, but when connected to AWS in ways of networking, iam, .. then you do get terraform after all

persistent storage doesn’t have to be “persistent”

but you need it for big data applications like cassandra or HDFS

and for staging environments, we run apps like postgres

so having attached storage is necessary since the host machine is limited

so kubernetes cannot exist without terraform

that’s why it’s so central still to everything we do visible in our github

but kubernetes is headed in the right direction

I’d love to see a kind: Terraform in kubernetes

then it would truly allow everything

for what resources then ?

IAM roles, policies

RDS databases

elasticache instances

EFS filesystems

etc.

…so fully managed services

Here’s a POC presented at HashiConf: https://github.com/kris-nova/terraformctl
terraformctl - Running Terraform in Kubernetes as a controller

The other thing about k8s is that it’s more like a framework (like “rails”) for how to do cloud automation. It provides the interfaces, scheduling, state, service discovery. Then makes it extensible to add anything else on top of it. So for more complex applications, e.g. “postgres”, in order to run it in a containerized environment, you need an “application aware” controller. Something that knows instinctively how to manage postgres. How to do updates, upgrades, rollbacks, etc.

So people are developing operators like this: https://github.com/CrunchyData/postgres-operator
postgres-operator - PostgreSQL Operator Creates/Configures/Manages PostgreSQL Clusters on Kubernetes

etcd-operator - etcd operator creates/configures/manages etcd clusters atop Kubernetes

for running complex applications that aren’t just stateless webapps

Sure, but having seen a few meetups with pains of guys running stateful inside k8s..

Didn’t we move to AWS to not have state anymore

i agree that not dealing with state is the “ideal” situation

and we encourage our customers to push that off as long as possible

but I would rather have a platform capable of handling that in addition to all the stateless apps

at somepoint, someone needs to manage the state.

aws doesn’t provide state management for every application

thus some apps do need to handle that.

true, although, there are so many providers who offer services through vpc peering now

mongo atlas for example

yea, i would love to see more of that

i think that’s a cool direction

also, we’re still in the wee early days of k8s, but look how far & fast it’s come?

totally, in the end this is about not having 24/7 shifts, and not needing a 100% devops

With ECS I think this is possible, with self managed k8s, maybe less

yea, so back to perhaps your original question

but I don’t know k8s enough

to really judge there

companies who don’t have any dedicated devops, i would recommend considering ECS


smaller shops, with 1-4 people, ECS is probably better/simpler.

also with fargate then.. just great

i came from 80, but also only devops .. so ECS is perfect then

i think we’re coming at this from 2 different backgrounds. i spent the last 3 years working with k8s and 2 months with ECS.


so I don’t yet fully appreciate perhaps ECS.

you will, let’s just wait for that one k8s update hehe

haha

i really want to start working on our TF modules for EKS.

having deployed both - i do prefer k8s - something about it just appeals to me - i think it’s mostly the fact that i can do everything out of the box via command line - i.e. kubectl

def possible for ecs i bet but aws-cli is meh

for example - so easy to cat out the logs for ingress and describe a pod etc… for ECS i still find myself in the AWS web console - i know I am doing it wrong - but aws doesn’t make it a no brainer task like k8s for cli equivalents

so in a 20 developer situation how many people have kubectl and can do damage ?

RBAC addresses that concern

also, you can give those developers carte blanche for a namespace, so they can triage their own stuff.

speeds up iterations, removes bottlenecks

yea good point - not ideal in production

and imo kubectl is the emergency hatch - so whoever is dealt the devops card in the 20 dev group

large companies definitely are dealing with it though and as far as I know, love the RBAC support within kubernetes.

ok

i.e. knows what they are doing

hehe

As part of the lead up to the release of Terraform 0.12 (https://www.hashicorp.com/blog/terraform-0-1-2-preview), we are publishing a series of feature preview blog posts. The post…

@Erik Osterman (Cloud Posse) uploaded a file: Pasted image at 2018-07-12, 6:49 PM

This is awesome, but I find the formatting awkward without additional indention

Kubernetes and related technologies, such as Red Hat OpenShift and Istio, provide the non-functional requirements that used to be part of an application server and the additional capabilities described in this article. Does that mean application servers are dead?
2018-07-13

@Erik Osterman (Cloud Posse) /@sarkis Within the geodesic wrapper we are publishing the geodesic port and binding kubernetes_api_port to it - can you tell why we are doing this?

also how can I proxy something out of container to host

say for example kubernetes-dashboard

Yep! This is for kubectl proxy

So you can do exactly what you want to do

Also, for dashboard you can do something else

But I am on my way to bed. Can demo our portal for you tomorrow or next week

It uses bitly oauth2 proxy

kubectl proxy --port=0.0.0.0:8080

it seems to not be working as expected, so if I do kubectl proxy --port=$GEODESIC_PORT

From inside geodesic

No

yes thats what i am doing

okay

You are not binding to 0.0.0.0

By default it is 127.0.0.1

Docker port forwarding does not work to local host

Actually, arg is diff

kubectl proxy --port=0.0.0.0:8080 gives invalid port syntax exception

--address=0.0.0.0

okay

sorry, on phone so hard to type

thats working

thanks

I have Kubernetes running on a VM on my dev box. I want to view the Kubernetes dashboard from the VM host. When I run the following command: kubectl proxy --address 0.0.0.0 --accept-hosts ^/.* …

Might need to add this too
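
i.e. the full invocation probably ends up something like:

kubectl proxy --address=0.0.0.0 --port=8080 --accept-hosts='^.*$'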

I will document this too

It’s a good question

Anyways thanks a lot, it was an instant resolution, cheers

Haha welcome! :)

Hi, anyone with spare time who wants to help me out with something? I’m passing a list with maps to a resource. This works as long as there is no interpolation happening with a variable from an external source. When there is, it fails and the resource complains certain keys are missing from the map.
But when I output that structure, the structure is the same, just in a different order; I can’t figure out what is wrong with it.
https://gist.github.com/maartenvanderhoef/83047f578486dce8f5995d3c728b99d3

Can you share the precise error

Error: datadog_timeboard.this: “graph.0.request”: required field is not set
Error: datadog_timeboard.this: “graph.0.title”: required field is not set
Error: datadog_timeboard.this: “graph.0.viz”: required field is not set

This sounds familiar. @Andriy Knysh (Cloud Posse) I think ran into this in one of our other modules, but I don’t remember which one

Have you tried not using a local?

For the data structure

Have you tried removing the brackets here:

Then it does work, but that wouldn’t work for my module ..

graph = ["${local.not_working_graph}"]

let me try

the thing is, i’m passing a list of maps there normally, not just one, so it’s an actual list..

but let me try just a single one.

Actually, I misread your local

Thought it was already in a list

What if you put the local in a list

I don’t have any ideas other than to try all kinds of permutations of what you are attempting to do

(On my phone)

first attempt: “datadog_timeboard.not_working: graph: should be a list”
the datadog_timeboard can have multiple graph { } blocks, so it must be a list.
haha, thanks, i’ll try the other option

2nd option, same problem as initial error. When outputted I have this:
not_working = {
request = [map[style:map[type:solid width:normal palette:dog_classic] q:avg:aws.applicationelb.target_response_time.p95{targetgroup:targetgroup/qa-web-backend-web/123} aggregator:avg type:line]]
title = not_working
viz = timeseries
}
working = {
request = [map[q:avg:aws.applicationelb.target_response_time.p95{targetgroup:targetgroup/qa-web-backend-web/123} aggregator:avg type:line style:map[palette:dog_classic type:solid width:normal]]]
title = working
viz = timeseries
}
I’ll wait for the new terraform I think.
2018-07-14

@Erik Osterman (Cloud Posse) I think there is some issue with git::https://github.com/cloudposse/terraform-aws-rds-cluster.git?ref=master

What’s the problem?

I have created
module "rds_mysql" {
  source             = "git::https://github.com/cloudposse/terraform-aws-rds-cluster.git?ref=master"
  engine             = "aurora-mysql"
  cluster_size       = "${var.MYSQL_CLUSTER_SIZE}"
  cluster_family     = "aurora-mysql5.7"
  namespace          = "${var.namespace}"
  stage              = "${var.stage}"
  name               = "${var.MYSQL_DB_NAME}"
  admin_user         = "${var.MYSQL_ADMIN_NAME}"
  admin_password     = "${var.MYSQL_ADMIN_PASSWORD}"
  db_name            = "${var.MYSQL_DB_NAME}"
  instance_type      = "${var.MYSQL_INSTANCE_TYPE}"
  vpc_id             = "${module.vpc.vpc_id}"
  availability_zones = ["us-west-2b", "us-west-2c"]
  security_groups    = ["${aws_security_group.store_pv.id}"]
  subnets            = ["${module.subnets.private_subnet_ids}"]
  zone_id            = "${var.zone_id}"

  cluster_parameters = [
    {
      name  = "character_set_client"
      value = "utf8"
    },
    {
      name  = "character_set_connection"
      value = "utf8"
    },
    {
      name  = "character_set_database"
      value = "utf8"
    },
    {
      name  = "character_set_results"
      value = "utf8"
    },
    {
      name  = "character_set_server"
      value = "utf8"
    },
    {
      name         = "lower_case_table_names"
      value        = "1"
      apply_method = "pending-reboot"
    },
    {
      name         = "skip-character-set-client-handshake"
      value        = "1"
      apply_method = "pending-reboot"
    },
  ]
}

but if I run terraform apply a 2nd time, it recreates the instance

-/+ module.rds_mysql.aws_rds_cluster.default (new resource required)
id: "niki-dev-commerce" => <computed> (forces new resource)
apply_immediately: "true" => "true"
availability_zones.#: "3" => "2" (forces new resource)
availability_zones.2050015877: "us-west-2c" => "us-west-2c"
availability_zones.221770259: "us-west-2b" => "us-west-2b"
availability_zones.2487133097: "us-west-2a" => "" (forces new resource)
backup_retention_period: "5" => "5"
cluster_identifier: "niki-dev-commerce" => "niki-dev-commerce"
cluster_identifier_prefix: "" => <computed>
cluster_members.#: "1" => <computed>
cluster_resource_id: "cluster-PA4BVKHSGWXDI7RT72RN2JGEZQ" => <computed>
database_name: "commerce" => "commerce"
db_cluster_parameter_group_name: "niki-dev-commerce" => "niki-dev-commerce"
db_subnet_group_name: "niki-dev-commerce" => "niki-dev-commerce"
endpoint: "niki-dev-commerce.cluster-cgxpu4rhgni7.us-west-2.rds.amazonaws.com" => <computed>
engine: "aurora-mysql" => "aurora-mysql"
engine_version: "5.7.12" => <computed>
final_snapshot_identifier: "niki-dev-commerce" => "niki-dev-commerce"
hosted_zone_id: "Z1PVIF0B656C1W" => <computed>
iam_database_authentication_enabled: "false" => "false"
kms_key_id: "" => <computed>
master_password: <sensitive> => <sensitive> (attribute changed)
master_username: "root" => "root"
port: "3306" => <computed>
preferred_backup_window: "07:00-09:00" => "07:00-09:00"
preferred_maintenance_window: "wed:03:00-wed:04:00" => "wed:03:00-wed:04:00"
reader_endpoint: "niki-dev-commerce.cluster-ro-cgxpu4rhgni7.us-west-2.rds.amazonaws.com" => <computed>
skip_final_snapshot: "true" => "true"
storage_encrypted: "false" => "false"
tags.%: "3" => "3"
tags.Name: "niki-dev-commerce" => "niki-dev-commerce"
tags.Namespace: "niki" => "niki"
tags.Stage: "dev" => "dev"
vpc_security_group_ids.#: "1" => "1"
vpc_security_group_ids.1052271664: "sg-0774db77" => "sg-0774db77"

Probably something making it not idempotent

I cannot look at it now though - on my way out

The problem looks like your AZ map is not static

no problem, initially i thought it could be due to you calculating azs or subnet counts somewhere

Consider hardcoding it

Or at the very least sorting it

Already did that: `availability_zones = ["us-west-2b", "us-west-2c"]`

Hrmm I see

That is the line of investigation I would pursue

We used this module for multiple engagements

Probably a regression caused by a newer version of terraform

Show me your subnet invocation

module "subnets" {
source = "git::<https://github.com/cloudposse/terraform-aws-dynamic-subnets.git?ref=master>"
availability_zones = ["us-west-2b", "us-west-2c"]
namespace = "${var.namespace}"
stage = "${var.stage}"
name = "${local.name}"
region = "${var.kops_region}"
vpc_id = "${module.vpc.vpc_id}"
igw_id = "${module.vpc.igw_id}"
cidr_block = "${module.vpc.vpc_cidr_block}"
nat_gateway_enabled = "true"
}

hardcoded here as well

Hrmmm yea was going to be my other suggestion

You can try upgrading / downgrading the AWS provider

Hrmmm can you check the status of your 2a az?

AWS takes zones out of commission

Though unlikely in us-west

Also, not all services are available in all zones

Try a different az selection and see if it makes a difference

Also try reducing to just 2, for example

And don’t include the 2a

okay

But the weird thing is it’s saying you are going from 3 => 2

As an outsider, it looks like you previously provisioned the cluster in 3 az and now want to shrink it

That will destroy the cluster

Terraform is not a good tool for that kind of automation

it never actually happened and i verified that

Hrm odd indeed

i have actually destroyed the module and recreated it also

and the aws console shows 2 azs only

this looks more like an issue in terraform

Ya…

@mcrowe are you using the RDS cluster module?
2018-07-15

For my info, what is the reason to both specify vpc subnets and ec2 availability zones ?

If you leave out availability zones it will work out most likely. The subnet group defines the azs.

availability_zones - (Optional) A list of EC2 Availability Zones that instances in the DB cluster can be created in

Yea that’s a good suggestion @maarten

@rohit.verma

Has there been a discussion about whether having security groups defined inside modules alongside the resource of purpose is OK or not, like with the rds module? I personally dislike it a lot, as I used to create modules which did exactly that. It makes migrations extremely complex in some cases. Next to that, the AWS/Terraform security group implementation is poor enough on its own, so it’s something I’m extremely careful with. Having vpc_security_group_ids as a list variable for a module is a lot simpler and safer I think.

@rohit.verma Can you show us the output of terraform plan?

@maarten yes/no, but to your point, this module does not do it correctly

@Erik Osterman (Cloud Posse) uploaded a file: image.png

is bad practice. We should use security group rules for stability/interoperability with other modules

i think it’s ok, so long as the module returns the security group, so that other modules or consumers can add rules.

@Erik Osterman (Cloud Posse) If you define a SG with inline rules, it is very problematic to add additional rules using security group rules. We ran into this and removed all inline rules to support the intra-module flexibility

From https://www.terraform.io/docs/providers/aws/r/security_group.html
At this time you cannot use a Security Group with in-line rules in conjunction with any Security Group Rule resources. Doing so will cause a conflict of rule settings and will overwrite rules.
Provides a security group resource.

yes, 100%

we need to remove these inline rules
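
i.e. moving from inline ingress/egress blocks to standalone rule resources - a sketch, not the module’s actual code:

resource "aws_security_group" "default" {
  name   = "${var.name}"
  vpc_id = "${var.vpc_id}"
}

# standalone rules can coexist with rules added by other modules
resource "aws_security_group_rule" "ingress" {
  type                     = "ingress"
  from_port                = 3306
  to_port                  = 3306
  protocol                 = "tcp"
  security_group_id        = "${aws_security_group.default.id}"
  source_security_group_id = "${var.allowed_security_group_id}"
}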

but also to @maarten point, i think we need to add the option of moving the SG outside of the module too

we had this problem at gladly too and it complicated migrations.

added issues


what: consider removing the security group from the resource in the module, or making it an optional parameter. why: complicates interoperability with other modules. reported by: @maartenvanderhoef

i am open to discussion around this. @Igor Rodionov @jamie @mcrowe and @Andriy Knysh (Cloud Posse) probably have more thoughts

let’s decide on something.

@Erik Osterman (Cloud Posse) what module is that?

terraform-aws-rds-cluster - Terraform module to provision an RDS Aurora cluster for MySQL or Postgres

@maarten @Erik Osterman (Cloud Posse) and @mcrowe, thanks for your suggestions. For the time being i have actually added all 3 availability zones; anyways, as mentioned, the instances are created in the provided subnets only, so it doesn’t impact anything
2018-07-17

hi team, has anyone tried kube2iam with eks and got it working?

Use Kiam

kiam has master agent configuration

Crap

i don’t understand how to schedule the master

eks has no master nodes

You might need to have a pseudo master tier

On phone

Kube2iam has a lot of serious issues

this is a bit concerning, is there something else we should be using?

yes, kiam will work out-of-the-box for you guys using our helmfile

i can show @Max Moon

This still needs to be discussed, I’ll gladly take a look at it on Monday when I’m back in the country

If you could put together a list of these serious issues I can take a look at before then, that would be great.

Geodesic is the fastest way to get up and running with a rock solid, production grade cloud platform built on strictly Open Source tools. https://docs.cloudposse.com/geodesic/

Thanks, I will take a look at that. Do you have a list of github issues about kube2iam?

Sec


Kiam bridges Kubernetes’ Pods with Amazon’s Identity and Access Management (IAM). It makes it easy to assign short-lived AWS security…

FWIW, gladly and PeerStreet have both had issues

And Joany

Gladly is evaluating Kiam

It has its own set of issues :-)

But has a dedicated following and a channel now in the official kube Slack team

sorry, didn’t get that - what is a pseudo master tier?

Any interest to collaborate on our EKS modules?

sure i will

I am on my phone - hard to type :-)

okay, no problem

Bagel in other hand

ah ha

take your time

also to support eks, you have to make changes to your subnet module

i will send pr, small change

Ok many thanks

just when you get time, tell me about the pseudo master,

if i promote any worker to act as master, i can’t run anything on it which needs role assumption

Yes, want to review the EKS modules I am working on with you
2018-07-18

@rohit.verma are you around?

i can share now

(or later today - ping me)

Just dropped a PR for elasticache-redis, need to be able to pass the encrypt at rest / enable TLS flags. Should be backwards compatible with previous releases (e.g. defaults to false on both). No idea if this is how you prefer I contribute, but let me know: https://github.com/cloudposse/terraform-aws-elasticache-redis/pull/15

thanks @jonathan.olson

@evan @Max Moon @Daren @chris might be interested in this enhancement for encryption at rest and TLS

@dave.yu also a heads up, we might need to do this: https://github.com/cloudposse/geodesic/issues/180
what: Set memory and CPU limits for Kiam. why: Kiam may have a memory leak. references: uswitch/kiam#72, uswitch/kiam#125

(reported by @Daren)

they are seeing memory leaks in kiam

(and possibly some excess network traffic)
2018-07-19

We just released Cloud Posse reference architectures:
https://github.com/cloudposse/terraform-root-modules - Collection of Terraform root module invocations for provisioning reference architectures
https://github.com/cloudposse/root.cloudposse.co - Terraform Reference Architecture of a Geodesic Module for a Parent (“Root”) Organization in AWS
https://github.com/cloudposse/prod.cloudposse.co - Terraform Reference Architecture of a Geodesic Module for a Production Organization in AWS
https://github.com/cloudposse/staging.cloudposse.co - Terraform Reference Architecture of a Geodesic Module for a Staging Organization in AWS
https://github.com/cloudposse/dev.cloudposse.co - Terraform Reference Architecture of a Geodesic Module for a Development Sandbox Organization in AWS
https://github.com/cloudposse/audit.cloudposse.co - Terraform Reference Architecture of a Geodesic Module for an Audit Logs Organization in AWS
https://github.com/cloudposse/testing.cloudposse.co - Terraform Reference Architecture that implements a Geodesic Module for an Automated Testing Organization in AWS
They show how we provision AWS accounts and what Terraform modules we use. The complete description is here: https://docs.cloudposse.com/reference-architectures
Thanks everybody for your contributions. We will be improving the repos and the docs. Your input and PRs are very welcome.

Awesome team, thanks for open sourcing


Hi Team, I’m writing a naming/tagging strategy document. I would love another set of eyes on it and suggestions. As far as I can, I’m using the cloudposse terraform naming convention, but I needed to extend it to cover a larger number of scenarios.


Sorry, just fixed the access

try and click again now

Permission to comment in the doc is enabled.

wow, great document

very detailed

I think you have everything covered there

Hah, well, thats kind of you. But critical feedback will help me improve it

nice doc @jamie

here is some feedback

1) we usually use namespace as a company name or abbreviation so we can distinguish the company’s resources from other companies’ resources (but it’s your choice to expand its usage as you described in the doc)
2) the third part in our naming (label) pattern is name, not role. The difference is what it does vs. what it is. Those could be just small differences, but we usually try not to use resource types in resource names, e.g. we don’t use cp-prod-subnet-xxx-yyy-zzz but instead cp-prod-app1-xxx-yyy-zzz to name a subnet. But I think your doc has described both cases
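For reference, here is roughly how that pattern maps onto the label module (the ref and all values below are illustrative, not taken from the doc being discussed):

module "label" {
  source     = "git::https://github.com/cloudposse/terraform-null-label.git?ref=master"
  namespace  = "cp"
  stage      = "prod"
  name       = "app1"
  attributes = ["public"]
}

# module.label.id would render as "cp-prod-app1-public"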

1) As it is a single company, but with 6 different divisions, i was wanting to have them use it for that instead

2) Role is a suggested tag from AWS for categorisation, and i wanted to have them group resources as needed by the role. As in role: frontend or role: cmsstack

the only issue with that (and why we use a company name as the namespace) is naming global AWS resources like S3 buckets. Could be naming collisions if not used properly


I wanted to disambiguate the Name input from the Name output (id) that the label module does, but as a convention document.

Which is why i changed from name to role.

Well, I can tell you from years of field experience that most people don’t even have the document for the tags, but plenty of tags they want to use

So you end up with a giant cluster f*$! of mismatched tags with very little direction, and plenty of weird dependencies and gotchas baked into the system.

So, touch something and watch the dumpster start burning

Having a document like that to start with is a solid way of moving, good footing from the start is a powerful position for sure

what it is vs what it does is a very valid point.

Haha, well that's good news @krogebry - its for a transformation project one of my clients have. They have a monolith of aws EC2 instances, and are moving to serverless and docker microservices. And they have hired new teams and such, so before the big work starts on the transform i’m wanting to get a few standards in place for consistency

nice

yeah, that’s solid, I’ve done that work to transform, doing that work now

if you use role as a name, it should be role from the business point of view, not the resource type's point of view. Although in some cases it’s difficult to assign names (yes, naming is hard)

i'm wanting to get a few standards in place for consistency
- very valid point @jamie

What's that saying you have about naming?

There are only two hard things in Computer Science: cache invalidation and naming things – Phil Karlton (bonus variations on the page)

Love that

I may put that as a footnote.

Jamie this is awesome. I will take a closer look later today. At first glance love the selection of tags.

Access controls via tags is only sometimes supported

Ha, thanks Erik. Yeah, I have the list

Maybe add a note in regards to that. It should be used as a last resort IMO

For stage segregation, tags are not well suited since there are resources which do not even support tags

I’m gonna do a policy that means anyone in the ‘developer’ group who wants to create new resources from that list must apply tags at creation, as well as to ‘start’ a resource.

That way the greenfields stuff has to have tags, even if they are not awesome.
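As a sketch of what such a policy could look like in Terraform (the policy name and the Namespace tag key are made up, not from the doc); it denies ec2:RunInstances unless the tag is supplied at creation:

resource "aws_iam_policy" "require_tags_on_create" {
  name = "require-tags-on-create" # illustrative name

  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyRunInstancesWithoutNamespaceTag",
      "Effect": "Deny",
      "Action": "ec2:RunInstances",
      "Resource": "arn:aws:ec2:*:*:instance/*",
      "Condition": {
        "Null": {
          "aws:RequestTag/Namespace": "true"
        }
      }
    }
  ]
}
EOF
}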

That is cool. Enforcing tags at some level would be a nice account level module!

Maybe i can see if I can just make a cloudtrail metric alert that looks for missing tags on creation, and notifies slack, or a dashboard via sns

Btw love the font in your doc

Oh thanks

More of a ‘soft’ way to enforce it

and implement the hard enforcement if required.

Can I offer an idea on that front?

I just did something with the tags on this client wrt missing tags

It’s an idea that Intuit implemented, but geared for tags.

Sure!

Alright, so Intuit started this with an “up and to the right” progress over perfection initiative with regards to the overall security of any given account that ran a service ( mobile stuff, mint, etc… ). I’ve implemented the same idea with various different things including pager duty noise levels.

Start with a simple code base (lambda could work, or ruby+docker+jenkins, whatever), analyze tags, then create a grading metric. So, A for >90% compliance, B for >80% and up to 90%, C, D, F etc.

Or just go with the numeric, but i think there’s almost a cognitive hook on the grades.

I usually end up representing these as graphs in Jira

I was thinking I would use https://www.terraform.io/docs/providers/aws/r/config_config_rule.html
Provides an AWS Config Rule.

A config rule, to do the checks
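For reference, a minimal sketch using the AWS-managed REQUIRED_TAGS rule (tag keys and resource types here are illustrative; it also assumes an AWS Config recorder is already running in the account):

resource "aws_config_config_rule" "required_tags" {
  name = "required-tags"

  source {
    owner             = "AWS"
    source_identifier = "REQUIRED_TAGS"
  }

  # tag keys to require; illustrative values
  input_parameters = <<EOF
{
  "tag1Key": "Namespace",
  "tag2Key": "Stage"
}
EOF

  scope {
    compliance_resource_types = ["AWS::EC2::Instance"]
  }
}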

@krogebry So this is in terms of bringing an account into compliance?

@krogebry uploaded a file: Screen Shot 2018-07-19 at 11.13.49 AM.png

Yeah, so forcing compliance from resource creation can work, but does get in the way.

However, a stance of “progress over perfection” usually works better in the transformation world of things.

Like, the reality is that you’re probably not going to get people to conform to tags right away, and in some cases you can’t enforce things because enforcing the standard might actually break things.

So the idea is the Riot sort of mindset where we’re just trying to move things up and to the right, or progress over time.

Usually more effective when dealing with people who are maybe a little timid around big changes.

</rant cents=”2”>

@krogebry agree. But how do you do it from the automation point of view? E.g. we create and provision TF resources without tags (or with some tags), and then use the process to validate the tags and then update all the resources from TF?

@Andriy Knysh (Cloud Posse) with aws config rules you can have it provide a compliance report based on your terraform resources https://docs.aws.amazon.com/config/latest/developerguide/evaluate-config_manage-rules.html
Use the AWS Config console, AWS CLI, and AWS Config API to view, update, and delete your AWS Config rules.

nice

So instead of hard enforcing it, it can just be a dashboard

Okay, I can tell you a tale of woe as to how not to do it

and it's something that could be a TF module easily

Yeah, so you’re on the right track, basically you have two good options: either enforce it from the creation of things, or enforce it later with some kind of dashboard/report’y thingie

If you enforce at creation the only real risk is the timing with things, so in many cases the tag creation is a secondary action after the resource is created

So you just have to be aware of that, if you do something like “kill instance if tags are missing” with an ASG, you’re going to have a bad time because of the timing stuff.

yea, that’s why naming is hard

It would be neat to have some kind of way to have TF actually enforce the naming conventions with if conditionals

Might be an option in 0.12?

It's an option now

using my nifty hack

ohh, with config rules?

so I think it’s good to combine the two strategies: 1) enforce some rules at creation; 2) check it after creation and alert/dashboard

yeah, okay, I can see how that would be pretty awesome with config rules

variable "failure_text" {
default = "The values didn't have the values needed"
}
variable "conditional" { default = true }
resource "null_resource" "ASSERTION_TEST_FAILED" {
count = "${var.conditional ? 1 : 0}"
"${var.failure_text}" = true
}

im curious, is this pseudo code or you actually have this working?

It’s working

It works great. I wrote an article on it :)


And it gets referenced here https://github.com/hashicorp/terraform/issues/2847
It would be nice to assert conditions on values, extending the schema validation idea to the actual config language. This could probably be limited to variables, but even standalone assertion state…

That's the entire assert module I have

you would use it like
module "assert_name_length" {
source = "thatmodulepath"
failure_text = "Your name is too long"
conditional = "${length(var.name) < 63}"
}

nice hack @jamie

Although @Andriy Knysh (Cloud Posse) actually won’t touch it, because he hates hacks

So its just for my own compliance checks for now

for a few reasons :

understandable

breaking changes and all

- As a user of the module, you instantiate the module in TF and you add/change the assertion code yourself - so you're effectively policing yourself

- Little bit difficult to read

But that module has worked since at least version 0.9

I added it because I had a client just filling in tf vars files

So they wouldn't have tf access, and it would go through a pipeline

but i agree something like that is needed

so, i was using it to cause a TF error and message when their vars were flakey


for 0.12

Error: Error running plan: 2 error(s) occurred:
* module.elasticache_redis.module.dns.var.records: Resource 'aws_elasticache_replication_group.default' does not have attribute 'primary_endpoint_address' for variable 'aws_elasticache_replication_group.default.*.primary_endpoint_address'
* module.elasticache_redis.output.id: Resource 'aws_elasticache_replication_group.default' does not have attribute 'id' for variable 'aws_elasticache_replication_group.default.*.id'

@Daren have you seen this with yours/our TF module for redis

Yes

We are using auth_tokens and if the token is not valid for AWS this happens

ok, i think that could be related to our issue

We used:

resource "random_string" "redis_auth_token" {
length = 16
override_special = "!&#$^<>-"
}

@jamie
I have grown to dislike “terraform_remote_state” data provider
I love the data provider. In directories (origin) where another directory (client) is expected to read the state, I create a lib/remote-state.tf in the origin directory. The (client) directory has a symlink to the lib/remote-state.tf file, and leaves all the implementation details up to the (origin).
example: https://github.com/tamsky/terrabase/blob/master/aws-blueprints/core/lib/remote-state.tf

Its all we have but my list of gripes are:
- It hard binds one tf template to another: I.e. even using variables to choose a state to use at first run, you can’t search for a state to use, or select states by relative paths. Because the s3 bucket has to be unique.
- You have to have an agreed naming structure to use a state, and there are no standards, so each setup will be different.
- It does have default values, but no way to provide a wildcard default, so that you can query for a value that doesn’t exist. I.e. If you are using version 1.0.1 of a tf template that has an output string “alb_frontend_sg” but in version 1.0.2 you have changed it to a list “alb_frontend_sgs”. And your remote state is changed to query for “alb_frontend_sgs”, if you need to roll back to version 1.0.1. You would get an error looking for “alb_frontend_sgs”. While it is good practice to error, it doesn’t allow you to create terraform level exception handling. Such as querying for both variables, and outputting the value that isn’t an empty string.

if you could get the outputs of terraform_remote_state as a map, and do a lookup, it would help that last part on unknown outputs at the code level. And you can work around that last part if you have easy access to the terraform_remote_state data provider, as you could add both alb_frontend_sgs and alb_frontend_sg as default values.
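For reference, a minimal lib/remote-state.tf along the lines described above might look like this (the bucket, key, and output names are made up):

data "terraform_remote_state" "core" {
  backend = "s3"

  config {
    bucket = "example-terraform-state" # assumed bucket name
    key    = "core/terraform.tfstate"
    region = "us-east-1"
  }
}

# a client directory that symlinks this file can then read outputs, e.g.:
# vpc_id = "${data.terraform_remote_state.core.vpc_id}"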

that’s nice @tamsky
2018-07-20

Hey everybody! Proud to join the SweetOps crowd… this is a great initiative, seems like a real missing piece.

I have a quick q regarding the reference architectures, which operate under root.company.com, staging.company.com and prod.company.com etc. Where/how, in this setup, would I set up top-level DNS mappings, e.g. CNAME record referencing company.com -> prod.company.com, app.company.com -> app.prod.company.com etc? Would this be in the root module?

@Sebastian Nemeth I can help answer in a few hours.

Great! Thanks very much. I’m trying to get rolling with geodesic and some basic stuff as we speak.

also, let’s move the conversation to #geodesic so it’s not lost in the noise

btw, let’s start using #geodesic so we can concentrate knowledge

@Andriy Knysh (Cloud Posse) might be around to answer some questions related to this.

Have you seen this? https://docs.cloudposse.com/reference-architectures/cold-start/

Sorry for the messed up css

We are still refining that doc to make it more clear

I’m following it now, but it’s taking some cross-referencing. e.g. it tells me to install aws-vault, but then I read that it’s included in geodesic, so trying to figure out best way to use geodesic and where to keep the credentials store.

Great feedback. Agree. Credentials will be stored in an encrypted keychain file

Inevitably you will want to use AWS vault natively as well. E.g. docker compose

But you don’t need to install it to get started, so we should not make that a step so early on

Actually, @Andriy Knysh (Cloud Posse) and I were talking about just this yesterday.

I don’t mind installing aws-vault, but if it's already installed in geodesic maybe it’s better to just run geodesic while mounting a local volume to store credentials for our developers.

@tamsky thanks for the bug report on https://github.com/cloudposse/terraform-aws-alb-target-group-cloudwatch-sns-alarms! Just addressed the arn_suffix issue with a PR - just waiting on an approval and should be merged in…
Also just a heads up that I found and fixed an edge case here, would love a quick review and any comments/suggestions on the solution: https://github.com/cloudposse/terraform-aws-alb-target-group-cloudwatch-sns-alarms/issues/8
terraform-aws-alb-target-group-cloudwatch-sns-alarms - Terraform module to create CloudWatch Alarms on ALB Target level metrics.
what AWS CloudWatch Alarm Thresholds should be assumed to contain floats when used in conditionals. why Fix the following case: CloudWatch thresholds are specified in seconds, which results in spec…


whoops sorry Jamie - just committed this too: https://github.com/cloudposse/terraform-aws-alb-target-group-cloudwatch-sns-alarms/pull/9/commits/dfdb45d0a293cb745a6d40e153e2b3a0cf991578
what Wrap thresholds with a floor to convert to int before comparing to 0. why Fixes #8 and #6

didn’t think it was worth its own PR for that since it is just cosmetic

maybe target_response_time_alarm_enabled = "${floor(var.target_response_time_threshold) < 0 ? 0 : 1 * local.enabled}"

should be the one without a floor

ah yea it's confusing

Hi there! I am evaluating using geodesic for my organization and curious to discuss some kubernetes cluster related patterns.

Sure thing! You can direct message me or shoot me an email ([email protected])
Our current kubernetes architecture is captured in this kops manifest:
https://github.com/cloudposse/geodesic/blob/master/rootfs/templates/kops/default.yaml
Geodesic is the fastest way to get up and running with a rock solid, production grade cloud platform built on strictly Open Source tools. https://docs.cloudposse.com/geodesic/

also, let’s move the conversation to #geodesic so it’s not lost in the noise

since it is actually a float

@Yoann hi, welcome

i think you mean the other way around right @jamie since response time is the one that is a float?

so only wrap floor() for var.target_response_time_threshold so it converts that to an int and can compare to 0 safely

Lets try it in terraform console

where var.target_response_time_threshold is 0.5

> floor(0.5) < 0
false
> 0.5 < 0
false
>

So it evaluates fine

hmm

ah right

string

try “0.5” < 0

> floor(0.5)
0

> floor("0.5") < 0
false

> "0.5" < 0
__builtin_StringToInt: strconv.ParseInt: parsing "0.5": invalid syntax in:
${"0.5" < 0}

> floor("1") < 0
false
> floor("-1") < 0
true

but when you do the interpolation

it shouldn’t be a string

that's the exact error you get currently in latest stable when trying to set response time to 0.X: __builtin_StringToInt: strconv.ParseInt: parsing “0.5”: invalid syntax in:

ah

well then…. floor it is!

haha

Although this should change in 0.12

i wrapped the others to be safe - but obviously we should not expect a count to be a float

What if you took the quotes off the variable?

defaults


nah.. don't worry about that

just floor it


i agree with erik there too - < 0.12 this is the most sane way to work with terraform i.e. strings, maps, lists

Yeah

I have a really large stack of metric alarm notes

For things like, containers per instance alarm

and ‘lambda max concurrent execution alarms’

and ‘rds database level custom metrics -> alarms’

Which im looking forward to adding to our arsenal

Oh and billing alerts

i copied some already for the module we were talking about

def good stuff - thanks for sharing it!

haha

I hope you plastered my face across the footer

of course


@jamie if you got a sec: https://github.com/cloudposse/terraform-aws-alb-target-group-cloudwatch-sns-alarms/pull/7 - CR please and I’ll owe you a
what Fix the dimensions by removing unnecessary join in interpolation why Fixes #4 Terraform will perform the following actions: ~ module.data_model_web_app.module.ecs_codepipeline.aws_codepipe…

or a CR whichever

Checked

Verified

lgtm


ty ty

Just a heads up - released 0.4.0 for https://github.com/cloudposse/terraform-aws-alb-target-group-cloudwatch-sns-alarms. This fixes a few reported issues: https://github.com/cloudposse/terraform-aws-alb-target-group-cloudwatch-sns-alarms/issues/8 https://github.com/cloudposse/terraform-aws-alb-target-group-cloudwatch-sns-alarms/issues/4 https://github.com/cloudposse/terraform-aws-alb-target-group-cloudwatch-sns-alarms/issues/6
Thanks to @tamsky, @Erik Osterman (Cloud Posse), @jamie!
terraform-aws-alb-target-group-cloudwatch-sns-alarms - Terraform module to create CloudWatch Alarms on ALB Target level metrics.
what AWS CloudWatch Alarm Thresholds should be assumed to contain floats when used in conditionals. why Fix the following case: CloudWatch thresholds are specified in seconds, which results in spec…
currently at https://github.com/cloudposse/terraform-aws-alb-target-group-cloudwatch-sns-alarms/blob/master/alarms.tf#L10-L11 we have: "TargetGroup" = "${join("/", list(&qu…
what what we’re mixing snake case with camel case why bad convention

thanks for the quick fix! like the use of floor().

Issue 6 was mixing case because the cloudwatch metric names are in camel case

So that there was a 1:1 for name recognition

Originally

Since we have to use modules with terraform and really don’t have a desire to buy TF Enterprise just to get a “registry”, I was thinking of writing a little utility that will:
• parse through your TF project
• find any modules, then grab the source repo
• compare what version you have to the latest available in the source repo and then output if there is a new revision/tag compared to the one being used


I think @mcrowe wrote something similar

would like something like this though


hmm nice i’d love to see it - i might just write one in my free time anyway so i have an excuse to write a go utility with cobra

yea, go would be my pref!

this example is a bash script


ah cool!

i also think it would be a popular utility for the community

yea i feel like it can benefit from concurrency here - but of course already thinking about github api limits

there’s some way also to convert HCL to json

also, what’s nice with mike’s script is it can generate the composite outputs from all module deps

@Erik Osterman (Cloud Posse) uploaded a file: image.png

@sarkis you can start with something like this https://github.com/kvz/json2hcl. It shows what terraform packages to use for parsing, or you can convert hcl to json which is easier to analyze
json2hcl - Convert JSON to HCL, and vice versa

hey all, just want to invite those interested to #geodesic and #terraform

it might help organize information and get questions better answered if we distribute the traffic

set the channel topic: Cloud Posse Open Source Community #geodesic #terraform #random #releases #docs

2018-07-23

Hey guys, can you give me some idea why the term ‘stage’ was chosen over ‘environment’? In the label module?

@Erik Osterman (Cloud Posse) uploaded a file: Image from iOS

See #2

There are multiple stages

It’s where software performs

The term environment is also overloaded and often abbreviated as env

Which from my subjective experience is more confusing. Stage imo is misused inside many organizations and I guess I made it our personal mission to correct its usage

However I could maybe consider adding environment after stage as another (optional) dimension of disambiguation

I have used environment in all other projects as the term to encapsulate the resources for development vs production.

And I think it is quite a common use, but I do understand the ambiguity issue.

Also, the term environment can mean environment variables, and it's the old word for terraform workspaces as well.

My issue with the term stage is the implicit temporal nature it has. Like a stage in a pipeline or rocket is something that gets used and is destroyed.

But environment describes what surrounds something, and doesn’t imply any permanency, or lack of.

So for describing a split between production and preproduction, where the application is exactly the same build asset but the configuration and attached resources are different, I align more with environment.

It feels like environment is something that should be encapsulated in a module

Thus the environment is baptized with name

Then resources with in that are disambiguated with attributes to that environment name

Couldn’t it be argued that the root level module invocation is the environment?

I align more with environment for the same reasons Jamie has mentioned. I’m also not a fan of env as an abbreviation.

What about adding more optional fields to the label module

Perhaps along the lines of Jamie’s document on canonical tag names

If not passed, they are not concatenated

Stage and environment can be adjacent, that way the caller can use what it’s most natural to their organization

I think this would satisfy both requirements. Thinking environment would be concatenated after stage.
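A hypothetical invocation once that lands (the environment input does not exist yet at this point; the ref and all values are illustrative):

module "label" {
  source      = "git::https://github.com/cloudposse/terraform-null-label.git?ref=master"
  namespace   = "cp"
  stage       = "build"
  environment = "uat" # proposed optional input; omitted => not concatenated
  name        = "app"
}

# per the ordering proposed above, id might render as "cp-build-uat-app"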

It’s a non-breaking change so I think it’s a good idea

Stage can then be specific to things like Source, Build, Test, Tag, Release as well.

In any case it will add flexibility to the module that allows me to use it in different clients

agreed - and we want to support other use-cases too so lets do it

Terraform Module to define a consistent naming convention by (namespace, stage, name, [attributes])

see that?

what Support passing a label's context between label modules why DRY demo module "label1" { source = "../../" namespace = "Namespace" stage = &…

I’ll update it

So what’s the name for the env between dev and prod?

that might be staging. Note that staging is not the same word as stage.

@Erik Osterman (Cloud Posse) uploaded a file: image.png

could be QA, UAT, preproduction

so “dev” is a stage

production is a stage

it’s a stage in a lifecycle


A multistage rocket, or step rocket, is a launch vehicle that uses two or more rocket stages, each of which contains its own engines and propellant. By jettisoning stages when they run out of propellant, the mass of the remaining rocket is decreased. Known in rocketry circles as staging, this process is repeated until the desired final velocity is achieved.

stage 1, stage 2, …. stage n.

I think I like UAT

It’s descriptive as to what it actually does

“Stage” is like you say, it’s a part of the rocket, but it doesn’t actually define what it’s doing

I like y’all in this group, really makes me think about the important questions

2018-07-24


Suggestions on additional sections, or technologies to cover, or core concepts are very welcome!

Can i just paste questions I’ve asked recently beneath it and not care about formatting?

Yes! You can also use the comment feature

Less meta; it's hard to keep an interview < 1h and to also ask a few easy questions to make the candidate feel at ease.

Thank you very much!

I was also providing question answer formats because I am not the one hiring them

@jamie as always, very nice doc, thanks for initiating these conversations


Met with @tamsky yesterday. He also has lots of nice things to say about your docs.

I had to move the document for the client. If you want to access the document still, which you are welcome to, here it is: https://docs.google.com/document/d/1yO7qgVyfKwPpK6EzBv0w64TzDOAUfCnE_nCEYX4maJg/edit?usp=sharing

It will need you to request access though guys. @Erik Osterman (Cloud Posse) @Andriy Knysh (Cloud Posse) @tamsky

Curious @jamie– is there a LICENSE for your https://github.com/firmstep-public/trainingdayone ?
trainingdayone - Day One: Using the Terraform command, creating a resource, and managing Terraform state.


@tamsky in regards to trainingdayone’s license. Thanks for asking. Do you want to use some of it? Or add to it?

It would be the MPL 2.0 license. So that I would be notified of any improvements if there were any.

I may want to use it - just checking in – I like the order and how you introduce the concepts.

Firstly, if I post it in here, take it if you want. It’s the cost I pay for the quality feedback.

Secondly, if I can help you improve it let me know. If you’re interested let’s do a slack video or something to rough out improvements

It’s a good investment in time for me and anyone else. As it’s very reusable

Thank you @Andriy Knysh (Cloud Posse) :-)

@pmuller hi and welcome

i have exposed kubernetes dashboard with bitly oauth proxy as we do with cloudposse portal

can we pass the dashboard token

somewhere within bitly redirect etc..

or for that kind of login we have to use dex

thanks!

funny how this nice slack channel is mostly live… during the night

(UTC+8 here)

i discovered your github a few days ago, and i love it

i am learning so much thanks to you guys !

(Cloud Posse is based in Los Angeles, CA)

Awesome! Glad to hear you're getting some mileage out of it.

since when has all of this been on GH?

We’ve been publishing our modules for the past 2 years, but we only really started promoting them this year when we doubled down on our documentation, readme’s and community.

i thought i had a lot of not-that-nice patterns in my code base.. now i know for sure

thanks! would be happy to give you a tour of the complete open source ecosystem. we have A LOT so it can be hard to see the trees through the forest.

set the channel topic: Cloud Posse Open Source Community #geodesic #terraform #release-engineering #random #releases #docs
2018-07-25

Hey guys, pretty sweet collection of blocks. I’ve been swaying between the aws community modules on Terraform and yours. I have a quick question - terraform-aws-cicd <— I believe this module does not support multi-container beanstalk, is that correct? (using beanstalk for first time with new org.)

@tpagden welcome

correct, the module does not support multi-container beanstalk

you can open an issue for that and we’ll review

or use ECS instead


@Andriy Knysh (Cloud Posse) Cool, no problem at all. I was just confirming what I saw and that I didn’t miss anything. I’ll mull over some of the options

@Andriy Knysh (Cloud Posse) One more question - I know you all have avoided doing a wrapper approach (such as Terragrunt), which I’m inclined to avoid as well, however, do you all have a directory structure approach that you recommend? Like live/non-prod/{region}/application? If so, do you separate the application directories from infrastructure (like VPC)?

We use containers instead

So we package all of our terraform invocations in one repo. Then we use docker Multi Stage builds to copy them. Have a look at our reference architectures

The Dockerfile shows our strategy

You could say we deploy infrastructure as code the same way we deploy applications.

So we don’t have our apps broken out into production and staging folders. We have them containerized. Then we deploy those containers. We think infrastructure code should be treated the same way.

@Cristin we still recommend to separate all resources into at least two stages (dev and prod) and not mix anything between them

terraform-root-modules - Collection of Terraform root module invocations for provisioning reference architectures

staging.cloudposse.co - Example Terraform Reference Architecture for Geodesic Module Staging Organization in AWS.

This is how we deploy those modules for a staging account.

it’s a wrapper approach (such as Terragrunt) vs. a container + ENV vars approach. We use the second one

Yea that’s a succinct way of putting it. @Andriy Knysh (Cloud Posse) I should probably add that to the geodesic readme.

yep, naming is important, not only for TF resources

but also for patterns and approaches

next time somebody asks what approach we use, we can say container + ENV vars
(or something like that maybe with a better name)

I added a #jobs channel where people can post if they are looking for work or are hiring (FTE or contractors).
2018-07-26


Today we’re launching new ways to simplify your CI process, so you can use the tools you need to focus on the work that matters to your team.

Finally getting some of those awesome GitLab features.

2018-07-27

@Erik Osterman (Cloud Posse) I am working on updating the aws cloudfront module to allow me to disable alias creation. https://github.com/cloudposse/terraform-aws-cloudfront-cdn/issues/14. I want to add a boolean dns_aliases_enabled. If the value is false, no alias will be created. I see that DNS records are created using module “dns”. If DNS were created with a resource I could set count = “${var.dns_aliases_enabled}”. False is 0 so no record will be created. As far as I know I cannot pass “count” in to a module. The only option I see is to modify the module that creates DNS to take in a count parameter. I am wondering if you have any suggestions for me.
We are hosting 5 different sites in AWS. All of them are behind the same ALB. We want to put CDN in front of all these sites. When I was creating CDN using your module I specified aliases for all t…

Correct, first we need to update the dns module to add that flag

Then we can do this module.

It is great to be here. Thanks for creating these modules. I have a consulting company and one of our clients is using terraform. They told me that cloudposse writes excellent modules.

I will also add web_acl_id parameter to the cloud front module.

Thanks @rajcheval !!

Can we add the web acl id as a separate PR? Just so we can be more pedantic about how we introduce changes.

yes this makes sense. adding web acl id is an easier change and I will keep it separate.

I am looking at https://github.com/cloudposse/terraform-aws-route53-alias/blob/master/main.tf. It currently calculates the count based on the number of elements in the aliases array. I am wondering how I would pass in a count parameter and still keep the current module usable. Current users are not setting count and are relying on count being calculated from the number of elements in the array. Is there a way to not invoke the module that creates dns at all? There is a new beta for terraform and they are making a bunch of enhancements. Perhaps I need to see the language enhancements that may help us.
terraform-aws-route53-alias - Terraform Module to Define Vanity Host/Domain (e.g. brand.com) as an ALIAS record

@rajcheval hi, give me 1 min I’ll show you how to do it

first, it will work now w/o any modifications if you provide var.aliases as an empty list; count will be 0 and nothing will be created:

resource "aws_route53_record" "default" {
  count = "${length(compact(var.aliases))}"
  ...
}

if we want to introduce var.enabled to be more specific (as we have in other modules), we do this:

count = "${var.enabled == "true" ? length(compact(var.aliases)) : 0}"

variable "enabled" {
  type        = "string"
  default     = "true"
  description = "Set to false to prevent the module from creating any resources"
}

For https://github.com/cloudposse/terraform-aws-cloudfront-cdn, it will work now w/o any modifications if you specify an empty var.aliases:

module "dns" {
  source  = "git::https://github.com/cloudposse/terraform-aws-route53-alias.git?ref=tags/0.2.2"
  aliases = []
  ...
}

if we add var.enabled to route53-alias, then we can add var.dns_aliases_enabled to cloudfront-cdn and use it like this:

module "dns" {
  source           = "git::https://github.com/cloudposse/terraform-aws-route53-alias.git?ref=tags/0.2.2"
  enabled          = "${var.dns_aliases_enabled}"
  aliases          = "${var.aliases}"
  parent_zone_id   = "${var.parent_zone_id}"
  parent_zone_name = "${var.parent_zone_name}"
  target_dns_name  = "${aws_cloudfront_distribution.default.domain_name}"
  target_zone_id   = "${aws_cloudfront_distribution.default.hosted_zone_id}"
}

@rajcheval does it answer your questions?

@Andriy Knysh (Cloud Posse) passing in empty aliases is not an option because cloudfront distribution still needs aliases. However your other suggestion will work. Thank you so much for taking the time to help me. I am going to learn a lot from you.

(sorry, yes var.aliases is needed in any case, so we need to modify the modules to add enabled and dns_aliases_enabled)

@rajcheval after you make modifications and before opening a PR, run these three commands to regenerate the README:

make init
make readme/deps
make readme

(not terraform-aws-cloudfront-cdn yet, this one was not converted to the new README format yet. Just update README.md)

what Add README.yaml why Standardize README

@Andriy Knysh (Cloud Posse) please merge for me

Afk

100% have been updated

working on it now (it needs terraform fmt)

Just not all yet merged

@rajcheval we merged readme changes to master for https://github.com/cloudposse/terraform-aws-cloudfront-cdn
terraform-aws-cloudfront-cdn - Terraform Module that implements a CloudFront Distribution (CDN) for a custom origin.

if you open a PR, please run the three commands above to update README

@Andriy Knysh (Cloud Posse) I did run make commands to update readme. I have submitted PR’s for route53 and cloudfront. Once these are approved I will be making my final change on cloudfront resource to allow me to disable DNS record creation.

@rajcheval thanks, will review


Interviewing for a DevOps job? Here are some questions you’ll likely have to answer.

@jamie want to review ^ and maybe add to your doc?

sure do!

@rajcheval reviewed the PRs, look good, just a few comments
2018-07-30

I’ve been thinking lately that we should be training interview skills like we train anything else

“How do you adapt when things don’t go as planned?” This question has nothing to do with devops per se, but is probably 90% of the actual job.

my buddy Ian has a great series of Tech Interview Prep emails => https://www.facebook.com/technicalinterviewprep/
Technical Interview Prep by Email. 66 likes. Improve your technical interview chances by learning to think like an interviewer. Get new insight every day for 30+ days via Email.

Nice

I have a friend that I’m going to try to convince to start up a consulting company that would specialize in training for the interview process

@Andriy Knysh (Cloud Posse) Thanks for merging my changes. I have one more PR for cloudfront resource ready for review.

@rajcheval thanks for the PR, looks good, merged to master

@Erik Osterman (Cloud Posse) if you want we can go over the specs you guys want for https://github.com/cloudposse/terraform-aws-ec2-autoscale-group and i can try to jam it out this week in my spare time
terraform-aws-ec2-autoscale-group - Terraform module to provision an EC2 autoscale group

that would be awesome.

When’s good for you?

my general thoughts are that it should include:
- launch config
- autoscaling group
- basic security group (ingress + egress)
- min/max size vars
- enabled var
- volume size var
- vpc id
- user data script var (this would need to be a path using ${path.module} syntax in local module)
- elb enabled/disabled
- eip enabled/disabled
- dns record pointing to elb/eip?
- dns_zone_id var

maybe some basic security group ingress/egress rules for things like ssh/http/https

really any morning works for me

above is a broad stroke, but that’d be the general idea
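Purely as a sketch of what the eventual interface could look like (the module doesn't exist yet, so every input name and value below is a guess):

module "autoscale_group" {
  source        = "git::https://github.com/cloudposse/terraform-aws-ec2-autoscale-group.git?ref=master"
  namespace     = "cp"
  stage         = "dev"
  name          = "web"
  image_id      = "ami-12345678"
  instance_type = "t2.micro"
  vpc_id        = "vpc-12345678"
  min_size      = 1
  max_size      = 3
  enabled       = "true"

  # user data rendered from a file in the calling module
  user_data = "${file("${path.module}/user-data.sh")}"
}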

Cool, tomorrow I’ll be busy until ~11:30. But free after that.

cool, just DM me

Yea, let’s move this to an issue under that repo


I know @maarten and @jamie will probably have some valuable input.

My general thoughts are that it should include the following resources: launch config autoscaling group security group dns record iam instance profile iam role Variables: min/max size enabled volum…

Ok, I’ve commented on it with some additional resources

A lot of the work has already been done by @jamie - we just need to modularize it

I link to it in the GH issue

This is massively appreciated. We need this module as one of the building-blocks for us to rollout EKS support later, as well

Nice! Okay cool, I’ll take a look at it in a bit

Also, we can roll it out in phases, if you don’t want to bite off too much at once.

That is a good idea haha

2018-07-31

hi

I am in the process of migrating terraform modules to codecommit. However, as i already use codecommit on other AWS accounts, I cannot rely on ~/.ssh/config to define a default username; but I want to keep my terraform code generic. I do not want to put my IAM user ssh key id in all module statements, otherwise my coworkers and the CI won’t be able to use them. So I tried to use interpolation in the module source to optionally define a SSH username. I ended up reading a bunch of GH wont-fix issues, then found https://github.com/hashicorp/terraform/issues/15614 which tracks precisely what I would need. So, how can I handle my use case? Any suggestion?
(This was split out of #1439, to capture one of the use-cases discussed there.) Currently all of the information for requesting a module gets packed into the source string, which cannot be paramete…

i’ve had good luck using an iam role with codecommit, and the codecommit helper for aws cli

i just configure my aws config file with a profile that has perms to the codecommit repo, and then add the credential helper to my .gitconfig

the module source then looks like this:

source = "git::https://git-codecommit.REGION.amazonaws.com/v1/repos/REPO//PATH?ref=REF"

Provides detailed steps for setting up to connect to AWS CodeCommit repositories over HTTPS on Linux, macOS, or Unix, including setting up a credential helper.

I went the https way and it works fine. Thanks.

Awesome! If you use a mac, be aware that the system gitconfig uses keychain as a credential helper and will catch and store the temporary credential… Causes problems cuz keychain doesn’t know it’s temporary… I have our team remove the credential helper from their system config

Something like this, update for whatever region(s) and aws profiles you use:
git config --system --remove-section credential
git config --global --remove-section credential
git config --global --remove-section 'credential.https://git-codecommit.us-east-1.amazonaws.com'
git config --global credential.'https://git-codecommit.us-east-1.amazonaws.com'.helper '!aws --profile default codecommit credential-helper $@'
git config --global credential.'https://git-codecommit.us-east-1.amazonaws.com'.UseHttpPath true

(and yeah, I would also like to avoid relying on a wrapper which would rewrite all module sources)

As far as I can tell, @loren’s suggestion is not a wrapper to git. It adds an auth mechanism to the git config

You still interact directly with the git command

Moving from one git repo such as GitHub to CodeCommit will necessarily require updating your module sources, no?

Also, using AWS services for open source implementations will often require some hacks. Just like ECR which requires also using the AWS cli to first generate credentials to docker login.

if you really want to use ssh auth to codecommit, another option is to use override.tf and define just the source field for the module

add override.tf to .gitignore to avoid committing it

Terraform loads all configuration files within a directory and appends them together. Terraform also has a concept of overrides, a way to create files that are loaded last and merged into your configuration, rather than appended.
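e.g. a minimal override.tf for the codecommit case (the module name, repo, and ref below are made up):

# override.tf (gitignored) - redefines only the source of an existing module
module "vpc" {
  source = "git::ssh://git-codecommit.us-east-1.amazonaws.com/v1/repos/terraform-aws-vpc?ref=tags/0.1.0"
}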

Wow, didn’t know about that feature :-)

very handy if you need to move the modules around between git remotes

i need to test it with nested modules though… not too sure off the top of my head exactly how i’d specify the module name

Yea don’t see how it could work recursively

maybe nest things inside the override?

module "foo" {
  module "bar" {
    source = "..."
  }
}

perhaps terragrunt has something to help write sources on-the-fly

i use terragrunt pretty extensively…. i think you can interpolate the top level source in the terragrunt block, but the source in any nested modules would not be interpolated… you’d have to get creative i think, with a terragrunt hook to edit the .tf file in place, after retrieving the modules…

no, that’s too late, the modules have already been cloned… hmm…

I think it’s an interesting use-case

Currently, our customers rely entirely on our git repo hosted modules

but I could see a case where they’d want to replicate them in-house, and then rewrite the base URL and pull locally

One way would be to use a HTTP proxy with url rewriting

(would only work with HTTP sources)

since we run terraform in a container (geodesic), it would be pretty easy to introduce a proxy in the mix

oh, good point, nice

oh, I did not know about tf overrides, that’s great, thank you @loren

(i’m looking into using them right now to help us solve our coldstart problem in a nicer way)

terraform-root-modules - Collection of Terraform root module invocations for provisioning reference architectures

you want to use overrides to disable the s3 remote state temporarily ?

yea, exactly!

our hack right now is to use sed


yes i read that

so instead echo 'terraform { }' > overrides.tf

i have the same issue on my side, so overrides are promising to solve a few ugly things (i really want to AVOID writing another tf wrapper… ;))

in the meantime, i did another attempt to solve my issue: using https to access codecommit; now i need to prefix some terraform and git commands with aws-vault, which is not that nice… but it works well

(or aws-vault exec ... -- bash
)

hehe yeah

and just operate in a session

i discovered aws-vault a few weeks ago, thanks to you guys


before i used a dumb python script i wrote to do the same

but with much less features

yea, we did that too in the beginning


do you use the aws-vault login action? love that!

https://github.com/pmuller/awsudo maybe i should archive this too now
awsudo - sudo for AWS roles

yep, love it too :))

and rotate !

yea, I should use rotate. haven’t yet.

the only thing i don’t like is that aws-vault rotate doesn’t handle MFA yet, and i don’t like the idea of allowing key rotation without MFA

yea, it’s more strict

though i think we ended up allowing self-service key changes because developers would lose their IAM keys

…so they can login to the console and generate a new key pair

guess it depends on your constraints

(but still require MFA to use keys)

so you only allow MFA device management when authenticated with the MFA ?

terraform-aws-iam-assumed-roles - Terraform Module for Assumed Roles on AWS with IAM Groups Requiring MFA

But I am definitely open to feedback

we wanted to allow new users to be able to setup their own MFA device without admin assistance

so you allow MFA management without MFA but require it to deactivate it

i have a similar policy in place

but i do not like it

but i do not require MFA to deactive MFA…yet

it means that a leaked API key or user password is enough to create a new MFA device, then use the new one to access all roles of the compromised user


Yea, looking at this again, seems like we should require MFA for that

yep !

unsure about "iam:ResyncMFADevice",

resync = allowing the user to pass 2 consecutive tokens to AWS ?

not sure I get how this could be dangerous

ok

when digging into my terraform modules / codecommit issue, i stumbled upon some “Terrafile” projects; any thoughts about these?

I’m not familiar with terrafile

Looks very interesting. Ultimately, what I want though is something more like this: https://github.com/dependabot/feedback/issues/118
what Open PRs against terraform repos when terraform modules have new releases Open PRs against terraform repos when terraform providers have new releases why It's an extremely diverse ecosyste…

To help us manage and keep deps up to date.

oh that’s a nice service !

i’d like the same for in house code

they support many languages and private repos

Automated dependency updates. Dependabot creates pull requests to keep your Ruby, Python, JavaScript, PHP, .NET, Go, Elixir, Rust and Java dependencies up-to-date.


we’re currently using them for Docker and Submodules.

in my current company, all repositories are in VPCs, and I cannot imagine opening this up to the internet (very sensitive code)

I think the project core is open source, you could run it in house

but for other businesses, i’ll definitely try this

heh, yea - understandable