#terraform-aws-modules (2022-02)
Terraform Modules
Discussions related to https://github.com/terraform-aws-modules
Archive: https://archive.sweetops.com/terraform-aws-modules/
2022-02-01
@Andy Miguel has joined the channel
@Dan Meyers has joined the channel
@Hugo Samayoa has joined the channel
@Leo Przybylski has joined the channel
@Lucky has joined the channel
@Ben Smith (Cloud Posse) has joined the channel
2022-02-18
What
Expose permission_boundary for IAM roles
Why
So that this can be passed in as a variable.
The environment I’m currently working in requires all IAM roles to be created
with an attached permissions boundary, otherwise we get denied.
Left a comment. I haven’t worked too much with permission boundaries, but I feel like each IAM role should have its own permission boundary enabled rather than only being able to pass one permission boundary policy for all 3 roles.
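For illustration, a minimal sketch (hypothetical variable and resource names, not the module's actual code) of exposing a permissions boundary per role rather than one shared value:
variable "forwarder_lambda_permissions_boundary" {
  type        = string
  default     = null
  description = "ARN of a permissions boundary to attach to this specific role (hypothetical variable)"
}

resource "aws_iam_role" "forwarder_lambda" {
  name                 = "datadog-forwarder-lambda" # hypothetical name
  permissions_boundary = var.forwarder_lambda_permissions_boundary
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "lambda.amazonaws.com" }
    }]
  })
}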
Cool, have made this change and pushed
Thanks @Yonatan Koren
Looks perfect
Releasing this with another PR (cloudposse/terraform-aws-datadog-lambda-forwarder/pull/29)
@joshmyers I’ve released https://github.com/cloudposse/terraform-aws-datadog-lambda-forwarder/releases/tag/0.11.0
2022-02-19
@joshmyers @Yonatan Koren if you don’t mind, may I ask a few questions about DD, around your approach in configuration / the paths taken?
• if you go for the DD Lambda forwarder for logs & custom metrics, isn’t it more cost-efficient to stream all the logs to Kinesis Firehose -> DD, and do the same for metrics using CW Metric Streams?
• for metrics, are you all using Metric API polling, which a) has a lag and b) can be quite pricey since you can’t control the frequency the DD crawlers poll at?
• for estates where you have > 2k Lambdas (hence 2k CW log groups), how do you manage the subscription filters for the DD forwarder if the apps are being rolled out w/o TF (say CDK)? thanks in advance
are you specifically talking about this module? https://github.com/cloudposse/terraform-aws-datadog-lambda-forwarder
Terraform module to provision all the necessary infrastructure to deploy Datadog Lambda forwarders
or DD in general?
in general @jose.amengual, I’m curious how folks have attacked the above trade-offs. The module is very neat though
I think the DD implementation of the lambda forwarder is absolutely terrible
they have 3 different pieces of code to do 1 thing
right, the implementation is one thing. But the other is “out of all the options to skin the cat, which one is closer to the sweet spot”, and that is what I’m trying to get my head around really. Sadly I see only a halfway house, which is very hard to pitch to L1 because, you know, eventually you end up paying on both the AWS side (the CW bill will kill you) and the DD side.
All I want really is a simple:
• cost-efficient solution
• simple/clear implementation path. If I had my way: use CW Metric Streams to push metrics (and still be able to use the DD percentile metrics; you can’t btw, I have just raised a feature request with DD), and maybe push logs to DD with the “transport mechanism” provided by Firehose. If you need APM for the Lambda service, only then also get the Extension layer.
All that is simple, rather than the current approach….
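To make that concrete, a rough sketch (hypothetical names; assumes a separate Firehose delivery stream whose HTTP endpoint destination points at Datadog) of the CW Metric Streams piece:
variable "metric_stream_role_arn" {
  type        = string
  description = "IAM role CloudWatch assumes to write to the Firehose (hypothetical)"
}

variable "datadog_metrics_firehose_arn" {
  type        = string
  description = "Firehose delivery stream whose HTTP endpoint destination is Datadog (hypothetical)"
}

resource "aws_cloudwatch_metric_stream" "datadog" {
  name          = "datadog-metrics" # hypothetical name
  role_arn      = var.metric_stream_role_arn
  firehose_arn  = var.datadog_metrics_firehose_arn
  output_format = "opentelemetry0.7" # assumption: the format the DD integration expects
}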
@DaniC (he/him) I want to defer some of the implementation questions to those who were more heavily involved with this module, namely @jose.amengual and @RB.
However I think this module, created ~7 months ago, was written with the intention to conform to the documentation on the Datadog site on the subject and have logs aggregated in Datadog.
What you are describing is an optimization that is nevertheless possible with the Datadog Forwarder, as documented in the project’s README for logs / monitoring. This document mentions Kinesis, whereas the previously-linked document on the DD website doesn’t at all.
• for estates where you have > 2k Lambdas (hence 2k CW log groups), how do you manage the subscription filters for the DD forwarder if the apps are being rolled out w/o TF (say CDK)?
I haven’t done this and I’m assuming you mean that you have an environment with 2000+ log groups managed outside of Terraform. If the prefixes are deterministic, you can probably use the cloudwatch_log_groups data source in the TF configuration instantiating our module, then supply
cloudwatch_forwarder_log_groups = {
  # assumes the AWS provider's aws_cloudwatch_log_groups data source
  for name in data.aws_cloudwatch_log_groups.default.log_group_names :
  name => {
    name           = name
    filter_pattern = ""
  }
}
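For completeness, a sketch of the data source this relies on (assuming the AWS provider v4 aws_cloudwatch_log_groups data source and a hypothetical shared prefix):
data "aws_cloudwatch_log_groups" "default" {
  # Hypothetical deterministic prefix shared by the Lambda log groups
  log_group_name_prefix = "/aws/lambda/"
}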
2k sounds like an excessive amount. Are they all in one region + account? Cloud Posse’s term for the TF configuration that instantiates our modules is a component. These are configured for each environment (region) and stage (account) and managed via atmos. So we’d probably never have 2k app log groups in one account. But regardless not having a single monolithic AWS account for everything is a standard practice which may or may not apply to the legacy environment you’re dealing with, so maybe if the accounts are divided per SDLC environment, even if they are all in one region, you’re only dealing with 2000 / 3 = ~600-700 per account? Still a ton but maybe doable with what I described above.
As for your other questions, I’m not technically familiar enough (yet) with the specifics of the DD Forwarder to comment on them. But I think if there’s anything that can be adjusted for our module to potentially support an optimized log forwarding solution as you’re describing above on top of the current, officially-documented solution, then it definitely has its place in our module.
@Erik Osterman (Cloud Posse) maybe you have some additional insight too
@Yonatan Koren first, massive thanks for writing such a detailed response!!
What you are describing is an optimization that is nevertheless possible with the Datadog Forwarder, as documented in the project’s README for logs / monitoring. This document mentions Kinesis, whereas the previously-linked document on the DD website doesn’t at all.
oh, wasn’t aware of it; being new to DD I’ve only followed their official docs, which… leave a lot of room for improvement. I’ve tried to get good insights from their support folks but… the help is very basic (although they are very good, and raising a ticket / Zendesk does a brilliant job)
2k sounds like an excessive amount. Are they all in one region + account?
yes, same account, same region. It’s because I have so many fully deployed envs…
But regardless not having a single monolithic AWS account for everything is a standard practice which may or may not apply to the legacy environment you’re dealing with, so maybe if the accounts are divided per SDLC environment, even if they are all in one region, you’re only dealing with 2000 / 3 = ~600-700 per account? Still a ton but maybe doable with what I described above.
totally agree, I have a mountain to climb in getting the whole estate to a saner state; it’s coming
Yeah I’m not sure how well Terraform is going to handle 2k keys for the cloudwatch_log_groups data source in the state. But I would still give it a go. Not much else to do about your legacy environment short of migrating it, which may totally be impractical if you don’t have the buy-in from the business types from above, and/or the resources to do it.
Currently only using DD lambda forwarder to pull ALB logs out of an S3 bucket and get them into DD
Services and lambdas log to CloudWatch and we have an optional Kinesis firehose to spray them into DD
Not a big fan of this forwarder honestly
but saves me writing something to do it
Thing is, if you have 2K+ Lambda log groups, you aren’t going to want to subscribe them all to the same firehose, so you’d need to shard that and it’ll get messy
The lambda/service module for each service controls the CWL group and individual firehose for that lambda/service
Yes that means a single firehose per log group, but isolation++
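For reference, a minimal sketch (hypothetical names, not joshmyers’ actual code) of that per-log-group wiring, i.e. one CloudWatch Logs subscription filter feeding one Firehose:
variable "service_firehose_arn" {
  type        = string
  description = "The service's own Firehose delivery stream that ships to DD (hypothetical)"
}

variable "cwl_to_firehose_role_arn" {
  type        = string
  description = "Role CloudWatch Logs assumes to write into the Firehose (hypothetical)"
}

resource "aws_cloudwatch_log_group" "service" {
  name = "/ecs/my-service" # hypothetical log group
}

resource "aws_cloudwatch_log_subscription_filter" "datadog" {
  name            = "datadog"
  log_group_name  = aws_cloudwatch_log_group.service.name
  filter_pattern  = "" # empty pattern forwards everything
  destination_arn = var.service_firehose_arn
  role_arn        = var.cwl_to_firehose_role_arn
}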
Current client is not tight on budget.
We have individual services that can saturate a firehose in prod
Can hit ~40M RPM
Interesting point about a firehose for everything, haven’t considered the implication. However in my case, we are tight on budget so we’ll need to take that into account.
and for metrics what do you do @joshmyers? are you using Metric Streams -> or did you go with polling the Metrics API using DD crawlers / the AWS integration?
DD scraping metrics from CloudWatch (no doubt some lag)
Indeed, that is what I saw too and my users are getting itchy about it. Sadly if you move away from DD scraping metrics you lose the percentile metrics, which are absent (feature request already raised but…) from Metric Streams
Have not used Metrics Stream
How are your app metrics getting into Cloudwatch?
2022-02-20
2022-02-21
2022-02-22
Any thoughts on what https://github.com/cloudposse/terraform-external-module-artifact/blob/master/main.tf#L20-L29 could be swapped out for? The template_file data source has been deprecated and there are no arm64 builds of the template provider available, and the built-in templatefile() function expects to read an actual file
data "template_file" "url" {
count = module.this.enabled ? 1 : 0
template = replace(var.url, "$$", "$")
vars = {
filename = var.filename
git_ref = local.git_ref
module_name = var.module_name
}
}
Afk right now but will respond soon
Are you sure there’s no issue open for this? I feel like @RB raised this already but maybe my memory is lying to me
data "template_file" "url" {
count = module.this.enabled ? 1 : 0
template = replace(var.url, "$$", "$")
vars = {
filename = var.filename
git_ref = local.git_ref
module_name = var.module_name
}
}
I think we could switch to format and use %v with positional parameters
Not as easy to understand. Or we just switch to replace and manually replace those 3 options
Our longer-term plan is to convert to using Dockerized Lambdas with public Docker images on ECR
Yeah I thought format could work
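For illustration, a rough sketch (assumed placeholder URL and variable name, not the module's actual code) of the format()-with-positional-parameters idea:
variable "url_format" {
  type        = string
  description = "printf-style format for the artifact URL (hypothetical replacement for var.url)"
  default     = "https://artifacts.example.com/%v/%v/%v"
}

locals {
  # Positional parameters: module name, git ref, filename
  url = format(var.url_format, var.module_name, local.git_ref, var.filename)
}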
in the short term, i think we should fork that template provider and release an arm build.
there is a community version already with an arm build.
https://github.com/gxben/terraform-provider-template
and to use it you can set this in your versions.tf file
terraform {
  required_version = ">= 1.0.0"

  required_providers {
    template = {
      source = "gxben/template"
      # version has to be set explicitly instead of using a > sign
      version = "= 2.2.0-m1"
    }
  }
}
ya and a format function would work if someone was to pr it in
2022-02-23
I requested and PR’d additional functionality into the terraform-aws-ec2-autoscale-group module yesterday, but it was denied because the resource can be created outside of the module. However, I think it is a gap in CloudPosse’s coverage to not have any modules that create/manage an ECS cluster.
Is there an opportunity to re-evaluate this denial, or otherwise for CloudPosse to start and maintain a terraform-aws-ecs-cluster module?
whoops, I wrote this before seeing @nitrocode’s response, although his suggestion was to ping in Slack. very coincidental!
please take a look at https://github.com/cloudposse/terraform-aws-ecs-alb-service-task/blob/master/examples/complete/main.tf
provider "aws" {
region = var.region
}
module "vpc" {
source = "cloudposse/vpc/aws"
version = "0.28.1"
cidr_block = var.vpc_cidr_block
context = module.this.context
}
module "subnets" {
source = "cloudposse/dynamic-subnets/aws"
version = "0.39.8"
availability_zones = var.availability_zones
vpc_id = module.vpc.vpc_id
igw_id = module.vpc.igw_id
cidr_block = module.vpc.vpc_cidr_block
nat_gateway_enabled = true
nat_instance_enabled = false
context = module.this.context
}
resource "aws_ecs_cluster" "default" {
name = module.this.id
tags = module.this.tags
}
module "container_definition" {
source = "cloudposse/ecs-container-definition/aws"
version = "0.58.1"
container_name = var.container_name
container_image = var.container_image
container_memory = var.container_memory
container_memory_reservation = var.container_memory_reservation
container_cpu = var.container_cpu
essential = var.container_essential
readonly_root_filesystem = var.container_readonly_root_filesystem
environment = var.container_environment
port_mappings = var.container_port_mappings
}
module "ecs_alb_service_task" {
source = "../.."
alb_security_group = module.vpc.vpc_default_security_group_id
container_definition_json = module.container_definition.json_map_encoded_list
ecs_cluster_arn = aws_ecs_cluster.default.arn
launch_type = var.ecs_launch_type
vpc_id = module.vpc.vpc_id
security_group_ids = [module.vpc.vpc_default_security_group_id]
subnet_ids = module.subnets.public_subnet_ids
ignore_changes_task_definition = var.ignore_changes_task_definition
network_mode = var.network_mode
assign_public_ip = var.assign_public_ip
propagate_tags = var.propagate_tags
deployment_minimum_healthy_percent = var.deployment_minimum_healthy_percent
deployment_maximum_percent = var.deployment_maximum_percent
deployment_controller_type = var.deployment_controller_type
desired_count = var.desired_count
task_memory = var.task_memory
task_cpu = var.task_cpu
context = module.this.context
}
the ECS cluster itself is just 2 lines of code
resource "aws_ecs_cluster" "default" {
name = module.this.id
tags = module.this.tags
}
for all the rest, we have modules
provider "aws" {
  region = var.region
}

module "vpc" {
  source     = "cloudposse/vpc/aws"
  version    = "0.18.2"
  cidr_block = var.vpc_cidr_block

  context = module.this.context
}

module "subnets" {
  source                   = "cloudposse/dynamic-subnets/aws"
  version                  = "0.34.0"
  availability_zones       = var.availability_zones
  vpc_id                   = module.vpc.vpc_id
  igw_id                   = module.vpc.igw_id
  cidr_block               = module.vpc.vpc_cidr_block
  nat_gateway_enabled      = true
  nat_instance_enabled     = false
  aws_route_create_timeout = "5m"
  aws_route_delete_timeout = "10m"

  context = module.this.context
}

module "alb" {
  source                                  = "cloudposse/alb/aws"
  version                                 = "0.27.0"
  vpc_id                                  = module.vpc.vpc_id
  security_group_ids                      = [module.vpc.vpc_default_security_group_id]
  subnet_ids                              = module.subnets.public_subnet_ids
  internal                                = false
  http_enabled                            = true
  access_logs_enabled                     = true
  alb_access_logs_s3_bucket_force_destroy = true
  cross_zone_load_balancing_enabled       = true
  http2_enabled                           = true
  deletion_protection_enabled             = false

  context = module.this.context
}

resource "aws_ecs_cluster" "default" {
  name = module.this.id
  tags = module.this.tags

  setting {
    name  = "containerInsights"
    value = "enabled"
  }
}

resource "aws_sns_topic" "sns_topic" {
  name              = module.this.id
  display_name      = "Test terraform-aws-ecs-web-app"
  tags              = module.this.tags
  kms_master_key_id = "alias/aws/sns"
}

module "ecs_web_app" {
  source = "../.."

  region = var.region
  vpc_id = module.vpc.vpc_id
# Container container_image = var.container_image container_cpu = var.container_cpu container_memory = var.container_memory container_memory_reservation = var.container_memory_reservation port_mappings = var.container_port_mappings log_driver = var.log_driver aws_logs_region = var.region healthcheck = var.healthcheck
# Authentication authentication_type = var.authentication_type alb_ingress_listener_unauthenticated_priority = var.alb_ingress_listener_unauthenticated_priority alb_ingress_listener_authenticated_priority = var.alb_ingress_listener_authenticated_priority alb_ingress_unauthenticated_hosts = var.alb_ingress_unauthenticated_hosts alb_ingress_authenticated_hosts = var.alb_ingress_authenticated_hosts alb_ingress_unauthenticated_paths = var.alb_ingress_unauthenticated_paths alb_ingress_authenticated_paths = var.alb_ingress_authenticated_paths authentication_cognito_user_pool_arn = var.authentication_cognito_user_pool_arn authentication_cognito_user_pool_client_id = var.authentication_cognito_user_pool_client_id authentication_cognito_user_pool_domain = var.authentication_cognito_user_pool_domain authentication_oidc_client_id = var.authentication_oidc_client_id authentication_oidc_client_secret = var.authentication_oidc_client_secret authentication_oidc_issuer = var.authentication_oidc_issuer authentication_oidc_authorization_endpoint = var.authentication_oidc_authorization_endpoint authentication_oidc_token_endpoint = var.authentication_oidc_token_endpoint authentication_oidc_user_info_endpoint = var.authentication_oidc_user_info_endpoint
# ECS ecs_private_subnet_ids = module.subnets.private_subnet_ids ecs_cluster_arn = aws_ecs_cluster.default.arn ecs_cluster_name = aws_ecs_cluster.default.name ecs_security_group_ids = var.ecs_security_group_ids health_check_grace_period_seconds = var.health_check_grace_period_seconds desired_count = var.desired_count launch_type = var.launch_type container_port = var.container_port
# ALB alb_arn_suffix = module.alb.alb_arn_suffix alb_security_group = module.alb.security_group_id alb_ingress_unauthenticated_listener_arns = [module.alb.http_listener_arn] alb_ingress_unauthenticated_listener_arns_count = 1 alb_ingress_healthcheck_path = var.alb_ingress_healthcheck_path
# CodePipeline codepipeline_enabled = var.codepipeline_enabled badge_enabled = var.codepipeline_badge_enabled github_oauth_token = var.codepipeline_github_oauth_token github_webhooks_token = var.codepipeline_github_webhooks_token github_webhook_events = var.codepipeline_github_webhook_events repo_owner = var.codepipeline_repo_owner repo_name = var.codepipeline_repo_name branch = var.codepipeline_branch build_image = var.codepipeline_build_image build_timeout = var.codepipeline_build_timeout buildspec = var.codepipeline_buildspec poll_source_changes = var.poll_source_changes webhook_enabled = var.webhook_enabled webhook_target_action = var.webhook_target_action webhook_authentication = var.webhook_authentication webhook_filter_json_path = var.webhook_filter_json_path webhook_filter_match_equals = var.webhook_filter_match_equals codepipeline_s3_bucket_force_destroy = var.codepipeline_s3_bucket_force_destroy container_environment = var.container_environment secrets = var.secrets
# Autoscaling autoscaling_enabled = var.autoscaling_enabled autoscaling_dimension = var.autoscaling_dimension autoscaling_min_capacity = var.autoscaling_min_capacity autoscaling_max_capacity = var.autoscaling_max_capacity autoscaling_scale_up_adjustment = var.autoscaling_scale_up_adjustment autoscaling_scale_up_cooldown = var.autoscaling_scale_up_cooldown autoscaling_scale_down_adjustment = var.autoscaling_scale_down_adjustment autoscaling_scale_down_cooldown = var.autoscaling_scale_down_cooldown
# ECS alarms ecs_alarms_enabled = var.ecs_alarms_enabled ecs_alarms_cpu_utilization_high_threshold = var.ecs_alarms_cpu_utilization_high_threshold ecs_alarms_cpu_utilization_high_evaluation_periods = var.ecs_alarms_cpu_utilization_high_evaluation_periods ecs_alarms_cpu_utilization_high_period = var.ecs_alarms_cpu_utilization_high_period ecs_alarms_cpu_utilization_low_threshold = var.ecs_alarms_cpu_utilization_low_threshold ecs_alarms_cpu_utilization_low_evaluation_periods = var.ecs_alarms_cpu_utilization_low_evaluation_periods ecs_alarms_cpu_utilization_low_period = var.ecs_alarms_cpu_utilization_low_period ecs_alarms_memory_utilization_high_threshold = var.ecs_alarms_memory_utilization_high_threshold ecs_alarms_memory_utilization_high_evaluation_periods = var.ecs_alarms_memory_utilization_high_evaluation_periods ecs_alarms_memory_utilization_high_period = var.ecs_alarms_memory_utilization_high_period ecs_alarms_memory_utilization_low_threshold = var.ecs_alarms_memory_utilization_low_threshold ecs_alarms_memory_utilization_low_evaluation_periods = var.ecs_alarms_memory_utilization_low_evaluation_periods ecs_alarms_memory_utilization_low_pe…
oh wow i didn’t know we provisioned the ecs cluster anywhere
@Andriy Knysh (Cloud Posse) yeah, I understand the potential simplicity of creating an ECS cluster but still my infrastructure is entirely module based due to terragrunt.
That makes our options..
- write a module that uses cloudposse’s naming/context/null-label and just creates an ECS cluster
  a. if we go the “write it ourselves” route, we can fork the ASG module to include cluster creation
- use the public terraform-aws-modules ecs cluster module
  a. we lose cloudposse’s standardized naming. Again, we can just fork
For this reason, our ideal solution is to use a cloudposse module, whether we contribute it or not. But it is also easy enough for us to just use our fork of the ASG module.
@RB Those are test files, AFAIK there is no module that provisions the cluster
no module for the cluster b/c it’s just 2 lines of code
terragrunt does not work with plain TF resources?
no, only with modules. there has been discussion of working with resource, data, but it has been avoided because that blurs the lines between terragrunt and terraform
in your code, where you are using the ASG module, can you add those lines to create the cluster?
from your code, you can create a top-level module and include the ASG module and other TF resources
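For example, a rough sketch of such a top-level wrapper (argument names taken from the terragrunt inputs shown further below; the version pin is hypothetical, and the standard null-label context.tf is assumed to be vendored in):
module "autoscale_group" {
  source  = "cloudposse/ec2-autoscale-group/aws"
  version = "0.30.1" # hypothetical version pin

  image_id                  = var.image_id
  instance_type             = var.instance_type
  iam_instance_profile_name = var.iam_instance_profile_name
  subnet_ids                = var.subnet_ids
  security_group_ids        = var.security_group_ids
  min_size                  = var.min_size
  max_size                  = var.max_size

  context = module.this.context
}

resource "aws_ecs_cluster" "default" {
  name = module.this.id
  tags = module.this.tags
}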
yes but so far we have not had to maintain a module at all. This is what our terragrunt looks like …
terraform {
  # Set to fork, version locked to commit hash
  source = "git@github.com:kevcube/terraform-aws-ec2-autoscale-group.git?ref=bfe85953cf"
}

inputs = {
  name                      = "ecs"
  image_id                  = "ami-0071e84f5b73239a6"
  instance_type             = "t4g.small"
  iam_instance_profile_name = dependency.instance_profile.outputs.instance_profile
  subnet_ids                = dependency.subnets.outputs.private_subnet_ids
  security_group_ids = [
    dependency.database.outputs.security_group_id,
    dependency.bastion.outputs.security_group_id
  ]
  min_size               = 1
  max_size               = 2
  key_name               = "Kevin"
  capacity_rebalance     = true
  create_ecs_cluster     = true
  update_default_version = true
}

dependency "instance_profile" {
  config_path = "../../../global/iam/ecs-container-instance"
}

dependency "bastion" {
  config_path = "../../security/bastion"
}

dependency "subnets" {
  config_path = "../../network/vpc-main/subnets"
}

dependency "database" {
  config_path = "../../database/postgres-main"
}

include "root" {
  path = find_in_parent_folders("root.hcl")
}

include "aws" {
  path = find_in_parent_folders("aws.hcl")
}
and then at higher levels we define other elements of null-label naming
we prob can create a module for that (it’s a few lines of code, but anyway) we’ll look into that
We would be required to either..
- create a module that wraps ASG module and includes ECS cluster
- Use our fork of ASG that includes ECS cluster
- use a module that exclusively defines ECS cluster
@Andriy Knysh (Cloud Posse) I would greatly appreciate that!
and yes, cluster definition can be done in a few lines of code, but as @RB mentioned in the PR, configuration for ECS clusters has become slightly more advanced over time; now there are multiple resources that can be involved, like aws_ecs_cluster_capacity_providers
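For context, a minimal sketch (assumed, not an existing Cloud Posse module) of what that fuller cluster definition looks like once capacity providers are involved:
resource "aws_ecs_cluster" "default" {
  name = module.this.id
  tags = module.this.tags

  setting {
    name  = "containerInsights"
    value = "enabled"
  }
}

resource "aws_ecs_cluster_capacity_providers" "default" {
  cluster_name       = aws_ecs_cluster.default.name
  capacity_providers = ["FARGATE", "FARGATE_SPOT"]

  default_capacity_provider_strategy {
    capacity_provider = "FARGATE"
    weight            = 100
  }
}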
2022-02-24
hello!
on <https://registry.terraform.io/modules/cloudposse/ecs-container-definition/aws/latest>
the example links are all broken…
can someone help direct me to where i can figure out how to incorporate multiple containers into the ecs service definition with this module? (i need a logging sidecar.)
The links work in the github repo since they are all relative
Terraform module to generate well-formed JSON documents (container definitions) that are passed to the aws_ecs_task_definition Terraform resource
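To answer the sidecar question concretely, a rough sketch (assumed image names, and assuming the module's json_map_object output) of instantiating the module twice and merging the definitions into one task definition:
module "app" {
  source  = "cloudposse/ecs-container-definition/aws"
  version = "0.58.1"

  container_name  = "app"
  container_image = "nginx:latest" # hypothetical application image
}

module "log_router" {
  source  = "cloudposse/ecs-container-definition/aws"
  version = "0.58.1"

  container_name  = "log-router"
  container_image = "amazon/aws-for-fluent-bit:latest" # hypothetical logging sidecar
}

resource "aws_ecs_task_definition" "default" {
  family = "example"
  # Both container definitions go into the same task definition
  container_definitions = jsonencode([
    module.app.json_map_object,
    module.log_router.json_map_object,
  ])
}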
Ya, I really wonder why HashiCorp refuses to fix this long-standing problem with the registry; it has been there since day one
2022-02-25
This may not be the place to bring up the issue on https://registry.terraform.io/modules/cloudposse/kms-key/aws/latest where we see a yellow tag in: