#terraform (2024-03)

terraform Discussions related to Terraform or Terraform Modules

Archive: https://archive.sweetops.com/terraform/

2024-03-01

Daniel Grzelak avatar
Daniel Grzelak

Trying to get a bit more awareness in the Terraform community that state files need to be well secured. If anyone is interested, I’m happy to share some research I published going from state file edit access to code execution in a pipeline.

RB avatar

Is this with native Terraform, or does the research also cover OpenTofu?

I read recently that OpenTofu encrypts the state, so secrets can be used there somewhat more safely than with Terraform

Hans D avatar
#516 Storing sensitive values in state files

#309 was the first change in Terraform that I could find that moved to store sensitive values in state files, in this case the password value for Amazon RDS. This was a bit of a surprise for me, as previously I’ve been sharing our state files publicly. I can’t do that now, and feel pretty nervous about the idea of storing state files in version control at all (and definitely can’t put them on github or anything).

If Terraform is going to store secrets, then some sort of field-level encryption should be built in as well. In the meantime, I’m going to change things around to use https://github.com/AGWA/git-crypt on sensitive files in my repos.
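
For reference, a minimal git-crypt setup for that looks roughly like this (assuming git-crypt is installed; the GPG user ID is hypothetical):

git-crypt init
git-crypt add-gpg-user user@example.com

# .gitattributes: transparently encrypt state files on commit
*.tfstate filter=git-crypt diff=git-crypt
*.tfstate.backup filter=git-crypt diff=git-crypt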

loren avatar

Here’s the opentofu RFC on client-side state encryption, https://github.com/opentofu/opentofu/issues/874

#874 RFC: Client Side State Encryption

Summary

This feature adds the option to encrypt local state files, remote state, and plan files. Encryption is off-by-default.
Partial encryption, when enabled, only encrypts values marked as sensitive to protect credentials contained in the
state. Full encryption, when enabled, protects against any information disclosure from leaked state or plans.

Problem Statement

OpenTofu state and plans contain lots of sensitive information.

The most obvious example is credentials, such as primary access keys to storage, but even ignoring any credentials,
state often includes a full map of your network, including every VM, Kubernetes cluster, database, etc.
That is a treasure trove for an attacker who wishes to orient themselves in your private network.

Unlike runtime information processed by OpenTofu, which only lives in memory and is discarded when the run ends,
state and plans are persisted. In large installations, state is not (just) stored in local files because multiple
users need access to it. Remote state backend options include simple storage (such as storage accounts, various
databases, …), meaning these storage options do not “understand” the state, but there are also extended backends,
which do wish to gain information from state. The persistent nature and (often) cloud storage of state increases
the risk of it falling into the wrong hands.

Large corporations and financial institutions have compliance requirements for storage of sensitive information.
One frequent requirement is encryption at rest using a customer managed key. This is exactly what this feature
provides, and if you use it intelligently, even the cloud provider storing your state will not have access to the
encryption key at all.

User-facing description

OpenTofu masks sensitive values in its printed output, but those very same sensitive values are written to state:

example snippet from a statefile for an Azure storage account with the primary access key (of course not a real one)

Pay particular attention to the line listing the primary access key. The storage account listed here doesn’t exist,
but if it did, the primary access key would give an attacker full access to all the data on the storage account.

Getting Started

Note: the exact format of the configuration is likely to change as we test out the implementation and figure out
the precise details. So don’t rely too much on exact field names or format of the contents at this point in time.

With the feature this RFC is about, you could simply set an environment variable before running OpenTofu:

export TF_STATE_ENCRYPTION='{"backend":{"method":{"name":"full"},"key_provider":{"name":"passphrase","config":{"passphrase":"foobarbaz"}}}}'

For readability let’s spell out the value of the environment variable even though you wouldn’t normally set it like this:

export TF_STATE_ENCRYPTION='''{
  "backend": {
    "method": {
      "name": "full"
    },
    "key_provider": {
      "name": "passphrase",
      "config": {
        "passphrase": "foobarbaz"
      }
    }
  }
}'''

And suddenly, your remote state looks like this:

{
    "encryption": {
        "version": 1,
        "method": {
            "name": "full",
            "config": {}
        }
    },
    "payload": "e93e3e7ad3434055251f695865a13c11744b97e54cb7dee8f8fb40d1fb096b728f2a00606e7109f0720aacb15008b410cf2f92dd7989c2ff10b9712b6ef7d69ecdad1dccd2f1bddd127f0f0d87c79c3c062e03c2297614e2effa2fb1f4072d86df0dda4fc061"
}

This is the same state as before, only fully encrypted with AES256 using a key derived from the passphrase you provided.

Actually, most of the settings shown in the environment variable have sensible defaults, so this also works:

export TF_STATE_ENCRYPTION='''{
  "backend": {
    "key_provider": {
      "config": {
        "passphrase": "foobarbaz"
      }
    }
  }
}'''

You can also specify the 32-byte key directly instead of providing a passphrase:

export TF_STATE_ENCRYPTION='''{
  "backend": {
    "method": {
      "name": "full"
    },
    "key_provider": {
      "name": "direct",
      "config": {
        "key": "a0a1a2a3a4a5a6a7a8a9b0b1b2b3b4b5b6b7b8b9c0c1c2c3c4c5c6c7c8c9d0d1"
      }
    }
  }
}'''

Whether you use a passphrase or directly provide the key, it comes from an environment variable. Even if your state
is stored in another storage account, no one outside your organisation would have the encryption key.
Your users that run OpenTofu will need it, though.

Better yet, the key can also come from AWS KMS; all you’d need to change for that is the environment variable value:

export TF_STATE_ENCRYPTION='''{
  "backend": {
    "method": {
      "name": "full"
    },
    "key_provider": {
      "name": "awskms",
      "config": {
        "region": "us-east-1",
        "key_id": "alias/terraform"
      }
    }
  }
}'''

Or retrieve your encryption key from an Azure Key Vault, GCP KMS, or Vault. Of course, if you retrieve
the key from the same cloud provider where your state storage is located, they now have both the state and the key, so
maybe don’t use the same cloud provider if you worry about attacks from their side (or from government actors):

Using external key retrieval options allows you to place the equivalent configuration in the
remote state configuration, so the configuration is checked in with your code while remaining
properly secure, because the configuration does not need to include the actual encryption key.

Instead of full state encryption, you can have just the sensitive values encrypted in the state:

export TF_STATE_ENCRYPTION=TODO example

This will make your state look almost exactly like the original unencrypted state, so you can still easily doctor it if
you need to, except that the primary access key is now encrypted, and that the encryption section is present.

{
    "encryption": {
        "version": 1,
        "methods": {
            "encrypt/SOPS/xyz": {
               ...
            }
        }
    },
    TODO
}

Once Your State Is Encrypted

State encryption is completely transparent. All OpenTofu commands work exactly the same, even tofu state push and
tofu state pull work as expected. The latter downloads the state, and prints it in decrypted form, which is useful
if you ever run into the need to manually doctor your state. Lately, that need has become much rarer than it
used to be.

Since the configuration can be set in environment variables, wrappers like Terragrunt work just fine, as do typical
CI systems for OpenTofu such as Atlantis.

Note: We will need to test whether it is possible to use multiple different encryption keys with Terragrunt. It may
be that within the same tree, you must stick to one key. We know from experience that terragrunt run-all works
in that scenario.

If your CI system is more involved and insists on reading your state contents, you can’t use full state encryption.
You may still be able to use partial state encryption, configuring it to only encrypt the sensitive values. This will still
prevent exposing your passwords to both the CI system and the state storage, greatly frustrating any threat actors
trying to get into your infrastructure through those attack vectors.

If you want to rotate state encryption keys, or even switch state encryption methods, there is a second
environment variable called TF_STATE_DECRYPTION_FALLBACK. This one is tried for decryption if the primary
configuration in TF_STATE_ENCRYPTION fails to decrypt your state successfully. Encryption, unlike decryption, always
uses only the primary configuration, so you can use this to rotate your key on the next write operation.
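
For instance, a key rotation could look like this (a sketch building on the defaults shown above; the passphrases are hypothetical):

# Old key stays available for decryption only
export TF_STATE_DECRYPTION_FALLBACK='{"backend":{"key_provider":{"config":{"passphrase":"old-passphrase"}}}}'
# All writes re-encrypt with the new key
export TF_STATE_ENCRYPTION='{"backend":{"key_provider":{"config":{"passphrase":"new-passphrase"}}}}'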

Unencrypted state is recognized and automatically bypasses the decryption step. That’s what happens during initial
encryption, or if for some other reason your state happens to be currently unencrypte…

Daniel Grzelak avatar
Daniel Grzelak
Hacking Terraform State for Privilege Escalation - Plerion

What can an attacker do if they can edit Terraform state? The answer should be ‘nothing’ but is actually ‘take over your CI/CD pipeline’.

Daniel Grzelak avatar
Daniel Grzelak

One of the tofu maintainers reached out so I assume it works similarly.

Christopher McGill avatar
Christopher McGill

Question. In atmos.yaml we are using “auto_generate_backend_file: true”, and in S3 we see Atmos create a folder per component, then a sub-folder per stack, which it places the terraform.tfstate into. When we run the same layout of components/stacks/config against GCP GCS, we see only a state file created, no folders, e.g. core-usc1-auto.tfstate, which it renames the terraform.tfstate to, and which is the name of the stage. Has anyone seen this behaviour or can advise? Thanks

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

did you correctly configure GCP backend in Atmos manifests? something like this:

  terraform:
    # Backend
    backend_type: gcs
    backend:
      gcs:
        bucket: "xxxxxxx-bucket-tfstate"
        prefix: "terraform/tfstate" 
Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

(also, just a heads up, the atmos channel is best for these questions)

Jonas Mellquist avatar
Jonas Mellquist

Anyone with some insight into the Cloud Posse AWS modules, or routing in S2S VPN connections in general? I posted a question here: https://sweetops.slack.com/archives/CDYGZCLDQ/p1709301177485109

Greetings everyone. I’m using the cloudposse/vpn-connection/aws module and I’m facing some issues that I really don’t understand.

My module code is as follows

module "vpn_connection" {
  source  = "cloudposse/vpn-connection/aws"
  version = "1.0.0"

  namespace                                 = var.namespace
  stage                                     = var.env
  name                                      = var.vpn_connection_name
  vpc_id                                    = var.vpc_id
  vpn_gateway_amazon_side_asn               = var.amazon_asn
  customer_gateway_bgp_asn                  = var.customer_asn
  customer_gateway_ip_address               = var.customer_gateway_ip_address
  route_table_ids                           = var.route_table_ids
  vpn_connection_static_routes_only         = true
  vpn_connection_static_routes_destinations = [var.vpn_connection_static_routes_destinations]
  vpn_connection_local_ipv4_network_cidr    = var.vpn_connection_static_routes_destinations
  vpn_connection_remote_ipv4_network_cidr   = var.vpc_cidr
}

route_table_ids should contain a single element, found using https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/route_tables, and vpn_connection_static_routes_destinations is a simple IPv4 CIDR coming in as a string

The ‘calling’ of the module

module "vpn-connection" {
  source = "../../modules/vpn-connection"

  namespace                                 = var.namespace
  env                                       = var.environment
  vpn_connection_name                       = var.vpn_connection_name
  vpc_id                                    = module.staging-vpc.vpc_id
  amazon_asn                                = var.amazon_asn
  customer_asn                              = var.customer_asn
  customer_gateway_ip_address               = var.customer_gateway_ip_address
  route_table_ids                           = data.aws_route_tables.route_tables_for_vpn_connection_to_public_subnets.ids
  vpn_connection_static_routes_destinations = var.vpn_connection_static_routes_destinations
  vpc_cidr                                  = var.vpc_cidr
}

Shouldn’t I see, in the route tables referenced by route_table_ids, a non-propagated (i.e. static) route to the contents of var.vpn_connection_static_routes_destinations?

I see Route propagation set to No under the route table, which is also what I want..

But where’s my static route?

Jonas Mellquist avatar
Jonas Mellquist

Anyone with a good example of how to structure ECS resources in Terraform?

Looking to soon build an AWS ECS Fargate Cluster, numerous services (some utilizing CloudMap) and numerous tasks.

How do I organise the task definitions in the code and make use of templating as much as possible?

My idea was to use the following resource types, but I’m unsure about the structure and what makes the most sense (a rough sketch follows the list):

• data template_file referencing a .tpl file in another folder
• aws_ecs_task_definition -> container_definitions = data.template_file.shop.rendered
• aws_ecs_service -> task_definition = aws_ecs_task_definition.shop.arn
• aws_appautoscaling_target
• aws_appautoscaling_policy
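
A rough sketch of that wiring, using the built-in templatefile() function (the maintained successor to the template_file data source); the file name and variables here are hypothetical:

resource "aws_ecs_task_definition" "shop" {
  family                   = "shop"
  requires_compatibilities = ["FARGATE"]
  network_mode             = "awsvpc"
  cpu                      = 256
  memory                   = 512

  # Render the JSON container definitions from a template kept in another folder
  container_definitions = templatefile("${path.module}/templates/shop.json.tpl", {
    image          = "nginx:1.25" # hypothetical
    container_port = 8080
  })
}

resource "aws_ecs_service" "shop" {
  name            = "shop"
  cluster         = aws_ecs_cluster.main.id # assumes a cluster defined elsewhere
  task_definition = aws_ecs_task_definition.shop.arn
  desired_count   = 2
  launch_type     = "FARGATE"

  network_configuration {
    subnets         = var.private_subnet_ids # hypothetical variable
    security_groups = [var.service_sg_id]    # hypothetical variable
  }
}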

For the external facing containers I guess I’d also need a lot of ALB resources; I was hoping a module could help me here…

Was initially looking towards this module: https://github.com/terraform-aws-modules/terraform-aws-ecs/blob/master/examples/fargate/main.tf

Any other recommendations or perhaps a repo I can peek at or a blogpost or something similar?

provider "aws" {
  region = local.region
}

data "aws_availability_zones" "available" {}

locals {
  region = "eu-west-1"
  name   = "ex-${basename(path.cwd)}"

  vpc_cidr = "10.0.0.0/16"
  azs      = slice(data.aws_availability_zones.available.names, 0, 3)

  container_name = "ecsdemo-frontend"
  container_port = 3000

  tags = {
    Name       = local.name
    Example    = local.name
    Repository = "<https://github.com/terraform-aws-modules/terraform-aws-ecs>"
  }
}

################################################################################
# Cluster
################################################################################

module "ecs_cluster" {
  source = "../../modules/cluster"

  cluster_name = local.name

  # Capacity provider
  fargate_capacity_providers = {
    FARGATE = {
      default_capacity_provider_strategy = {
        weight = 50
        base   = 20
      }
    }
    FARGATE_SPOT = {
      default_capacity_provider_strategy = {
        weight = 50
      }
    }
  }

  tags = local.tags
}

################################################################################
# Service
################################################################################

module "ecs_service" {
  source = "../../modules/service"

  name        = local.name
  cluster_arn = module.ecs_cluster.arn

  cpu    = 1024
  memory = 4096

  # Enables ECS Exec
  enable_execute_command = true

  # Container definition(s)
  container_definitions = {

    fluent-bit = {
      cpu       = 512
      memory    = 1024
      essential = true
      image     = nonsensitive(data.aws_ssm_parameter.fluentbit.value)
      firelens_configuration = {
        type = "fluentbit"
      }
      memory_reservation = 50
      user               = "0"
    }

    (local.container_name) = {
      cpu       = 512
      memory    = 1024
      essential = true
      image     = "public.ecr.aws/aws-containers/ecsdemo-frontend:776fd50"
      port_mappings = [
        {
          name          = local.container_name
          containerPort = local.container_port
          hostPort      = local.container_port
          protocol      = "tcp"
        }
      ]

      # Example image used requires access to write to root filesystem
      readonly_root_filesystem = false

      dependencies = [{
        containerName = "fluent-bit"
        condition     = "START"
      }]

      enable_cloudwatch_logging = false
      log_configuration = {
        logDriver = "awsfirelens"
        options = {
          Name                    = "firehose"
          region                  = local.region
          delivery_stream         = "my-stream"
          log-driver-buffer-limit = "2097152"
        }
      }

      linux_parameters = {
        capabilities = {
          drop = [
            "NET_RAW"
          ]
        }
      }

      memory_reservation = 100
    }
  }

  service_connect_configuration = {
    namespace = aws_service_discovery_http_namespace.this.arn
    service = {
      client_alias = {
        port     = local.container_port
        dns_name = local.container_name
      }
      port_name      = local.container_name
      discovery_name = local.container_name
    }
  }

  load_balancer = {
    service = {
      target_group_arn = module.alb.target_groups["ex_ecs"].arn
      container_name   = local.container_name
      container_port   = local.container_port
    }
  }

  subnet_ids = module.vpc.private_subnets
  security_group_rules = {
    alb_ingress_3000 = {
      type                     = "ingress"
      from_port                = local.container_port
      to_port                  = local.container_port
      protocol                 = "tcp"
      description              = "Service port"
      source_security_group_id = module.alb.security_group_id
    }
    egress_all = {
      type        = "egress"
      from_port   = 0
      to_port     = 0
      protocol    = "-1"
      cidr_blocks = ["0.0.0.0/0"]
    }
  }

  service_tags = {
    "ServiceTag" = "Tag on service level"
  }

  tags = local.tags
}

################################################################################
# Supporting Resources
################################################################################

data "aws_ssm_parameter" "fluentbit" {
  name = "/aws/service/aws-for-fluent-bit/stable"
}

resource "aws_service_discovery_http_namespace" "this" {
  name        = local.name
  description = "CloudMap namespace for ${local.name}"
  tags        = local.tags
}

module "alb" {
  source  = "terraform-aws-modules/alb/aws"
  version = "~> 9.0"

  name = local.name

  load_balancer_type = "application"

  vpc_id  = module.vpc.vpc_id
  subnets = module.vpc.public_subnets

  # For example only
  enable_deletion_protection = false

  # Security Group
  security_group_ingress_rules = {
    all_http = {
      from_port   = 80
      to_port     = 80
      ip_protocol = "tcp"
      cidr_ipv4   = "0.0.0.0/0"
    }
  }
  security_group_egress_rules = {
    all = {
      ip_protocol = "-1"
      cidr_ipv4   = module.vpc.vpc_cidr_block
    }
  }

  listeners = {
    ex_http = {
      port     = 80
      protocol = "HTTP"

      forward = {
        target_group_key = "ex_ecs"
      }
    }
  }

  target_groups = {
    ex_ecs = {
      backend_protocol                  = "HTTP"
      backend_port                      = local.container_port
      target_type                       = "ip"
      deregistration_delay              = 5
      load_balancing_cross_zone_enabled = true

      health_check = {
        enabled             = true
        healthy_threshold   = 5
        interval            = 30
        matcher             = "200"
        path                = "/"
        port                = "traffic-port"
        protocol            = "HTTP"
        timeout             = 5
        unhealthy_threshold = 2
      }

      # There's nothing to attach here in this definition. Instead,
      # ECS will attach the IPs of the tasks to this target group
      create_attachment = false
    }
  }

  tags = local.tags
}

module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0"

  name = local.name
  cidr = local.vpc_cidr

  azs             = local.azs
  private_subnets = [for k, v in local.azs : cidrsubnet(local.vpc_cidr, 4, k)]
  public_subnets  = [for k, v in local.azs : cidrsubnet(local.vpc_cidr, 8, k + 48)]

  enable_nat_gateway = true
  single_nat_gateway = true

  tags = local.tags
}

RB avatar

These root level components might help

https://github.com/cloudposse/terraform-aws-components/tree/main/modules/ecs

https://github.com/cloudposse/terraform-aws-components/tree/main/modules/ecs-service

If you use atmos, you can reuse the code per region-account using yaml inputs
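
A sketch of what those yaml inputs can look like in an Atmos stack manifest (component and variable names are hypothetical):

components:
  terraform:
    ecs:
      vars:
        name: ecs
    ecs-service:
      vars:
        name: shop
        cpu: 1024
        memory: 4096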

provider "aws" {
  region = local.region
}

data "aws_availability_zones" "available" {}

locals {
  region = "eu-west-1"
  name   = "ex-${basename(path.cwd)}"

  vpc_cidr = "10.0.0.0/16"
  azs      = slice(data.aws_availability_zones.available.names, 0, 3)

  container_name = "ecsdemo-frontend"
  container_port = 3000

  tags = {
    Name       = local.name
    Example    = local.name
    Repository = "<https://github.com/terraform-aws-modules/terraform-aws-ecs>"
  }
}

################################################################################
# Cluster
################################################################################

module "ecs_cluster" {
  source = "../../modules/cluster"

  cluster_name = local.name

  # Capacity provider
  fargate_capacity_providers = {
    FARGATE = {
      default_capacity_provider_strategy = {
        weight = 50
        base   = 20
      }
    }
    FARGATE_SPOT = {
      default_capacity_provider_strategy = {
        weight = 50
      }
    }
  }

  tags = local.tags
}

################################################################################
# Service
################################################################################

module "ecs_service" {
  source = "../../modules/service"

  name        = local.name
  cluster_arn = module.ecs_cluster.arn

  cpu    = 1024
  memory = 4096

  # Enables ECS Exec
  enable_execute_command = true

  # Container definition(s)
  container_definitions = {

    fluent-bit = {
      cpu       = 512
      memory    = 1024
      essential = true
      image     = nonsensitive(data.aws_ssm_parameter.fluentbit.value)
      firelens_configuration = {
        type = "fluentbit"
      }
      memory_reservation = 50
      user               = "0"
    }

    (local.container_name) = {
      cpu       = 512
      memory    = 1024
      essential = true
      image     = "public.ecr.aws/aws-containers/ecsdemo-frontend:776fd50"
      port_mappings = [
        {
          name          = local.container_name
          containerPort = local.container_port
          hostPort      = local.container_port
          protocol      = "tcp"
        }
      ]

      # Example image used requires access to write to root filesystem
      readonly_root_filesystem = false

      dependencies = [{
        containerName = "fluent-bit"
        condition     = "START"
      }]

      enable_cloudwatch_logging = false
      log_configuration = {
        logDriver = "awsfirelens"
        options = {
          Name                    = "firehose"
          region                  = local.region
          delivery_stream         = "my-stream"
          log-driver-buffer-limit = "2097152"
        }
      }

      linux_parameters = {
        capabilities = {
          drop = [
            "NET_RAW"
          ]
        }
      }

      memory_reservation = 100
    }
  }

  service_connect_configuration = {
    namespace = aws_service_discovery_http_namespace.this.arn
    service = {
      client_alias = {
        port     = local.container_port
        dns_name = local.container_name
      }
      port_name      = local.container_name
      discovery_name = local.container_name
    }
  }

  load_balancer = {
    service = {
      target_group_arn = module.alb.target_groups["ex_ecs"].arn
      container_name   = local.container_name
      container_port   = local.container_port
    }
  }

  subnet_ids = module.vpc.private_subnets
  security_group_rules = {
    alb_ingress_3000 = {
      type                     = "ingress"
      from_port                = local.container_port
      to_port                  = local.container_port
      protocol                 = "tcp"
      description              = "Service port"
      source_security_group_id = module.alb.security_group_id
    }
    egress_all = {
      type        = "egress"
      from_port   = 0
      to_port     = 0
      protocol    = "-1"
      cidr_blocks = ["0.0.0.0/0"]
    }
  }

  service_tags = {
    "ServiceTag" = "Tag on service level"
  }

  tags = local.tags
}

################################################################################
# Supporting Resources
################################################################################

data "aws_ssm_parameter" "fluentbit" {
  name = "/aws/service/aws-for-fluent-bit/stable"
}

resource "aws_service_discovery_http_namespace" "this" {
  name        = local.name
  description = "CloudMap namespace for ${local.name}"
  tags        = local.tags
}

module "alb" {
  source  = "terraform-aws-modules/alb/aws"
  version = "~> 9.0"

  name = local.name

  load_balancer_type = "application"

  vpc_id  = module.vpc.vpc_id
  subnets = module.vpc.public_subnets

  # For example only
  enable_deletion_protection = false

  # Security Group
  security_group_ingress_rules = {
    all_http = {
      from_port   = 80
      to_port     = 80
      ip_protocol = "tcp"
      cidr_ipv4   = "0.0.0.0/0"
    }
  }
  security_group_egress_rules = {
    all = {
      ip_protocol = "-1"
      cidr_ipv4   = module.vpc.vpc_cidr_block
    }
  }

  listeners = {
    ex_http = {
      port     = 80
      protocol = "HTTP"

      forward = {
        target_group_key = "ex_ecs"
      }
    }
  }

  target_groups = {
    ex_ecs = {
      backend_protocol                  = "HTTP"
      backend_port                      = local.container_port
      target_type                       = "ip"
      deregistration_delay              = 5
      load_balancing_cross_zone_enabled = true

      health_check = {
        enabled             = true
        healthy_threshold   = 5
        interval            = 30
        matcher             = "200"
        path                = "/"
        port                = "traffic-port"
        protocol            = "HTTP"
        timeout             = 5
        unhealthy_threshold = 2
      }

      # There's nothing to attach here in this definition. Instead,
      # ECS will attach the IPs of the tasks to this target group
      create_attachment = false
    }
  }

  tags = local.tags
}

module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0"

  name = local.name
  cidr = local.vpc_cidr

  azs             = local.azs
  private_subnets = [for k, v in local.azs : cidrsubnet(local.vpc_cidr, 4, k)]
  public_subnets  = [for k, v in local.azs : cidrsubnet(local.vpc_cidr, 8, k + 48)]

  enable_nat_gateway = true
  single_nat_gateway = true

  tags = local.tags
}

1
1

2024-03-02

2024-03-03

Mahi C avatar

Hello everyone,

I’ve encountered an issue with my Terraform configuration for managing an Amazon RDS database. Here’s the situation:

I initially created an RDS instance from a snapshot using Terraform. Now, I need to update the instance size (e.g., change from db.t2.micro to db.t3.medium). However, when I rerun my Terraform script, it destroys the existing RDS instance and creates a new DB.

Is there a way to avoid this behavior? Ideally, I’d like to modify the existing RDS instance without causing unnecessary downtime or data loss.

Any suggestions or best practices would be greatly appreciated!

HAMZA AZIZ avatar
HAMZA AZIZ

You can use the nested lifecycle block. First, run a terraform plan and look for the changes that make Terraform think the RDS resource needs to be replaced, then ignore them using the lifecycle nested block:

lifecycle {
  ignore_changes = [
    # Ignore changes to tags, e.g. because a management agent
    # updates these based on some ruleset managed elsewhere.
    tags,
  ]
}

Adding deletion_protection = true is always good practice to prevent accidental resource destruction
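
A minimal sketch of how that can look on the resource, assuming the plan shows snapshot_identifier forcing the replacement (substitute whatever attributes your plan actually flags; names are hypothetical):

resource "aws_db_instance" "main" {
  identifier          = "app-db"        # hypothetical name
  instance_class      = "db.t3.medium"  # size changes apply in place
  snapshot_identifier = var.snapshot_id # only relevant at creation time
  deletion_protection = true

  lifecycle {
    # Don't let drift in the creation-time snapshot reference
    # force a destroy-and-recreate of the instance.
    ignore_changes = [snapshot_identifier]
  }
}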

Mahi C avatar

Thanks @HAMZA AZIZ

2024-03-04

Christopher McGill avatar
Christopher McGill

Question. Using Atmos, I have a single stack with a few network components that deployed fine. I added a new GCE component in the same stack file, using the same imports and the same GCS backend for the Terraform state (we have atmos.yaml set up for automatic state creation and management). I’m seeing weird behaviour where it keeps wanting to destroy all the other components it created. This is a standalone component with no hard dependencies in the Terraform config; we do reference the VPC network name and subnet for the GCE by name, but not via module.X.selflink etc. from the other component. Any ideas? I have spent hours on this. Thanks

jose.amengual avatar
jose.amengual

can you clarify this:

VPC network name and subnet for the GCE by name
jose.amengual avatar
jose.amengual

as in a data lookup after passing a name or ID?

Christopher McGill avatar
Christopher McGill

No, just hardcoded in stack vars, like “vpc-01” and “subnet-01”

jose.amengual avatar
jose.amengual

ok

jose.amengual avatar
jose.amengual

and you say you created a network component, deployed it to the stack, then added a new component, and somehow the plan wants to delete the other components?

Christopher McGill avatar
Christopher McGill

Yes. Deployed a few components (VPC, Subnet, NAT GW, Router), all fine. Then I add a GCE bastion to the same stack.yaml file and it wants to destroy all the other components.

Christopher McGill avatar
Christopher McGill

Confirmed in describe component that it is using the right backend file and workspace name.

jose.amengual avatar
jose.amengual

that makes no sense unless you used the same name

Christopher McGill avatar
Christopher McGill

same name for?

jose.amengual avatar
jose.amengual

you can’t have the same name for a component in yaml

Christopher McGill avatar
Christopher McGill
# (because google_compute_subnetwork.subnetwork is not in configuration)
Christopher McGill avatar
Christopher McGill

Different names

Christopher McGill avatar
Christopher McGill

It just says it’s not in the configuration, which is why it wants to destroy it

jose.amengual avatar
jose.amengual
components:
  terraform:
    example:
      vars:
        enabled: true
    example:
      vars:
        enabled: false
jose.amengual avatar
jose.amengual

that will set example.enabled = false (with duplicate keys in YAML, the last one wins)

Christopher McGill avatar
Christopher McGill

So I am not using context or component.tf in the component, so I haven’t used this enabled: true. Could that be an issue?

jose.amengual avatar
jose.amengual

no that is just an example

jose.amengual avatar
jose.amengual

are you using the terraform_remote_state data source?

Christopher McGill avatar
Christopher McGill

no

jose.amengual avatar
jose.amengual

can you show some of your stack.yaml?

Christopher McGill avatar
Christopher McGill

Sure. Where you see XX, it’s just there to protect sensitive info of the company I work for.

Christopher McGill avatar
Christopher McGill
components:
  terraform:
    vpc:
      metadata:
        component: vpc
        inherits:
          - vpc/defaults
      vars:
        enabled: true
        label_key_case: lower
        project_id: auto-v1u
        region: us-central1
        shared_vpc_host: false
        subnets:
          - subnet_name: subnet-01
            subnet_ip: 10.150.2.0/24
            subnet_region: us-central1
            subnet_private_access: true
            subnet_flow_logs: true
            subnet_flow_logs_interval: INTERVAL_5_SEC
            subnet_flow_logs_sampling: 0.5
            subnet_flow_logs_metadata: INCLUDE_ALL_METADATA

        secondary_ranges: 
          XX-glb-vpc-auto: 
            - ip_cidr_range: "10.158.128.0/17"
              range_name: "us-central1-gke-01-pods"
            - ip_cidr_range: "10.160.208.0/20"
              range_name: "us-central1-gke-01-services"

        cloud_nat:
          subnetworks:
            - name: XX-glb-vpc-auto

    gke:
      vars:
        ip_range_pods: "us-central1-gke-01-pods"
        ip_range_services: "us-central1-gke-01-services"
        master_ipv4_cidr_block: "10.100.0.0/28"
        gke_name: "us-central1-gke-01"
        network: "XX-gbl-auto"
        project_id: "auto-v1u"
        subnetwork: "XX-glb-vpc-auto"
        region: "us-central1"
        machine_type: "e2-medium"
        disk_size_gb: "100"
        location: "us-central1"
        min_count: "1"
        max_count: "100"
        local_ssd_count: "0"
Christopher McGill avatar
Christopher McGill

The VPC and Cloud NAT deploy perfectly. The GKE and GCE components don’t.

jose.amengual avatar
jose.amengual

do you see a {component-name}/terraform.tfstate folder in your state backend?

jose.amengual avatar
jose.amengual

every component will have its own state file in a folder called the same name as the component (usually)

Christopher McGill avatar
Christopher McGill

So this is interesting. In AWS, that is the setup: each component has a folder, then another folder for the stack. In our GCS setup here, each stack has one tfstate file at the moment, so the three components here are in one state file.

Christopher McGill avatar
Christopher McGill

I’d be really interested in why each component needs its own state file.

jose.amengual avatar
jose.amengual

That is how atmos works, it uses workspaces heavily

Christopher McGill avatar
Christopher McGill

Thanks. This sounds like the issue then. Is there any links or docs that goes into this in more detail?

jose.amengual avatar
jose.amengual

That way you can have multiple components in one stack, and they all have their own state file that does not interfere with other components and makes the blast radius smaller

jose.amengual avatar
jose.amengual

big state files are not a cool thing. They make plans and applies slow, and if you think from the point of view of separation of concerns, a bit dangerous too

jose.amengual avatar
jose.amengual
Terraform Component Remote State | atmos

The Terraform Component Remote State is used when we need to get the outputs of a Terraform component…
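
The pattern from that page looks roughly like this (a sketch; the component name and version pin are hypothetical):

module "vpc" {
  source  = "cloudposse/stack-config/yaml//modules/remote-state"
  version = "1.5.0"

  # The Atmos component whose outputs we want to read
  component = "vpc"

  context = module.this.context
}

# Reference outputs as module.vpc.outputs.<output_name>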

Christopher McGill avatar
Christopher McGill

Thanks for your help.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

@Christopher McGill please use the atmos channel

2024-03-06

Release notes from terraform avatar
Release notes from terraform
07:23:32 PM

v1.8.0-beta1 1.8.0-beta1 (March 6, 2024) UPGRADE NOTES: If you are upgrading from Terraform v1.7 or earlier, please refer to the Terraform v1.8 Upgrade Guide.

backend/s3: The use_legacy_workflow argument has been removed to encourage consistency with the AWS SDKs. The backend will now search for credentials in the same order as the default provider chain in the AWS SDKs and AWS CLI.

NEW FEATURES:…

susie-h avatar
susie-h

Can someone explain how module.this.enabled is used across your modules? When I try to replicate it in my code, terraform says “there is no module named “this””. I see it used a lot throughout your code and it looks really neat, but I’m missing something. https://github.com/cloudposse/terraform-aws-api-gateway/blob/main/main.tf

locals {
  enabled                = module.this.enabled
  create_rest_api_policy = local.enabled && var.rest_api_policy != null
  create_log_group       = local.enabled && var.logging_level != "OFF"
  log_group_arn          = local.create_log_group ? module.cloudwatch_log_group.log_group_arn : null
  vpc_link_enabled       = local.enabled && length(var.private_link_target_arns) > 0
}

resource "aws_api_gateway_rest_api" "this" {
  count = local.enabled ? 1 : 0

  name = module.this.id
  body = jsonencode(var.openapi_config)
  tags = module.this.tags

  endpoint_configuration {
    types = [var.endpoint_type]
  }
}

resource "aws_api_gateway_rest_api_policy" "this" {
  count       = local.create_rest_api_policy ? 1 : 0
  rest_api_id = aws_api_gateway_rest_api.this[0].id

  policy = var.rest_api_policy
}

module "cloudwatch_log_group" {
  source  = "cloudposse/cloudwatch-logs/aws"
  version = "0.6.8"

  enabled              = local.create_log_group
  iam_tags_enabled     = var.iam_tags_enabled
  permissions_boundary = var.permissions_boundary

  context = module.this.context
}

resource "aws_api_gateway_deployment" "this" {
  count       = local.enabled ? 1 : 0
  rest_api_id = aws_api_gateway_rest_api.this[0].id

  triggers = {
    redeployment = sha1(jsonencode(aws_api_gateway_rest_api.this[0].body))
  }

  lifecycle {
    create_before_destroy = true
  }
  depends_on = [aws_api_gateway_rest_api_policy.this]
}

resource "aws_api_gateway_stage" "this" {
  count                = local.enabled ? 1 : 0
  deployment_id        = aws_api_gateway_deployment.this[0].id
  rest_api_id          = aws_api_gateway_rest_api.this[0].id
  stage_name           = var.stage_name != "" ? var.stage_name : module.this.stage
  xray_tracing_enabled = var.xray_tracing_enabled
  tags                 = module.this.tags

  variables = {
    vpc_link_id = local.vpc_link_enabled ? aws_api_gateway_vpc_link.this[0].id : null
  }

  dynamic "access_log_settings" {
    for_each = local.create_log_group ? [1] : []

    content {
      destination_arn = local.log_group_arn
      format          = replace(var.access_log_format, "\n", "")
    }
  }
}

# Set the logging, metrics and tracing levels for all methods
resource "aws_api_gateway_method_settings" "all" {
  count       = local.enabled ? 1 : 0
  rest_api_id = aws_api_gateway_rest_api.this[0].id
  stage_name  = aws_api_gateway_stage.this[0].stage_name
  method_path = "*/*"

  settings {
    metrics_enabled = var.metrics_enabled
    logging_level   = var.logging_level
  }
}

# Optionally create a VPC Link to allow the API Gateway to communicate with private resources (e.g. ALB)
resource "aws_api_gateway_vpc_link" "this" {
  count       = local.vpc_link_enabled ? 1 : 0
  name        = module.this.id
  description = "VPC Link for ${module.this.id}"
  target_arns = var.private_link_target_arns
}

Hans D avatar

coming from cloudposse/terraform-null-label/exports

Hans D avatar

(we vendor that file explicitly in all of our own components)

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

(forgive the messed up, left-channel audio)

Matt Gowie avatar
Matt Gowie
terraform-null-label: the why and how it should be used | Masterpoint Consulting

A post highlighting one of our favorite terraform modules: terraform-null-label. We dive into what it is, why it’s great, and some potential use cases in …

terraform-null-label: Advanced Usage | Masterpoint Consulting

A post highlighting some advanced usage of the terraform-null-label module showing root/child module relationship and implementation of a naming + tagging …

Jeremy G (Cloud Posse) avatar
Jeremy G (Cloud Posse)

As explained above, module.this is defined in a drop-in file named context.tf that is vendored in from null-label.

By convention, when module.this.enabled is false, the module should create no resources, and all outputs should be null or empty. This configuration is propagated to Cloud Posse Terraform modules (all of which include context.tf) by the assignment

context = module.this.context

If you want to make a variant of the label (see the video above), you instantiate null-label, passing context = module.this.context, but then also passing in overrides or additions, and then use the context, tags, IDs, etc from that module instantiation going forward.
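
A sketch of that variant pattern (the added attribute is hypothetical):

module "label_variant" {
  source  = "cloudposse/label/null"
  version = "0.25.0"

  # Start from the parent context, then add or override fields
  context    = module.this.context
  attributes = ["logs"]
}

# Use module.label_variant.id, .tags, and .context going forward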

Ryan avatar

I hate asking this, but are there any user modules besides s3-user or iam-system-user? With iam-system-user I ran into a few issues with which account it landed the created user in, and it attaches policies directly. I’m still pretty new to TF, but I think I could make something that matches our compliance requirements with a little work; I figured I’d ask before I go writing this. We definitely do not want to use the user, but the vendor can’t provide trust relationship requirements for a role otherwise. New to the community otherwise, so hi everyone.

Matt Gowie avatar
Matt Gowie

I think you need to expand on what you’re trying to do. It sounds like system-user would work for you, but from what you shared I don’t understand why it won’t.

Ryan avatar

Apologies. I need an access key to let a Cisco product into our environment, and yeah, system-user ideally should work, but it needs to be modified to meet my compliance standards; nothing crazy, except that it attaches permissions directly to the user/key, whereas compliance requires whatever access is needed to be part of a role.

I tried using system-user yesterday as well, and it would create a system-user as I wanted, but it was creating it in the account SAML goes through and not in my assumed-role account. I believe our module of system-user is modified, so I’m unsure if that’s part of the issue; I didn’t have time to go back and try it again yet.

Matt Gowie avatar
Matt Gowie

@Ryan – Okay, if I’m understanding correctly, I think what you may need to do is use iam-system-user to create your user resource, use terraform-aws-iam-role to create your role, give the role the right permissions for what Cisco needs to do, and give the system-user permission to assume the role. You can do that in a root module that combines those two child modules. Does that make sense and sound like what you need to do?
it was creating it in the account SAML goes through and not in my assumed role account.
The account the resources are created in depends on what is in providers.tf, not what role you have assumed locally. I would check the logic in your root module and see if that helps.

cloudposse/terraform-aws-iam-role

A Terraform module that creates an IAM role with provided JSON IAM policy documents.
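
A rough sketch of that wiring with plain AWS resources, to show the trust relationship (names are hypothetical; the Cloud Posse modules wrap the same primitives):

# Role that holds the actual permissions Cisco needs
resource "aws_iam_role" "cisco" {
  name = "cisco-integration" # hypothetical
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { AWS = aws_iam_user.cisco.arn }
      Action    = "sts:AssumeRole"
    }]
  })
}

# System user whose access key the vendor receives; its only
# direct permission is to assume the role above
resource "aws_iam_user" "cisco" {
  name = "cisco-integration-user" # hypothetical
}

resource "aws_iam_user_policy" "assume_only" {
  name = "assume-cisco-role"
  user = aws_iam_user.cisco.name
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = "sts:AssumeRole"
      Resource = aws_iam_role.cisco.arn
    }]
  })
}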

Ryan avatar

Awesome response thank you. I’ll dig into this a little bit later today.

Ryan avatar

You were right, no providers. Thank you for setting me down the right path. Definitely a lightbulb moment when I compared the provider vs. no-provider tf after reading the providers.tf code. I’m still getting the hang of atmos and terraform but really enjoying it.


2024-03-07

2024-03-08

leonkatz avatar
leonkatz

Is there a file formatting I can use for “tftpl” template files? Does Jinja2 work? (I’m using the IntelliJ IDE)

Gabriela Campana (Cloud Posse) avatar
Gabriela Campana (Cloud Posse)

@matt @Jeremy White (Cloud Posse)

Matt Gowie avatar
Matt Gowie

Not that I know of on VSCode at least.

James Humphries avatar
James Humphries

Interestingly, I just saw this in the latest jetbrains terraform plugin: https://github.com/JetBrains/intellij-plugins/commit/9c3336ff5ead368e3fb120263479340e701a0f32#diff-05a30b5d16dee0a3b96a[…]5a95f6047f23ec697bd356eR40

I don’t think there’s support for it in a lot of the editors, but JetBrains is giving this area some love

2024-03-11

2024-03-13

Andy Wortman avatar
Andy Wortman

Been smashing my head against a wall on this one for a while. We have a set of Kubernetes ingresses defined via kubernetes_ingress_v1 resources, using a kubernetes_ingress_class resource spec’d with the ingress.k8s.aws/alb controller. I need to update the SSL policy for the ALB, but I can’t find documentation on how to define it. The only place that seems relevant is as an annotation in the ingress definition, but that means I have to define it for every ingress that uses that ingress class, which seems inefficient and prone to problems. What happens if two ingresses define different values here?

Does anyone know how to set the SSL Policy for an ALB ingress class?

Brian avatar

I assume you’re using the ingress class provided by aws-load-balancer-controller. If so, here is the schema definition for it. It does have an sslPolicy attribute.

Brian avatar
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    controller-gen.kubebuilder.io/version: v0.11.1
  creationTimestamp: 2024-03-13T15:02:07Z
  generation: 1
  name: ingressclassparams.elbv2.k8s.aws
  resourceVersion: "198995"
  uid: 0fd1a62b-d774-423a-89fe-fbc87fe0cda2
spec:
  group: elbv2.k8s.aws
  names:
    plural: ingressclassparams
    singular: ingressclassparams
    kind: IngressClassParams
    listKind: IngressClassParamsList
  scope: Cluster
  versions:
    - name: v1beta1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          description: IngressClassParams is the Schema for the IngressClassParams API
          type: object
          properties:
            apiVersion:
              description: "APIVersion defines the versioned schema of this representation of
                an object. Servers should convert recognized schemas to the
                latest internal value, and may reject unrecognized values. More
                info:
                https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources"
              type: string
            kind:
              description: "Kind is a string value representing the REST resource this object
                represents. Servers may infer this from the endpoint the client
                submits requests to. Cannot be updated. In CamelCase. More info:
                https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds"
              type: string
            metadata:
              type: object
            spec:
              description: IngressClassParamsSpec defines the desired state of
                IngressClassParams
              type: object
              properties:
                group:
                  description: Group defines the IngressGroup for all Ingresses that belong to
                    IngressClass with this IngressClassParams.
                  type: object
                  required:
                    - name
                  properties:
                    name:
                      description: Name is the name of IngressGroup.
                      type: string
                inboundCIDRs:
                  description: InboundCIDRs specifies the CIDRs that are allowed to access the
                    Ingresses that belong to IngressClass with this
                    IngressClassParams.
                  type: array
                  items:
                    type: string
                ipAddressType:
                  description: IPAddressType defines the ip address type for all Ingresses that
                    belong to IngressClass with this IngressClassParams.
                  type: string
                  enum:
                    - ipv4
                    - dualstack
                loadBalancerAttributes:
                  description: LoadBalancerAttributes define the custom attributes to
                    LoadBalancers for all Ingresses that belong to
                    IngressClass with this IngressClassParams.
                  type: array
                  items:
                    description: Attributes defines custom attributes on resources.
                    type: object
                    required:
                      - key
                      - value
                    properties:
                      key:
                        description: The key of the attribute.
                        type: string
                      value:
                        description: The value of the attribute.
                        type: string
                namespaceSelector:
                  description: NamespaceSelector restrict the namespaces of Ingresses that are
                    allowed to specify the IngressClass with this
                    IngressClassParams. * if absent or present but empty, it
                    selects all namespaces.
                  type: object
                  properties:
                    matchExpressions:
                      description: matchExpressions is a list of label selector requirements. The
                        requirements are ANDed.
                      type: array
                      items:
                        description: A label selector requirement is a selector that contains values, a
                          key, and an operator that relates the key and values.
                        type: object
                        required:
                          - key
                          - operator
                        properties:
                          key:
                            description: key is the label key that the selector applies to.
                            type: string
                          operator:
                            description: operator represents a key's relationship to a set of values. Valid
                              operators are In, NotIn, Exists and DoesNotExist.
                            type: string
                          values:
                            description: values is an array of string values. If the operator is In or
                              NotIn, the values array must be non-empty. If the
                              operator is Exists or DoesNotExist, the values
                              array must be empty. This array is replaced during
                              a strategic merge patch.
                            type: array
                            items:
                              type: string
                    matchLabels:
                      description: matchLabels is a map of {key,value} pairs. A single {key,value} in
                        the matchLabels map is equivalent to an element of
                        matchExpressions, whose key field is "key", the operator
                        is "In", and the values array contains only "value". The
                        requirements are ANDed.
                      type: object
                      additionalProperties:
                        type: string
                  x-kubernetes-map-type: atomic
                scheme:
                  description: Scheme defines the scheme for all Ingresses that belong to
                    IngressClass with this IngressClassParams.
                  type: string
                  enum:
                    - internal
                    - internet-facing
                sslPolicy:
                  description: SSLPolicy specifies the SSL Policy for all Ingresses that belong to
                    IngressClass with this IngressClassParams.
                  type: string
                subnets:
                  description: Subnets defines the subnets for all Ingresses that belong to
                    IngressClass with this IngressClassParams.
                  type: object
                  properties:
                    ids:
                      description: IDs specify the resource IDs of subnets. Exactly one of this or
                        `tags` must be specified.
                      type: array
                      minItems: 1
                      items:
                        description: SubnetID specifies a subnet ID.
                        type: string
                        pattern: subnet-[0-9a-f]+
                    tags:
                      description: Tags specifies subnets in the load balancer's VPC where each tag
                        specified in the map key contains one of the values in
                        the corresponding value list. Exactly one of this or
                        `ids` must be specified.
                      type: object
                      additionalProperties:
                        type: array
                        items:
                          type: string
                tags:
                  description: Tags defines list of Tags on AWS resources provisioned for
                    Ingresses that belong to IngressClass with this
                    IngressClassParams.
                  type: array
                  items:
                    description: Tag defines an AWS Tag on resources.
                    type: object
                    required:
                      - key
                      - value
                    properties:
                      key:
                        description: The key of the tag.
                        type: string
                      value:
                        description: The value of the tag.
                        type: string
      subresources: {}
      additionalPrinterColumns:
        - name: GROUP-NAME
          type: string
          description: The Ingress Group name
          jsonPath: .spec.group.name
        - name: SCHEME
          type: string
          description: The AWS Load Balancer scheme
          jsonPath: .spec.scheme
        - name: IP-ADDRESS-TYPE
          type: string
          description: The AWS Load Balancer ipAddressType
          jsonPath: .spec.ipAddressType
        - name: AGE
          type: date
          jsonPath: .metadata.creationTimestamp
  conversion:
    strategy: None
Andy Wortman avatar
Andy Wortman

Ok, so I can define this at the load balancer controller? It looks like we implemented that via a Helm release. Now to figure out how to override values in that chart…

Thanks!

Brian avatar

No problem. If you’re using Cloud Posse’s eks/alb-controller, you can add sslPolicy to this section.

https://github.com/cloudposse/terraform-aws-components/blob/37d8a5bfa04054231a04bf31cb66a575978352c8/modules/eks/alb-controller/main.tf#L50-L57
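
For reference, the upstream aws-load-balancer-controller Helm chart can also create the IngressClassParams for you via its values, roughly like this (worth verifying against the chart version in use; the policy shown is just an example):

# Helm values override for the aws-load-balancer-controller chart
ingressClassParams:
  create: true
  spec:
    sslPolicy: ELBSecurityPolicy-TLS13-1-2-2021-06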

Andy Wortman avatar
Andy Wortman

ok cool, we are using a kubernetes_manifest resource for the ingress class params. I’m trying this:

resource "kubernetes_manifest" "stack_ingress_public_class_params" {
  provider = kubernetes.cluster
  manifest = {
    apiVersion = "elbv2.k8s.aws/v1beta1"
    kind       = "IngressClassParams"

    metadata = {
      name = "stack-ingress-public"
    }

    spec = {
      group = {
        name = "stack-ingress-public"
      }
      sslPolicy = "ELBSecurityPolicy-TLS13-1-2-2021-06"
    }
  }
}
Andy Wortman avatar
Andy Wortman

hmm…. I must have the structure wrong. Error: Manifest configuration incompatible with resource schema

Brian avatar

You may need to check your controller’s (and its CRDs’) version. I provided you the CRD schema for the latest (v2.7) aws-load-balancer-controller. I am not sure when sslPolicy was added.

Andy Wortman avatar
Andy Wortman

How did you pull that schema?

Brian avatar
kubectl --context <your-kube-context> get ingressclassparams.elbv2.k8s.aws -o yaml
Brian avatar

Correction… This is the command.

kubectl --context <your-kube-context> get crd ingressclassparams.elbv2.k8s.aws -o yaml
Brian avatar

That first command likely works too, but it’s not what you want.

Andy Wortman avatar
Andy Wortman

Yeah, my schema is too old, doesn’t have SSLPolicy. Now to upgrade…

Release notes from terraform avatar
Release notes from terraform
11:03:30 PM

v1.7.5 1.7.5 (March 13, 2024) BUG FIXES:

backend/s3: When using s3 backend and encountering a network issue, the retry code would fail with “failed to rewind transport stream for retry”. Now the retry should be successful. (#34796)

Update AWS SDK versions to the latest by kojoru · Pull Request #34796 · hashicorp/terraform

This pull request updates the AWS SDKs to the latest version. The intent is to fix #34528, which is intermittent and can’t be easily reproduced. The root cause was discussed in the relevant AWS SDK…

2024-03-14

bessey avatar

Hello, we’ve encountered an issue with the Cloud Posse AWS backup vault module. During destruction of a backup vault, the process tries to remove the backup vault before the recovery points, and because of this ordering the destroy fails.

• Do we need to update the module so it removes the recovery points before the backup vault?

• Or could we add a lifecycle block in the Cloud Posse module?

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

@Ben Smith (Cloud Posse)

bessey avatar

Same issue with the 1.0.0 version

Gabriela Campana (Cloud Posse) avatar
Gabriela Campana (Cloud Posse)

Hi @bessey Is this something you can fix and open a PR for? Otherwise this should be addressed eventually (can’t give you an ETA).

bessey avatar

Hi @Gabriela Campana (Cloud Posse) we have decided to replace the module with plain Terraform resources in our next release. We wasted too much time on this subject. Thanks all for your info and help.

2024-03-15

François Davier avatar
François Davier

Hi guys, has anyone else met this issue too? https://github.com/cloudposse/terraform-aws-backup/issues/60

#60 set disable for module cause issue with recovery points

Describe the Bug

If I set disable on the module by adding a count parameter, then when I execute my Terraform code, Terraform tries to delete the backup vault, but it fails because the vault contains recovery points.

Expected Behavior

Should be able to set count to 0, apply Terraform, and see the backup vault destroyed without issue.

Steps to Reproduce

Set count to 0 on a backup vault with a recovery point inside and apply Terraform.

Environment

actual module deployed in our environments: 0.7.1

aws = {
  source  = "hashicorp/aws"
  version = "5.16.1"
}

and terraform 1.0.0

Additional Context

No response

Jeremy G (Cloud Posse) avatar
Jeremy G (Cloud Posse)

IMHO, deleting backups should be hard to do, so I am not bothered by this behavior. What happens if you run terraform destroy with the module enabled?

bessey avatar

The issue still remains. After upgrading to 1.0.0, the error is the same: Error: deleting Backup Vault (MY_BACKUP_VAULT_NAME): InvalidRequestException: Backup vault cannot be deleted because it contains recovery points.

bessey avatar

I noticed that even with a depends_on, the module doesn’t take this option into account!

bessey avatar

Did you try to run a terraform destroy with at least one recovery point inside the backup vault? I don’t think so, because it’s a prerequisite for removing a backup vault: it’s mandatory to delete all recovery points before deleting the vault.

Jeremy G (Cloud Posse) avatar
Jeremy G (Cloud Posse)

There is similar behavior around deleting S3 buckets: because of the permanent loss of data, extra precautions are in place.

I believe the best approach is to provide a force_destroy option which would override this protection, but I will ask @Ben Smith (Cloud Posse) to look into it, since he is the most familiar with the topic. CC @Erik Osterman (Cloud Posse)

1
1
bessey avatar

Hi, regarding S3, the issue was due to versioning: the sequence tried to delete the bucket before suspending versioning. We fixed that with a time_sleep resource to avoid removing the bucket before versioning is suspended, and it works fine, but the same approach doesn’t work with the CloudPosse AWS backup module.

bessey avatar

Trying to remove the recovery points from a null_resource and adding a depends_on on this null_resource from the module gives the same issue. Even when the recovery points are removed from the AWS console before the backup vault, CloudTrail shows the backup vault being removed before the recovery points, and it fails as well:

bessey avatar

For info, the depends_on works fine when we enable the module, but not when we disable it.

bessey avatar

One question: in your documentation at https://docs.cloudposse.com/modules/library/aws/backup/ the following code is mentioned regarding the retention period:

rules = [
  {
    name              = "${module.this.name}-daily"
    schedule          = var.schedule
    start_window      = var.start_window
    completion_window = var.completion_window
    lifecycle = {
      cold_storage_after = var.cold_storage_after
      delete_after       = var.delete_after
    }
  }
]

But from https://github.com/cloudposse/terraform-aws-backup/blob/1.0.0/docs/migration-0.13.x-0.14.x+.md it’s mentioned the below one :

  rules = [
    {
      schedule           = var.schedule
      start_window       = var.start_window
      completion_window  = var.completion_window
      cold_storage_after = var.cold_storage_after
      delete_after       = var.delete_after
    }
  ]

With the second snippet, I noticed in the AWS console that the retention period is “Always” rather than my specific value:

backup | The Cloud Posse Developer Hub

Terraform module to provision AWS Backup, a fully managed backup service that makes it easy to centralize and automate the back up of data across AWS services such as Amazon EBS volumes, Amazon EC2 instances, Amazon RDS databases, Amazon DynamoDB tables, Amazon EFS file systems, and AWS Storage Gateway volumes.
[!NOTE]
The syntax of declaring a backup schedule has changed as of release 0.14.0, follow the instructions in the 0.13.x to 0.14.x+ migration guide.

[!WARNING] The deprecated variables have been fully deprecated as of 1.x.x. Please use the new variables as described in the 0.13.x to 0.14.x+ migration guide.

# Migration from 0.13.x to 0.14.x

Version 0.14.0 of this module implements the ability to add multiple schedules in a backup plan. This requires changing inputs to the module slightly. Make sure to update your configuration to use the new syntax.

Before:

module "backup" {
  source = "cloudposse/backup/aws"

  schedule           = var.schedule
  start_window       = var.start_window
  completion_window  = var.completion_window
  cold_storage_after = var.cold_storage_after
  delete_after       = var.delete_after
}

After:

module "backup" {
  source = "cloudposse/backup/aws"

  rules = [
    {
      schedule           = var.schedule
      start_window       = var.start_window
      completion_window  = var.completion_window
      cold_storage_after = var.cold_storage_after
      delete_after       = var.delete_after
    }
  ]
}

Now you can have multiple backup schedules:

module "backup" {
  source = "cloudposse/backup/aws"

  rules = [
    {
      name               = "daily"
      schedule           = "cron(0 10 * * ? *)"
      start_window       = 60
      completion_window  = 120
      cold_storage_after = 30
      delete_after       = 180
    },
    {
      name               = "monthly"
      schedule           = "cron(0 12 1 * ? *)"
      start_window       = 60
      completion_window  = 120
      cold_storage_after = 30
      delete_after       = 180
    }
  ]
}

bessey avatar

Do we need to set the lifecycle block for the retention period to be applied to the source AWS Backup vault and recovery points?

Jeremy G (Cloud Posse) avatar
Jeremy G (Cloud Posse)

@Ben Smith (Cloud Posse)

1
bessey avatar

I can confirm that the lifecycle block is required for the delete_after option to be taken into account.
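
Putting bessey’s finding together with the syntax documented above, the rules shape with the nested lifecycle block would look like this (variable names as in the docs quoted earlier):

rules = [
  {
    name              = "${module.this.name}-daily"
    schedule          = var.schedule
    start_window      = var.start_window
    completion_window = var.completion_window
    lifecycle = {
      cold_storage_after = var.cold_storage_after
      delete_after       = var.delete_after
    }
  }
]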

Ben Smith (Cloud Posse) avatar
Ben Smith (Cloud Posse)

Thanks @bessey for pointing this out. Sounds like a bug in our module. I’ll get to this soon so that we can configure the retention period

1
François Davier avatar
François Davier

thanks

2024-03-16

2024-03-17

jose.amengual avatar
jose.amengual
Cloud software company HashiCorp exploring potential sale, Bloomberg News reports

Cloud software vendor HashiCorp is exploring options, including a sale, Bloomberg News reported on Friday citing people familiar with the matter.

managedkaos avatar
managedkaos

“HashiCorp has been working with a financial adviser in recent months and has held exploratory talks with other industry players, the report said.”

I’m eager to know who the ‘industry players’ are!

jose.amengual avatar
jose.amengual

chef is going to buy it and deprecate terraform lol

1
managedkaos avatar
managedkaos

Perhaps they will get beat out by Puppet.

2024-03-18

Almighty avatar
Almighty

Any terragrunt users here?

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)
1
Josh B. avatar
Josh B.

Sadly yes

Almighty avatar
Almighty

Haha do you prefer something over terragrunt?

Josh B. avatar
Josh B.

Just Terraform would be nice, but with so many environments and regions, Terragrunt does the job we need. I’m sure there are other tools like CP’s, but we were already in deep with Terragrunt.

this1
joey jensen avatar
joey jensen

Anybody using Terraform to manage Kubernetes? I’m curious if you’ve found any advantage to Terraform versus any other technology for managing Kubernetes objects... or any other opinions you have about Terraform, or other Kubernetes IaC solutions.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Yes, we do at Cloud Posse.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

You’ll notice we also deploy ArgoCD…. with Terraform

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Since we practice GitOps with Terraform, we get most of the same benefits as we do with ArgoCD. We rely on Terraform predominantly for kubernetes backing services. Things like ALBs, Ingress, Operators/Controllers, etc. Things that rely on other things to exist, which we provision with Terraform. E.g. IAM roles, DNS zones, etc.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Use ArgoCD for your applications.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

We’ve considered, but not adopted things like Amazon Controllers for Kubernetes. Happy to answer anything else related to this.

2024-03-19

Mahesh avatar

When I use the CloudPosse IAM module, by default it prefixes the IAM role name with namespace, stage, and name, which I do not want. How do I avoid that?

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Don’t set them

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

I believe all are nullable

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

For example, if you have your own naming convention, just use name and set it to your requirements

Mahesh avatar

Let me try... but I remember that if namespace is not set, tf apply fails saying namespace needs to be set.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

If that happens, let me know which module.

Fizz avatar

add this

label_order            = ["name"] 

and only include the attributes you want

Fizz avatar

that will limit the name to a concatenated list of whatever is in label_order, while still adding the other attributes to tags
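
A minimal sketch of what Fizz describes, using the null-label inputs CloudPosse modules expose (module source, pin, and values are illustrative):

module "role_label" {
  source  = "cloudposse/label/null"
  version = "0.25.0" # assumed pin; recent releases expose these inputs

  # Only "name" participates in the generated ID, so the ID is exactly "my-role"
  label_order = ["name"]
  name        = "my-role"

  # namespace/stage still flow into tags without appearing in the ID
  namespace = "acme"
  stage     = "prod"
}

output "id" {
  value = module.role_label.id # "my-role"
}

output "tags" {
  value = module.role_label.tags # still carries Namespace, Stage, Name
}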

1
Taimur Gibson avatar
Taimur Gibson

Hi all, I’m trying to create some custom IAM policies through terraform. I don’t see a dedicated iam-policy component, but it looks like it might be doable through the iam-role component? I don’t quite understand how to use the policy_documents variable though. Can anyone shed some light on this? https://github.com/cloudposse/terraform-aws-components/blob/main/modules/iam-role/README.md#input_policy_documents

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

@Dan Miller (Cloud Posse) can maybe shed some light on this one

Dan Miller (Cloud Posse) avatar
Dan Miller (Cloud Posse)

In our reference architecture we don’t use an IAM policy component. What we have is a little bit different. In the simple use case, when we need some policy for a particular component, we just include that policy with the given component directly.

However, in the more complex use case where we need an advanced policy or role, we use our AWS Team and Team Roles design. This is part of our reference architecture, but in short, Team Roles are IAM roles that are deployed to any number of accounts and grant permissions there.

What you see with the iam-role component is an exception. We had a customer with a specific requirement that went against our recommendation. If you need a role, it would be more consistent with our design to use the aws-team-roles component instead.

Dan Miller (Cloud Posse) avatar
Dan Miller (Cloud Posse)

What is the use case for an IAM policy + role that you’re considering? perhaps I can give a better idea of what we’d recommend

1
Taimur Gibson avatar
Taimur Gibson

I have an ECS service that I would like to grant access to an S3 bucket

Taimur Gibson avatar
Taimur Gibson

so I see where I can add policies to the task_role and have done that successfully with a policy that I click-ops’d

Taimur Gibson avatar
Taimur Gibson

but I would like to be able to define IAM policies in terraform so I can assign them to the roles for ECS/EC2/etc

Gabriela Campana (Cloud Posse) avatar
Gabriela Campana (Cloud Posse)

@Dan Miller (Cloud Posse)

Dan Miller (Cloud Posse) avatar
Dan Miller (Cloud Posse)

In that case I would create the policy with the ecs-service component. You can use iam_policy_statements and define the policy in YAML with the component catalog

variable "iam_policy_statements" {
  type        = any
  description = "Map of IAM policy statements to use in the policy. This can be used with or instead of the `var.iam_source_json_url`."
  default     = {}
}
Taimur Gibson avatar
Taimur Gibson

@Dan Miller (Cloud Posse) thanks for this! Can I see some examples of how this is used? I tried inserting a JSON policy with this but wasn’t able to get it working properly

Dan Miller (Cloud Posse) avatar
Dan Miller (Cloud Posse)

How about something like this? Since it’s in the YAML file you have to either convert it to YAML or inline the JSON

vars:
  iam_policy_statements:
    ListMyBucket:
      effect: "Allow"
      actions: 
        - "s3:ListBucket"
      resources:
        - "arn:aws:s3:::test"
      conditions:
    WriteMyBucket:
      effect: "Allow"
      actions: 
        - "s3:PutObject"
        - "s3:GetObject"
        - "s3:DeleteObject"
      resources:
        - "arn:aws:s3:::test/*"
      conditions:
1

2024-03-20

Alex Atkinson avatar
Alex Atkinson

Does anyone have a link to a list of new features available in OpenTofu since its fork?

Alex Atkinson avatar
Alex Atkinson

More concisely maintained than their release notes.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

also try opentofu

aj_baller23 avatar
aj_baller23

I’m new to Terraform and wanted to get some feedback on the best way of dealing with passwords in Terraform files. Hypothetical case… We generated an API key from a third-party service. We want to add the API key to AWS Secrets Manager so that our services are able to use it. How would I go about getting the secret into AWS Secrets Manager without committing the secret in plain text in my Terraform file? Thanks in advance

Gabriela Campana (Cloud Posse) avatar
Gabriela Campana (Cloud Posse)

@Dan Miller (Cloud Posse)

Adi avatar

I would create KMS-encrypted secrets and add those to the tf files

Dan Miller (Cloud Posse) avatar
Dan Miller (Cloud Posse)

I prefer to use AWS SSM parameter store for secrets personally, but SOPS and AWS Secret Manager are both great as well
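
One hedged sketch of that flow: write the secret to SSM Parameter Store out of band so it never appears in the Terraform code, then reference it (the parameter and secret names are hypothetical; note the value still ends up in Terraform state, so the backend must be secured):

# Write the secret once, outside of Terraform, e.g.:
#   aws ssm put-parameter --name /myapp/api-key --type SecureString --value "$API_KEY"

data "aws_ssm_parameter" "api_key" {
  name = "/myapp/api-key" # hypothetical path
}

resource "aws_secretsmanager_secret" "api_key" {
  name = "myapp/api-key"
}

resource "aws_secretsmanager_secret_version" "api_key" {
  secret_id     = aws_secretsmanager_secret.api_key.id
  secret_string = data.aws_ssm_parameter.api_key.value # read at plan time, never committed to VCS
}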

Moti avatar

well depends on the secret

Release notes from terraform avatar
Release notes from terraform
09:03:32 PM

v1.8.0-rc1 1.8.0-rc1 (March 20 2024) If you are upgrading from Terraform v1.7 or earlier, please refer to the Terraform v1.8 Upgrade Guide. NEW FEATURES:

Providers can now offer functions which can be used from within the Terraform configuration language. The syntax for calling a provider-contributed function is provider::function_name().

Release v1.8.0-rc1 · hashicorp/terraform

2024-03-21

Michael avatar
Michael

Has anyone heard when Terraform stacks will go GA?

Marty Haught avatar
Marty Haught

They haven’t set a date yet. I would expect that Oct would be the earliest it could possibly be.

Michael avatar
Michael

Wow, I was hoping it would be sooner but that’s wild

2024-03-22

Jonas Mellquist avatar
Jonas Mellquist

When using modules in Terraform where would I add the lifecycle ignore_changes meta-argument? For resources it’s pretty simple - https://developer.hashicorp.com/terraform/language/meta-arguments/lifecycle#ignore_changes

A bit of searching around led me to this Terraform issue https://github.com/hashicorp/terraform/issues/27360 Seems like it cannot be done.. :disappointed:

I’m using https://github.com/cloudposse/terraform-aws-cloudfront-s3-cdn and one of the created buckets has since had its bucket policy changed because of a migration of contents into it.. While this policy is temporary it’s not something Terraform should remove/reverse…

Already tried https://github.com/cloudposse/terraform-aws-cloudfront-s3-cdn?tab=readme-ov-file#input_override_origin_bucket_policy set to false, but it doesn’t change anything.. And I want the module to be in charge of creating the bucket..

Any workarounds, tips, comments?

I guess another approach is to try https://github.com/cloudposse/terraform-aws-cloudfront-s3-cdn?tab=readme-ov-file#input_additional_bucket_policy

Dan Miller (Cloud Posse) avatar
Dan Miller (Cloud Posse)

as you pointed out, it’s not supported natively in modules. We could handle lifecycle rules as an input and then create different resources conditionally based on the enabled lifecycle setting.

However, that gets very extreme, as you can see with this module: https://github.com/cloudposse/terraform-aws-ecs-alb-service-task/blob/main/main.tf
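
A minimal sketch of that pattern, assuming a module-level flag that selects between two otherwise-identical resources (lifecycle arguments must be literal, hence the duplication; names are illustrative, not the module’s actual code):

variable "ignore_policy_changes" {
  type    = bool
  default = false
}

resource "aws_s3_bucket_policy" "default" {
  count  = var.ignore_policy_changes ? 0 : 1
  bucket = aws_s3_bucket.origin.id # assumes a bucket declared elsewhere in the module
  policy = data.aws_iam_policy_document.origin.json
}

resource "aws_s3_bucket_policy" "ignore_changes" {
  count  = var.ignore_policy_changes ? 1 : 0
  bucket = aws_s3_bucket.origin.id
  policy = data.aws_iam_policy_document.origin.json

  lifecycle {
    ignore_changes = [policy] # changes made outside Terraform are left alone
  }
}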

Dan Miller (Cloud Posse) avatar
Dan Miller (Cloud Posse)

As for your use case with terraform-aws-cloudfront-s3-cdn, would you be able to update your input bucket policy to match what you’ve changed?

Dan Miller (Cloud Posse) avatar
Dan Miller (Cloud Posse)

Or if you want to use override_origin_bucket_policy but you already deploy the bucket policy initially with Terraform, then you could set override_origin_bucket_policy to true and then remove the old policy from Terraform state

Dan Miller (Cloud Posse) avatar
Dan Miller (Cloud Posse)
terraform state rm module.your_module_name.aws_s3_bucket_policy.default
terraform plan
aj_baller23 avatar
aj_baller23

does anyone know how I can delete the auto-generated domain created by AWS when deploying a Cognito resource? I’m trying to add a custom domain to the Cognito resource, and my terraform plan fails because it needs to remove the auto-generated domain first before making the change. I’m not 100% sure how to manage or reference the domain that is auto-generated by AWS when deploying the Cognito resource.

resource "aws_cognito_user_pool_domain" "ee_domain" {
   domain =  "${var.login_url}"
   user_pool_id = aws_cognito_user_pool.ee_user_pool.id
}
terraform apply
aws_cognito_user_pool.ee_user_pool: Destroying... [id=us-east-dafda]
╷
│ Error: deleting Cognito user pool (us-east-1dafdasfa): InvalidParameterException: User pool cannot be deleted. It has a domain configured that should be deleted first.
│ 
│ 
╵

Exited with code exit status 1
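
One hedged way out, assuming the existing prefix domain was created outside Terraform: delete it with the AWS CLI (or import it into Terraform state) before applying. The domain prefix and pool ID below are placeholders:

aws cognito-idp delete-user-pool-domain \
  --domain your-existing-prefix \
  --user-pool-id us-east-1_XXXXXXXXX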

2024-03-23

2024-03-25

Alex Goldstone avatar
Alex Goldstone

Hi All… I am bootstrapping a fresh AWS environment with the intention to do as much via Terraform as possible. I am aware of higher-level tooling such as Terragrunt and Atmos, but I am attempting to gain a good understanding of how everything is held together, so I'm holding off on these for now.

I have created a fresh AWS root account and enabled IAM Identity Centre via the AWS console… this automatically enabled AWS Organisations, so my Org structure is Root (OU) -> Management (Account).

I have manually created an IAM user called terraform_bootstrap via the Console, and I believe having the management account and IAM user is enough to do everything else in Terraform.

It is my intention that the first Terraform project runs using the terraform_bootstrap user credentials to create the subsequent OUs (e.g. Core) and accounts (e.g. Identity), and then I guess each one of these accounts would have its own Terraform project.

I am aiming for a single S3 bucket to store Terraform state… I’ve seen debate about whether this is a good idea, but Cloud Posse seem OK with it… and from what I can tell I can always create a policy to restrict access to specific state files for different sub-sets of users.

Where does this S3 bucket for state files live (i.e. under which account is it created)?

I have seen a suggestion in the #aws channel that Cloud Posse just create the S3 bucket under the root account. Based on the Org structure above I am not sure what that means (perhaps the naming conventions are outdated)… Is it good practice to create the S3 bucket under the Management Account and have all the subsequent Terraform projects for sub-accounts store their state there? Is there a security downside to storing it in the Management Account?

Gabriela Campana (Cloud Posse) avatar
Gabriela Campana (Cloud Posse)

@Jeremy White (Cloud Posse)

Alex Goldstone avatar
Alex Goldstone

Was there a better channel for me to ask this in?

Gabriela Campana (Cloud Posse) avatar
Gabriela Campana (Cloud Posse)

@Ben Smith (Cloud Posse)

Gabriela Campana (Cloud Posse) avatar
Gabriela Campana (Cloud Posse)

I think you asked on the right channel

Ben Smith (Cloud Posse) avatar
Ben Smith (Cloud Posse)

We call it the root account (your Management account) under an OU we call Core (your Root)

Our structure thus looks like

Core/
  - Root
  - Security
  - Auto
Platform/
  - Dev
  - Staging
  - Prod

Yours can look similar with different names; it depends on your needs and how you want to name things.

We recommend the Terraform state bucket live in your management account (core-root, or root-management in your naming).
Is it good practice to create the S3 bucket under the Management Account and have all the subsequent Terraform projects for sub-accounts store their state there? Is there a security downside to storing in the Management Account?
So there are pros and cons to having the state bucket live in one account vs. many.

Essentially, the reason we recommend one state bucket is that usually you chose Terraform so that your Terraform components can know about each other and reference state; with one bucket it becomes a lot less complex to maintain.

The times we would recommend hierarchical state buckets are when you have extremely tight security requirements and, usually, a big team that requires segmented access.

Ben Smith (Cloud Posse) avatar
Ben Smith (Cloud Posse)

https://github.com/cloudposse/terraform-aws-components/tree/main/modules/tfstate-backend is how we recommend (and ourselves) deploying the tfstate backend; it integrates with our aws-teams and aws-team-roles components for user-based access

Alex Goldstone avatar
Alex Goldstone

Thank you @Ben Smith (Cloud Posse). I think the confusion is that AWS created my ‘root’ and ‘management account’ when I let it automatically enable ‘AWS Organisations’ when I activated IAM Identity Centre.

So, if I understand correctly, the account in which you create the S3 bucket is not the AWS-created management account (shown in the screenshot), but you create a separate OU and account outside of the actual AWS management account?

Or did I misunderstand, and what AWS created for me is the same as what you have but under different names, and the ‘AWS management account’ is the management account in which you create the S3 bucket? People seem to say not to store stuff in the AWS management account, hence my confusion.

Ben Smith (Cloud Posse) avatar
Ben Smith (Cloud Posse)


what AWS created for me is the same as what you have but under different names
this

Usually what we do is get a root account with no AWS Org set up (so a single account). We then deploy the tfstate-backend component (linked above), which will generate an S3 bucket and DynamoDB lock table in your management account.

Then we deploy the accounts component [link], which will create and manage your Org for you (beware: AWS has a 10-account limit by default, but you can raise it with a quota increase).

So essentially this then sets up your various accounts to suit your different needs. We usually have:

core/
 - root - this is the management account, it only has accounts and tfstatebackend deployed to it
 - identity - aws-teams and aws-team-roles which define how your users login
 - auto - automation and overhead tooling
 - network - transit gateways, dns how things connect
platform/
 - dev/staging/sandbox/prod - where your apps live

What you’ve got is correct, just named differently than our typical setup - though there’s nothing wrong with that; just keep a mental note of how we will generally refer to it.

Alex Goldstone avatar
Alex Goldstone

Brilliant, thank you… Given the sensitivity around use of the management account (link), I was not sure I should be storing state there, and the difference in names added to the confusion, but what you have laid out makes perfect sense - thank you.

Taimur Gibson avatar
Taimur Gibson

Hello, another ECS terraform question

I’m trying to create a new ECS cluster, and it’s mostly working, but it’s getting stuck when trying to create the S3 buckets for the ALB access logs

Taimur Gibson avatar
Taimur Gibson
│ Error: creating S3 Bucket (ORGNAME-plat-use1-sandbox-CLUSTERNAME-private-alb-access-logs) ACL: operation error S3: PutBucketAcl, https response error StatusCode: 400, RequestID: TPTA013GT569ETAX, HostID: N6C6f+Limdwsdty86tsEnIKUGmQ212aT4okjJnN10GcQaG/NlfUOkHCplpsu3caLOHeCo=, api error AccessControlListNotSupported: The bucket does not allow ACLs
│ 
│   with module.alb["private"].module.access_logs.module.s3_bucket.module.aws_s3_bucket.aws_s3_bucket_acl.default[0],
│   on .terraform/modules/alb.access_logs.s3_bucket.aws_s3_bucket/main.tf line 148, in resource "aws_s3_bucket_acl" "default":
│  148: resource "aws_s3_bucket_acl" "default" {
│ 
╵
╷
│ Error: creating S3 Bucket (ORGNAME-plat-use1-sandbox-CLUSTERNAME-public-alb-access-logs) ACL: operation error S3: PutBucketAcl, https response error StatusCode: 400, RequestID: TPTDW2SVGHQWBJ9P, HostID: C5vrCU0oWsnSNy/hlDREWtFYp+/zMKd2VXatYruvJ/opNd3FrMtXRtJklQ/drOyr28=, api error AccessControlListNotSupported: The bucket does not allow ACLs
│ 
│   with module.alb["public"].module.access_logs.module.s3_bucket.module.aws_s3_bucket.aws_s3_bucket_acl.default[0],
│   on .terraform/modules/alb.access_logs.s3_bucket.aws_s3_bucket/main.tf line 148, in resource "aws_s3_bucket_acl" "default":
│  148: resource "aws_s3_bucket_acl" "default" {
│ 
╵
exit status 1
Taimur Gibson avatar
Taimur Gibson

if I run the terraform deploy again, it’s able to finish creating the remaining resources and works properly

Taimur Gibson avatar
Taimur Gibson

the ALB creates correctly, and the ACLs get set on the corresponding S3 buckets

Taimur Gibson avatar
Taimur Gibson

so we have a workaround, but it would be great to figure out why it’s failing to set ACLs the first time around

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Looks like you are using the refarch ecs components?

Moti avatar

make sure there’s no acl = private or something of that sort in your code.

Moti avatar

https://aws.amazon.com/blogs/aws/heads-up-amazon-s3-security-changes-are-coming-in-april-of-2023/

all newly created buckets in the Region will by default have S3 Block Public Access enabled and access control lists (ACLs) disabled
Heads-Up: Amazon S3 Security Changes Are Coming in April of 2023 | Amazon Web Services

Update (4/27/2023): Amazon S3 now automatically enables S3 Block Public Access and disables S3 access control lists (ACLs) for all new S3 buckets in all AWS Regions. Starting in April of 2023 we will be making two changes to Amazon Simple Storage Service (Amazon S3) to put our latest best practices for bucket security into […]
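
Following Moti’s point, a sketch of the ordering modules typically need after that S3 change: re-enable ACLs via bucket ownership controls before applying the ACL (resource names are illustrative, not necessarily the component’s actual code):

resource "aws_s3_bucket_ownership_controls" "logs" {
  bucket = aws_s3_bucket.logs.id # assumes the log bucket is declared elsewhere
  rule {
    object_ownership = "BucketOwnerPreferred" # permits ACLs again
  }
}

resource "aws_s3_bucket_acl" "logs" {
  bucket = aws_s3_bucket.logs.id
  acl    = "log-delivery-write"

  # the ACL can only be applied once ownership controls permit ACLs
  depends_on = [aws_s3_bucket_ownership_controls.logs]
}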

Taimur Gibson avatar
Taimur Gibson

yeah we’re using the refarch ecs component

Taimur Gibson avatar
Taimur Gibson

we haven’t changed anything about the ACL settings from the reference architecture

Taimur Gibson avatar
Taimur Gibson

it looks like the default behavior of the Terraform component is to try to set the ACL

Taimur Gibson avatar
Taimur Gibson

actually looks like this was fixed in a terraform component version update. we were on 1.388.0 and updating to 1.419.0 worked

1
Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

@Jeremy White (Cloud Posse) @Dan Miller (Cloud Posse)

Dan Miller (Cloud Posse) avatar
Dan Miller (Cloud Posse)


actually looks like this was fixed in a terraform component version update. we were on 1.388.0 and updating to 1.419.0 worked
great! I believe this was a fix in the terraform-aws-alb module that’s included in that component

cloudposse/terraform-aws-alb

Terraform module to provision a standard ALB for HTTP/HTTP traffic

Dan Miller (Cloud Posse) avatar
Dan Miller (Cloud Posse)

also as a general tip since you’re using our reference architecture, you should get a quicker response for reference architecture specific questions with the refarch channel (around 100 members). Whereas the #terraform channel is intended for anything related to Terraform as a topic (around 8000 members)

1

2024-03-26

AdamP avatar

Hey folks, anyone using the AWS EKS cluster module v4.0.0? I can’t seem to get Terraform to pick up my access_entry_map… terraform plan never notices it or any changes I make to it. Pretty simple setup:

module "eks_cluster" {
  source  = "cloudposse/eks-cluster/aws"
  version = "4.0.0"
  // <https://github.com/cloudposse/terraform-aws-eks-cluster>

  name      = var.name
  namespace = var.namespace
  region    = var.region
  stage     = var.stage

  cluster_encryption_config_enabled                     = true
  cluster_encryption_config_kms_key_enable_key_rotation = true
  oidc_provider_enabled                                 = true

  access_config             = var.access_config
  access_entry_map          = var.access_entry_map
  addons                    = var.addons
  addons_depends_on         = [module.eks_node_group_main, module.eks_node_group_secondary]
  endpoint_private_access   = var.endpoint_private_access
  endpoint_public_access    = var.endpoint_public_access
  enabled_cluster_log_types = var.enabled_cluster_log_types
  kubernetes_version        = var.kubenetes_version
  public_access_cidrs       = var.public_access_cidrs
  subnet_ids                = module.subnets.public_subnet_ids

  tags = var.tags
}

sandbox.tfvar:

..
..
..
access_entry_map = {
  (data.aws_iam_session_context.current.issuer_arn) = {
    access_policy_associations = {
      ClusterAdmin = {}
    }
  }
}
..
..

my var.access_config:

variable "access_config" {
  description = "Access configuration for the EKS cluster."
  type = object({
    authentication_mode = string
    bootstrap_cluster_creator_admin_permissions = bool
  })
  default =  {
    authentication_mode = "API"
    bootstrap_cluster_creator_admin_permissions = false  
  }
}

so weird, I’ll keep at it and see what i’m missing, felt like posting in here as I may be missing something obvious

1
AdamP avatar

hang tight, I think it’s my Jenkins pipeline. I’ll let you all know what I find out; it’s gotta be on my end now that I saw some weirdness when I ran my build pipeline

AdamP avatar

ok, it was my pipeline, and I also see that I can’t use a data source in my tfvars. Rookie mistakes here... nothing to see, carry on
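
For reference, tfvars files can only contain static values, so a data-source reference there never resolves; a minimal sketch of the usual workaround, building the map in the root module instead (data sources assumed declared in the root module):

data "aws_caller_identity" "current" {}

data "aws_iam_session_context" "current" {
  arn = data.aws_caller_identity.current.arn
}

locals {
  access_entry_map = {
    (data.aws_iam_session_context.current.issuer_arn) = {
      access_policy_associations = {
        ClusterAdmin = {}
      }
    }
  }
}

# then pass local.access_entry_map to the module instead of var.access_entry_map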

Chris avatar

has anyone who is using ECS clusters successfully deployed a second cluster without messing up the ACM certificates and existing clusters?
• we now have three (3) certificates in ACM, and two (2) of them contain duplicates for <environment>.<stage>.<tenant>.<domain>.<tld>
• however, now our envs can no longer deploy to the platform cluster
very strange, def think we’re missing something, and could not find any documentation about it

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

@Dan Miller (Cloud Posse) @Ben Smith (Cloud Posse)

Dan Miller (Cloud Posse) avatar
Dan Miller (Cloud Posse)

Is this referring to the ecs component with our reference architecture? If so, then the ACM cert should be created with the acm component, not with the ecs component. You might have a specific component called ecs/platform/acm (likely defined in your stack catalog, maybe in stacks/catalog/ecs/clusters/default.yaml) that creates the certs. You will need a unique domain for each ~cluster~ service you want to deploy. You can do this with a different name or set of attributes

I can create a quick example if that sounds right

Chris avatar

I think that sounds right, could you post a quick example, I would like to compare what I have vs what may be ideal/BP

Dan Miller (Cloud Posse) avatar
Dan Miller (Cloud Posse)

I put together an example but it quickly got out of hand. So I put it into this google doc. Let me know if this makes sense to you: https://docs.google.com/document/d/1I6QM_f10T-xG3z1Hyef6t6oXNQqIzANecnTttdGtHzo/edit?usp=sharing

Also I recently updated our setup steps for ECS as part of our reference arch. docs. These may be helpful to skim https://docs.cloudposse.com/reference-architecture/setup/ecs/

Dan Miller (Cloud Posse) avatar
Dan Miller (Cloud Posse)

Also please let me know if this solves your issue or if you have any other questions about it, because then I can upstream it to our website for future reference for others as well

Chris avatar

Thanks @Dan Miller (Cloud Posse) reviewing them now

2024-03-27

maarten avatar
maarten

Back from a very long winter break :wave: hi everyone. Is AWS Control Tower Account Factory for Terraform (AFT) the preferred way to bootstrap AWS accounts now, or is there something better?

loren avatar

current engagement is using Spacelift as an “account factory”. feels a lot cleaner and more transparent than AFT

maarten avatar
maarten

Hi Loren, I can’t use external dependencies with this one. Is AFT too cumbersome for something like cross-account roles, OIDC roles, state bucket / DDB? What transparency is it lacking?

loren avatar

Yeah understood. AFT just feels like a black box to me. An overly complicated, typical AWS solution architecture. I wouldn’t want to sign up to troubleshoot failures.

loren avatar

If I have a module for the resources I want in the new account, then it’s easy enough to have a directory per account with a simple terraform config that uses that module. And a template to simplify new account creation. Each new account, copy the template, update tfvars, deploy with ci/cd

maarten avatar
maarten

Right, but which state bucket and role to use?

loren avatar

i don’t think AFT solves that either

maarten avatar
maarten

What do you think of using a StackSet with a bit of CloudFormation, using the default cross-account roles within the org?

loren avatar

i use terragrunt for that part. and a lambda in the management account to create an “automation” role in every new account. account creation then is just a single command:

aws organizations create-account --email "$email" --account-name "$name" --iam-user-access-to-billing ALLOW --role-name "$account_access_role"
loren avatar

but yeah, a stackset works to create the “automation” role also

maarten avatar
maarten

I meant using the role that AWS Organizations creates, then creating the specific (least-privilege) roles and the rest with the stack set.

loren avatar

the stackset doesn’t need to “use” another role, i don’t think. if you’re comfortable using the role aws organizations creates, and having your automation run from the organization account to assume that role, then your terraform config can create all extra roles no problem

maarten avatar
maarten

I’ll dive in, thanks !

Jake Lundberg (HashiCorp) avatar
Jake Lundberg (HashiCorp)

The important thing to know about AFT is that it was a co-development effort between AWS and HashiCorp. It originated with the AWS Professional Service team and was taken over as an actual product delivered by the Service Catalogue/Control Tower team. The reason Control Tower now has APIs is because of this project/product.

Generally speaking, if folks are going to use TF for Landing Zones, this is the suggested method from both AWS and HashiCorp.

Now, all that said, we have had some customers that have had to do some larger customizations to integrate with existing automation, generally through their pipelines. I’m actually trying to get one of them to talk about this at HashiConf this year.

2

2024-03-28

toka avatar

Hey folks, is anyone using a private Terraform module registry of some kind for module versioning, versus using a git tag ref? I wonder whether there are any pros and cons to switching to a private registry at some point.

Monish Devendran avatar
Monish Devendran

Can someone help me,

I’m trying to pass a secret which is stored in Akeyless:

data "akeyless_secret" "secret" {
  path = "/GCP/Secrets/cf-triggers/tf-cf-triggers"
}

provider "google" {
  project     = "cf-triggers"
  credentials = data.akeyless_secret.secret
}

resource "google_pubsub_topic" "example" {
  name                       = "akeyless_topic"
  message_retention_duration = "86600s"
}

❯ terraform apply
data.akeyless_secret.secret: Reading...
data.akeyless_secret.secret: Read complete after 1s [id=/GCP/Secrets/cf-triggers/tf-cf-triggers]

Planning failed. Terraform encountered an error while generating this plan.

╷
│ Error: Incorrect attribute value type
│ 
│   on main.tf line 38, in provider "google":
│   38:   credentials = data.akeyless_secret.secret
│     ├────────────────
│     │ data.akeyless_secret.secret is object with 4 attributes
│ 
│ Inappropriate value for attribute "credentials": string required.
Monish Devendran avatar
Monish Devendran

Is there a way to resolve this?
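
The error says the provider argument must be a string while the data source returns an object, so the fix is likely to reference the secret’s string attribute; a sketch, assuming the Akeyless data source exposes the secret text as value (unverified; check the provider docs):

provider "google" {
  project = "cf-triggers"
  # pass the secret's string value, not the whole data source object
  credentials = data.akeyless_secret.secret.value
}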

V.S avatar

Been working with TF a few months; I had a few questions/discussion points to get into based on an SO post. I’ve been getting a lot of 403 and 409 errors using Terraform on GCP (even on a fresh project), where I have to manually enable an API upon apply and manually delete a resource upon destroy, not always related to a service that may be dependent on or instantiated by a child resource (like SQL and then a DB). I read that there is a Terraform resource definition called “google_project_service” that allows auto-enabling an API service. This is documented at google_project_service. Apparently a resource "google_project_service" "project" block can take only one service argument, so I would have to loop over a list of services (a sketch of such a loop follows the snippet below). Will this resolve the issue? I have yet to try the loop, or the other suggestions. Has anyone had this issue and resolved it?

resource "google_project_service" "project" {
  project = "your-project-id"
  service = "iam.googleapis.com"
  timeouts {
    create = "30m"
    update = "40m"
  }

  disable_dependent_services = true
  disable_on_destroy = true
}
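
A sketch of the loop mentioned above, using for_each over a set of service names (the service list is illustrative):

locals {
  project_services = [
    "iam.googleapis.com",
    "sqladmin.googleapis.com",
    "run.googleapis.com",
  ]
}

resource "google_project_service" "enabled" {
  for_each = toset(local.project_services)

  project = "your-project-id"
  service = each.value

  disable_dependent_services = true
  disable_on_destroy         = true
}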

Below is what I use now, but I have to manually enable some APIs and manually destroy some resources, and not just the ones with dependants. When I added disable_on_destroy I got fewer destroy errors. Getting these errors isn’t an issue unless provisioned resources are not terminated, which most times isn’t the case, but it has been the case a few times, leaving chargeable cloud resources running.

resource "google_project_service" "iam" {
  service            = "iam.googleapis.com"
  # disable_on_destroy = true
}
Can I automatically enable APIs when using GCP cloud with terraform?

I am very new to GCP with terraform and I want to deploy all my modules using centralized tools.

Is there any way to remove the step of enabling google API’s every time so that deployment is not

Monish Devendran avatar
Monish Devendran

did you authenticate?

Monish Devendran avatar
Monish Devendran

have a separate module to enable the APIs, and apply it first

Monish Devendran avatar
Monish Devendran

for example, if you are working with Cloud Functions gen2, some prerequisite APIs need to be enabled first, e.g. cloudbuild, artifactregistry, eventarc, etc.

V.S avatar

Yes, with a service account JSON key, which is in the credentials field in variable.tf and then used via file(var.credentials). It wouldn’t provision without authentication, but you could be right that other permissions may be needed. However, not all resources are affected by these errors.

Monish Devendran avatar
Monish Devendran

have an org-level project and make enabling APIs for each project a prerequisite step

Monish Devendran avatar
Monish Devendran

i would say isolate that api enabling separately

Monish Devendran avatar
Monish Devendran

as that would be a one-time process

V.S avatar

Yes this is what was recommended in the SO post, splitting the TF.

Monish Devendran avatar
Monish Devendran

correct

V.S avatar

I will try it, but unfortunately I won’t have time in the project right now to deep-dive, so I’ll try a workaround, which is what most OSS fixes seem to be

V.S avatar

I know some people don’t recommend auto-enabling APIs, but from a TF automation standpoint, not doing so seems counterintuitive.

Monish Devendran avatar
Monish Devendran

have something like this

org/{projects}/{project_id}.yaml

project:
  - 'your-project-id':
      apis:
        - 'cloudfunctions'
        - 'bigquery'
        - ...
Monish Devendran avatar
Monish Devendran

In our company we have something like this

Monish Devendran avatar
Monish Devendran

we raise a PR to the public_cloud team, it enables the API for that project, and then we can focus on the tf code

V.S avatar

Yes, and having disable_on_destroy and disable_dependent_services set is important

V.S avatar

I toggled the “run.googleapis.com/cpu-throttling” = true option to move off always-on, time-based charging and bring down costs by using the request-based option instead. It switches between “CPU is always allocated” and “CPU is only allocated during request processing”. As intended, I got it to work here and there, but most of the time the service becomes idle, and then when my app makes a request, the app just hangs. I have to keep CPU always allocated to get consistent workloads. I think this has something to do with cold starts and the 15-minute timeout after requests, but it seems the max is 60 minutes. Basically, to take advantage of the lower cost of “CPU is only allocated during request processing”, you can only work within a 60-minute window. Is there any other way to get around this 60-minute max timeout?
