#terraform (2024-05)

terraform Discussions related to Terraform or Terraform Modules

Archive: https://archive.sweetops.com/terraform/

2024-05-01

Pradeepvarma Senguttuvan avatar
Pradeepvarma Senguttuvan

Hi Team, I am trying to create a read replica for DocumentDB with a different instance class than the primary

module "documentdb_cluster" {
  source                          = "cloudposse/documentdb-cluster/aws"

Since instance_class is a string, I cannot have a different instance class for my read replica. Any suggestions on this? How do I get a different instance class for my replica? (edited)

Release notes from terraform avatar
Release notes from terraform
12:23:30 AM

v1.9.0-alpha20240501 1.9.0-alpha20240501 (May 1, 2024) ENHANCEMENTS:

terraform console: Now has basic support for multi-line input in interactive mode. (#34822) If an entered line contains opening parentheses/etc that are not closed, Terraform will await another line of input to complete the expression. This initial implementation is primarily intended…

Release v1.9.0-alpha20240501 · hashicorp/terraform

1.9.0-alpha20240501 (May 1, 2024) ENHANCEMENTS:

terraform console: Now has basic support for multi-line input in interactive mode. (#34822) If an entered line contains opening parentheses/etc that…

terraform console: Multi-line entry support by apparentlymart · Pull Request #34822 · hashicorp/terraform

The console command, when running in interactive mode, will now detect if the input seems to be an incomplete (but valid enough so far) expression, and if so will produce another prompt to accept a…

2024-05-02

loren avatar

Super cool experiment in the 1.9 alpha release, they’re looking for feedback if you want to give it a go… https://discuss.hashicorp.com/t/experiment-feedback-input-variable-validation-can-cross-reference-other-objects/66644

Experiment Feedback: Input Variable Validation can cross-reference other objects

Hi everyone, In yesterday’s Terraform CLI v1.9.0-alpha20240501 there is an experimental implementation of the long-requested feature of allowing input variable validation rules to refer to other values in the same module as the variable declaration. For example, it allows the validation rule of one variable to refer to another: terraform { # This experiment opt-in will be required as long # as this remains experimental. If the experiment # is successful then this won’t be needed in the …
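
As a quick illustration of what the experiment enables (the opt-in keyword is taken from the linked announcement; the variable names are invented for the example):

terraform {
  # Opt-in required while the feature remains experimental,
  # per the linked announcement
  experiments = [variable_validation_crossref]
}

variable "min_instances" {
  type = number
}

variable "max_instances" {
  type = number

  validation {
    # The new capability: a validation rule referring to another variable
    condition     = var.max_instances >= var.min_instances
    error_message = "max_instances must be at least min_instances."
  }
}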

2024-05-03

dinodam avatar
dinodam

:wave: Hello, team!

I am having a slight issue with your terraform-aws-lambda-elasticsearch-cleanup module. It used to work fine, but since I upgraded the TF AWS provider from 4.20.1 to 5.47.0 and bumped the pinned module version from 0.12.3 to 0.14.0, I am getting the following error. I am using Terraform version 1.8.2.

 Error: External Program Execution Failed
│ 
│   with module.lambda-elasticsearch-cleanup.module.artifact.data.external.curl[0],
│   on .terraform/modules/lambda-elasticsearch-cleanup.artifact/main.tf line 3, in data "external" "curl":
│    3:   program    = concat(["curl"], var.curl_arguments, ["--write-out", "{\"success\": \"true\", \"filename_effective\": \"%%{filename_effective}\"}", "-o", local.output_file, local.url])
│ 
│ The data source received an unexpected error while attempting to execute
│ the program.
│ 
│ Program: /usr/bin/curl
│ Error Message: curl: (22) The requested URL returned error: 404
│ 
│ State: exit status 22

Have I missed an upgrade step somewhere or is there an issue with the file?

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Hrmmm… I have a theory. We recently rolled out some new workflows, maybe the artifact wasn’t produced.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

@Igor Rodionov can you take a look?

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Essentially, what the module does is a clever hack to download the artifact from S3 based on the commit SHA of the module version you are pulling

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

For some reason, the commit SHA corresponding to that module version does not exist, which leads me to believe there’s a problem with the artifact and something wrong with the pipeline

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

cc @Gabriela Campana (Cloud Posse)

dinodam avatar
dinodam

Thanks for getting back to me. Would this not affect all users then?

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Yep

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

….all users using it at that version

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

For now, try an older version

dinodam avatar
dinodam

ok… standby

dinodam avatar
dinodam

The issue with an older version of the module is that AWS TF provider 5 has deprecated some calls:

source_json and override_json

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

I gotcha… so this is a problem, especially if you manage terraform-aws-lambda-elasticsearch-cleanup in the same lifecycle as, say, your Elasticsearch cluster. While that’s not unreasonable, and probably what most people are doing, it’s an example of why we like to break root modules out by lifecycle, reducing the tight coupling and dependencies on provider versions. That said, I totally get why this is a problem; just explaining why we (Cloud Posse) are less affected by these types of changes.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)
Atmos Components | atmos

Components are opinionated building blocks of infrastructure as code that solve one specific problem or use-case.

dinodam avatar
dinodam

Makes sense. I just grouped things together that went together, ending up in this situation.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Thanks for understanding… We’ll get this fixed, just cannot commit to when that will be.

dinodam avatar
dinodam

No issues, this is just on my upgrade TF branch and not on master. So I am good for now

2024-05-04

Jeremy G (Cloud Posse) avatar
Jeremy G (Cloud Posse)

Announcement: In support of using OpenTofu, starting with Geodesic v2.11.0, we are pre-installing package repos to allow you to easily install OpenTofu in your Dockerfile.

ARG OPEN_TOFU_VERSION=1.6.2
RUN apt-get update && apt-get install -y tofu=${OPEN_TOFU_VERSION}

2024-05-05

miko avatar

Guys, is this normal behavior? In AWS EKS I upgraded my nodes from t3.medium to t3.large. Before confirming “yes” I saw that Terraform would destroy the old nodes in order to proceed with the upgrade, but I didn’t expect it to delete the volumes as well. Good thing it only happened in our testing environment. My question: is this normal behavior when I upgrade instance_types? I was hoping to be able to upgrade without affecting my persistent volumes.

Jeremy G (Cloud Posse) avatar
Jeremy G (Cloud Posse)

This is really more of a #kubernetes question, but I will take a crack at it here.

It seems to me you are confusing the EBS volumes attached to EC2 instances as root volumes, which provide ephemeral storage (e.g. emptyDir) for Kubernetes, with the EBS volumes backing PersistentVolumes. The former have lifecycles tied to the instances: when new instances are created (e.g. when the Auto Scaling Group scales up), new EBS volumes are created, and when the instances are deleted, so are their EBS volumes.

Kubernetes PersistentVolumes, which may be implemented as EBS volumes or something else, should persist until their PersistentVolumeClaims are deleted, and then only if the reclaim policy is set to “delete”.
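
A hedged sketch of one way to guard against that deletion: define a StorageClass whose volumes are retained after claim deletion, shown here with the Terraform kubernetes provider; the class name and parameters are illustrative:

resource "kubernetes_storage_class" "ebs_retain" {
  metadata {
    name = "ebs-gp3-retain" # illustrative name
  }

  storage_provisioner = "ebs.csi.aws.com"

  # "Retain" keeps the backing EBS volume even after the
  # PersistentVolumeClaim is deleted (the default is "Delete")
  reclaim_policy      = "Retain"
  volume_binding_mode = "WaitForFirstConsumer"

  parameters = {
    type = "gp3"
  }
}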

miko avatar

Thanks @Jeremy G (Cloud Posse), though I’m using a StatefulSet for my deployment (I have the EBS CSI driver set up as well), so I thought this should be using EBS volumes that, from what I understood, are independent of my EC2 lifecycle?

In my StatefulSet deployment I have defined volumeClaimTemplates that, from what I understood, should be using EBS volumes? Thank you for the answer, though. Should I post this to #kubernetes (my bad for posting here; I was using Terraform to maintain our infra) and continue the discussion there? :o

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: authentication-postgres
  labels:
    app: authentication-postgres-app
  namespace: postgres
spec:
  serviceName: authentication-postgres-svc
  replicas: 1
  selector:
    matchLabels:
      app: authentication-postgres-app
  template:
    metadata:
      labels:
        app: authentication-postgres-app
    spec:
      containers:
        - name: authentication-postgres-container
          image: postgres:16.2-bullseye
          ports:
            - containerPort: 5432
          env:
            - name: POSTGRES_DB
              value: authentication_db
            - name: POSTGRES_USER
              value: postgres
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: authentication-postgres-secret
                  key: postgres-password
          volumeMounts:
            - name: data
              mountPath: /mnt/authentication-postgres-data
      imagePullSecrets:
      - name: docker-reg-cred
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes:
          - "ReadWriteOnce"
        resources:
          requests:
            storage: "5Gi"
Jeremy G (Cloud Posse) avatar
Jeremy G (Cloud Posse)

What I’m saying is that on upgrade, some EBS volumes will get deleted and some will not. What leads you to believe that your PersistentVolumes, with active PersistentVolumeClaims, are the ones being deleted?

miko avatar

Ohhh I see! I came to this conclusion because the PostgreSQL database lost its data after I upgraded the nodes, which means it was one of those EBS volumes that unfortunately got cleared? Is there a way for me to avoid that, so that the EBS volumes are safe when I upgrade nodes?

Jeremy G (Cloud Posse) avatar
Jeremy G (Cloud Posse)

I don’t know how you deployed PostgreSQL. It seems like you deployed it to use ephemeral storage rather than dedicated PersistentVolumes, despite having a volumeClaimTemplate, but this gets into the details of your PostgreSQL deployment, and maybe Helm chart. Which is why I directed you to #kubernetes

2024-05-06

2024-05-07

Pradeepvarma Senguttuvan avatar
Pradeepvarma Senguttuvan
09:12:11 AM

any luck on this?

Hi Team, I am trying to create a read replica for DocumentDB with a different instance class than the primary

module "documentdb_cluster" {
  source                          = "cloudposse/documentdb-cluster/aws"

Since instance_class is a string, I cannot have a different instance class for my read replica. Any suggestions on this? How do I get a different instance class for my replica? (edited)
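
One possible workaround, since the module sizes all its instances with a single instance_class: attach an extra aws_docdb_cluster_instance to the module’s cluster for the differently sized replica. This is a sketch under assumptions: the module output name cluster_name is a guess, so check the module’s outputs.

# Hypothetical additional replica at a different instance class.
# module.documentdb_cluster.cluster_name is an assumed output name.
resource "aws_docdb_cluster_instance" "bigger_replica" {
  identifier         = "docdb-replica-xl" # illustrative
  cluster_identifier = module.documentdb_cluster.cluster_name
  instance_class     = "db.r6g.2xlarge" # differs from the primary
}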

Ercan Ermis avatar
Ercan Ermis

Hello all,

This is my first message here in Slack! I found a little bug in the memcached module. I opened an issue: https://github.com/cloudposse/terraform-aws-elasticache-memcached/issues/78. Can someone check it and help me send a PR? My changes are ready locally. Thanks!

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

If you are able to open a PR, post it in #pr-reviews and someone will review it promptly

Ercan Ermis avatar
Ercan Ermis

PR sent. thank you so much @Erik Osterman (Cloud Posse)

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

I don’t see one from you in #pr-reviews

Ercan Ermis avatar
Ercan Ermis

yep, because I was a little bit sleepy last night. Approvals are welcome.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)
#79 fix: elasticache_subnet_group creation

what

• If we pass elasticache_subnet_group_name, the aws_elasticache_subnet_group.default[0] won’t be created anymore

why

• Who needs a new elasticache_subnet_group when we already created one before and just want to pass a name?

references

• Check issue #78

2024-05-08

Release notes from terraform avatar
Release notes from terraform
09:13:31 AM

v1.8.3 1.8.3 (May 8, 2024) BUG FIXES:

terraform test: Providers configured within an overridden module could panic. (#35110) core: Fix crash when a provider incorrectly plans a nested object when the configuration is null (#35090)…

Don’t evaluate providers within overridden modules by jbardin · Pull Request #35110 · hashicorp/terraform

While we don’t normally encounter providers within modules, they are technically still supported, and could exist within a module which has been overridden for testing. Since the module is not bein…

core: prevent panics with null objects in nested attrs by jbardin · Pull Request #35090 · hashicorp/terraform

When descending into structural attributes, don’t try to extract attributes from null objects. Unlike with blocks, nested attributes allow the possibility of assigning null values which could be ov…

2024-05-10

Juan Pablo Lorier avatar
Juan Pablo Lorier

Hi, not sure if this is an issue, but I’m getting cycles every time I try to destroy a service, and after a lot of work I discovered that it’s related to the security groups. If I manually remove the service SG and rules, the cycles are gone. This is related to the ecs alb service module.

Julien Bonnier avatar
Julien Bonnier

I have that problem a lot when using a security group created in another backend.

ie:

Backend A: contains a security group
Backend B: uses the security group

One pattern I’ve been using to avoid problems is

Backend A: contains a security group
Backend B: attaches rules to the security group and uses those.

https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/security_group_rule

Don’t know if that helps ¯\_(ツ)_/¯

Juan Pablo Lorier avatar
Juan Pablo Lorier

thanks! I will look into it. I manage all the resources in the same backend, but I have little control over the creation/destruction, as the cloudposse modules are the ones managing the resources.

2024-05-13

susie-h avatar
susie-h

This might have been talked about in an earlier thread already, but is anyone else seeing some weird behavior in their editor within the terragrunt-cache for the terraform-null-label module download? VSCode is throwing an error in the cache folder; when I drill down it takes me to /examples/autoscalinggroup/main.tf line 28:

# terraform-null-label example used here: Set tags on everything that can be tagged
  tag_specifications {
    for_each = ["instance", "volume", "elastic-gpu", "spot-instance-request"]

with the error message “Unexpected attribute: An attribute named “for_each” is not expected here. Terraform”
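
For context, for_each is only valid on resources, modules, and dynamic blocks, so the editor complaint is expected if the file really reads as pasted; the upstream example presumably intends the dynamic form, roughly like this (the tags value is illustrative):

dynamic "tag_specifications" {
  for_each = ["instance", "volume", "elastic-gpu", "spot-instance-request"]

  content {
    resource_type = tag_specifications.value
    tags          = local.tags # illustrative; the example derives tags from the label module
  }
}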

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

maybe post a link to this message in #terragrunt

susie-h avatar
susie-h

ok, just to cross-post? or am i in the wrong channel?

2024-05-14

Veerapandian M avatar
Veerapandian M

Hi experts,

I would like to learn IaC (Terraform with Terragrunt), but I have no experience with it. If possible, please help me figure out the next steps.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Hi @Veerapandian M welcome to the community!

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Definitely feel free to ask pointed questions as you continue your journey.

Prasanna avatar
Prasanna

Hello team, I am a beginner with Terraform. I want to set up the environment specified in https://github.com/cloudposse/terraform-datadog-platform. Can someone point me to the documentation? I know it’s a basic question.

Kindly provide me the basic flow and installation/setup. I am referring to this solution so that I can customize it to read a swagger.json file and convert it to synthetic tests automatically. That’s the end goal; I want to build a solution for that.

cloudposse/terraform-datadog-platform

Terraform module to configure and provision Datadog monitors, custom RBAC roles with permissions, Datadog synthetic tests, Datadog child organizations, and other Datadog resources from a YAML configuration, complete with automated tests.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

@Jeremy White (Cloud Posse)

Jeremy White (Cloud Posse) avatar
Jeremy White (Cloud Posse)

If you just want to get synthetics up and running, I think you can just copy the synthetics example and adjust it to use your own datadog api endpoints. After that, just start creating synthetics similar to what you see in the synthetics catalog within the same example

2024-05-15

Sergio avatar

Hello everyone! I hope you’re all doing well. I’m currently facing an issue creating simple infrastructure using Terragrunt as a wrapper for Terraform. The goal is to create 2 subnetworks in 2 different zones and 3 VMs in each subnetwork. The subnetworks are created without issues; the problem arises when I try to create the 3 VMs in each subnetwork.

The project has the following structure:

├── environmentsLive
│   └── dev
│       ├── net
│       │   └── terragrunt.hcl
│       ├── vms
│       │   └── terragrunt.hcl
│       └── terragrunt.hcl
└── modules
    ├── network
    │   ├── main.tf
    │   ├── outputs.tf
    │   ├── variables.tf
    │   └── versions.tf
    └── vm
        ├── main.tf
        ├── outputs.tf
        ├── variables.tf
        └── versions.tf

I am running terragrunt run-all apply inside the dev folder in order to have a state file for each module specified in the dev folder, and it works. The problem is that for the “vms” module I need to iterate over the “subnet_id” output of the “net” module, which is

subnet_id = [
  "projects/playground-s-11-59f50f2a/regions/us-central1/subnetworks/dev-subnet-us-central1",
  "projects/playground-s-11-59f50f2a/regions/us-east1/subnetworks/dev-subnet-us-east1",
]

But in the inputs {} block of the vms module’s terragrunt.hcl file, only one value per variable is expected.

The content of the vms module’s terragrunt.hcl file is:

include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "/home/app/terr/Terraform/src/ModulesBySeperateState/modules/vm"
}

dependency "vpc" {
  config_path = "/home/app/terr/Terraform/src/ModulesBySeperateState/environmentsLive/dev/net"

}


inputs = {
  subnet_id = dependency.vpc.outputs.subnet_id
  first_zone_per_region = dependency.vpc.outputs.first_zone_per_region
  regions = dependency.vpc.outputs.regions
}

The main.tf for the vm module looks like this:

resource "google_compute_instance" "vm" {
  for_each = var.names
  name = "${each.value.name}-${var.environment}-${var.first_zone_per_region[var.regions]}"
  machine_type = each.value.type
  zone = var.first_zone_per_region[var.regions]
  network_interface {
    subnetwork = var.subnet_id
	}

  } 

I’ve tried to create a wrapper module for the vms to iterate over subnet_id and provide output for the vms.

module   "wrapvms" {
  source = "./emptyVmModuleForWrap"
  environment           = var.environment
  count                 = length(var.subnet_id)
  region                = var.regions[count.index]
  subnet_id             = subnet_id[count.index]
  first_zone_per_region = var.first_zone_per_region
  names                 = var.names

}

But due to my lack of experience it doesn’t work. Could someone please offer some assistance or guidance? Any help would be greatly appreciated. Thank you in advance!
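
Two likely culprits in the wrapper call above: the bare subnet_id[count.index] reference (missing the var. prefix), and indexing first_zone_per_region with the whole var.regions list instead of a single region. A minimal sketch of the corrected call, assuming the variable names shown above:

module "wrapvms" {
  source = "./emptyVmModuleForWrap"

  count = length(var.subnet_id)

  environment           = var.environment
  region                = var.regions[count.index]   # one region per module instance
  subnet_id             = var.subnet_id[count.index] # was: subnet_id[count.index]
  first_zone_per_region = var.first_zone_per_region
  names                 = var.names
}

Inside the inner module, the zone lookup would then index the map with the single region string, e.g. var.first_zone_per_region[var.region].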

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Please use #terragrunt

2024-05-16

Release notes from terraform avatar
Release notes from terraform
02:13:33 PM

v1.9.0-alpha20240516 1.9.0-alpha20240516 (May 16, 2024) ENHANCEMENTS:

terraform console: Now has basic support for multi-line input in interactive mode. (#34822) If an entered line contains opening parentheses/etc that are not closed, Terraform will await another line of input to complete the expression. This initial implementation is primarily…

Release v1.9.0-alpha20240516 · hashicorp/terraform

1.9.0-alpha20240516 (May 16, 2024) ENHANCEMENTS:

terraform console: Now has basic support for multi-line input in interactive mode. (#34822) If an entered line contains opening parentheses/etc th…

terraform console: Multi-line entry support by apparentlymart · Pull Request #34822 · hashicorp/terraform

The console command, when running in interactive mode, will now detect if the input seems to be an incomplete (but valid enough so far) expression, and if so will produce another prompt to accept a…

2024-05-20

setheryops avatar
setheryops

Has anyone ever come across an error like this? I’m trying to update some security group rules. I looked at the link in the error; that issue was merged and closed back in 2015, and reading through the issue notes I’m not even sure what the issue is. I found another issue that is similar and still open, but it’s also old as dirt and doesn’t have any clear “fix”. I’m on TF 1.5.5. I’ve seen “taint” as a possible fix, but that’s not in the TF version we are on. Any ideas here??

Error: [WARN] A duplicate Security Group rule was found on (sg-0ef73123456700cc). This may be
│ a side effect of a now-fixed Terraform issue causing two security groups with
│ identical attributes but different source_security_group_ids to overwrite each
│ other in the state. See <https://github.com/hashicorp/terraform/pull/2376> for more
│ information and instructions for recovery. Error: InvalidPermission.Duplicate: the specified rule "peer: 10.243.16.0/23, UDP, from port: 8301, to port: 8301, ALLOW" already exists
│ 	status code: 400, request id: d3725f91-da05-450c-a2e3-b3380653f637
│
Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

@Jeremy G (Cloud Posse)

Jeremy G (Cloud Posse) avatar
Jeremy G (Cloud Posse)

It’s a long story. Does this help?

#34 aws_security_group_rule create_before_destroy triggers bug in provider

Describe the Bug

This module creates Security Group Rules using create_before_destroy = true.
This causes Terraform to fail when adding or removing CIDRs to an existing rule where an existing CIDR is retained, due to an issue with the Terraform AWS provider.

See hashicorp/terraform-provider-aws#25173 for details and examples.

See also hashicorp/terraform#31316 for proposed solutions.

setheryops avatar
setheryops

Hmmm… Sounds like the best workaround is to just delete all the rules for a SG and then run apply.

Jeremy G (Cloud Posse) avatar
Jeremy G (Cloud Posse)

@setheryops Upgrade to cloudposse/terraform-aws-security-group v2 and that should take care of it for you.

cloudposse/terraform-aws-security-group

Terraform module to provision an AWS Security Group

2024-05-21

IK avatar

Does anyone have a way to update an existing role’s trust policy via Terraform, with the permission policies etc. remaining unchanged? Initial reading suggests it’s not as simple as expected.

Aaron Miller avatar
Aaron Miller

Are you using a resource or module?

IK avatar

resource through a local module

IK avatar

FYI, for anyone who needs it: I had to resort to the AWS CLI, unless someone has a better way

resource "null_resource" "this" {
  provisioner "local-exec" {
    command = "aws iam update-assume-role-policy --role-name OrganizationAccountAccessRole --policy-document '${data.aws_iam_policy_document.update_assume_role.json}'"
  }
  triggers = {
    always_run = timestamp()
  }
}
RB avatar

hmm perhaps not exactly the same. You may want to create a new issue for your use-case

Luke Hsiao avatar
Luke Hsiao

Hi CloudPosse!

We’re using cloudposse/terraform-aws-elasticache-redis at work, and we are interested in setting the maxmemory-policy, so that we can change the eviction behavior [1, 2]. But, I’m not sure it’s possible with this module.

Could someone confirm my suspicion? Or, if I’m wrong, point me at how we can set these policies?

Cheers, Luke

1: https://docs.aws.amazon.com/whitepapers/latest/database-caching-strategies-using-redis/evictions.html
2: https://docs.aws.amazon.com/AmazonElastiCache/latest/red-ug/ParameterGroups.Redis.html#ParameterGroups.Redis.4-0-10
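
If memory serves, the module forwards a parameter input to the aws_elasticache_parameter_group it creates, which is where maxmemory-policy lives; a hedged sketch (the input name is assumed from the module’s interface, and the values are illustrative):

module "redis" {
  source  = "cloudposse/elasticache-redis/aws"
  version = "x.y.z" # pin as appropriate

  # ... other required inputs ...

  # Assumption: the module passes these through to its parameter group
  parameter = [
    {
      name  = "maxmemory-policy"
      value = "allkeys-lru" # eviction behavior per the AWS docs linked above
    }
  ]
}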

2024-05-22

Release notes from terraform avatar
Release notes from terraform
04:53:27 PM

v1.8.4 1.8.4 (May 22, 2024) BUG FIXES:

core: Fix exponential slowdown in some cases when modules are using depends_on. (#35157) import blocks: Fix bug where resources with nested, computed, and optional id attributes would fail to generate configuration. (#35220)…

Reduce redundant walks when resolving module `depends_on` by jbardin · Pull Request #35157 · hashicorp/terraform

The use of depends_on in modules can cause large numbers of nodes to become connected to all of the same dependencies. When using Ancestors to walk the graph and find all of these dependencies, the…

import: only filter id attribute at root level when generating config by liamcervante · Pull Request #35220 · hashicorp/terraform

The legacy SDK introduces an id attribute at the root level of resources. This caused invalid config to be generated for the legacy SDK. To avoid this we introduced a filter that removed this attri…

Jason avatar

Has anyone tried deploying helm charts in the same run that creates an AKS cluster? I don’t want to use a kubeconfig file, and my machine or a CI/CD pipeline would not have one on the initial Terraform run anyway.

The way I’m currently trying to authenticate to deploy helm is as follows:

provider "helm" {
  kubernetes {
    host                   = azurerm_kubernetes_cluster.r21_new_prod_kubernetes.kube_config[0].host
    username               = azurerm_kubernetes_cluster.r21_new_prod_kubernetes.kube_config[0].username
    password               = azurerm_kubernetes_cluster.r21_new_prod_kubernetes.kube_config[0].password
    client_certificate     = base64decode(azurerm_kubernetes_cluster.r21_new_prod_kubernetes.kube_config[0].client_certificate)
    client_key             = base64decode(azurerm_kubernetes_cluster.r21_new_prod_kubernetes.kube_config[0].client_key)
    cluster_ca_certificate = base64decode(azurerm_kubernetes_cluster.r21_new_prod_kubernetes.kube_config[0].cluster_ca_certificate)
  }
}
Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Not AKS, but with EKS, you typically need to wait for the control plane to come online and be ready before deploying helm

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Also, now you have a requirement for Terraform to have direct connectivity to the control plane

Jason avatar

my helm release just sits there for ages, spinning for like 10 mins, and then comes back failed

Jason avatar

I have three helm charts that are all doing the same thing

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Yes, that sounds like control plane is either a) not reachable for network reasons b) not ready yet and timing out

Jason avatar

hmm

Jason avatar

thanks for the helm

Jason avatar

i mean help lol

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Here’s what we used to do.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)
  default = "if test -n \"$ENDPOINT\"; then curl --silent --fail --retry 30 --retry-delay 10 --retry-connrefused --max-time 11 --insecure --output /dev/null $ENDPOINT/healthz; fi"
Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Literally called curl until we had a successful response from the cluster

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

hitting the /healthz endpoint

Jason avatar

I did not know you could do that

Jason avatar

then how did you pass in the variable?

Jason avatar

did you do depends on default?

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)
resource "null_resource" "wait_for_cluster" {
  count      = local.enabled && var.apply_config_map_aws_auth ? 1 : 0
  depends_on = [aws_eks_cluster.default]

  provisioner "local-exec" {
    command     = var.wait_for_cluster_command
    interpreter = var.local_exec_interpreter
    environment = {
      ENDPOINT = local.cluster_endpoint_data
    }
  }
}
Jason avatar

nice

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

I consider this a dirty hack. But there was no other way.

Jason avatar

I think your message from earlier has put me in the right direction.

Jason avatar

The pub IP I’m assigning to the cluster is not responding to pings

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

That’s it probably

Jason avatar

With Terraform there are only dirty hacks

Jason avatar

I’m trying to move more and more away from it

Jason avatar

I want to be a dev, I’m sick of DevOps

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Hahaha! Yes, I empathize with that. But to quote Mike Rowe, “It’s a dirty job, but someone’s gotta do it”

Jason avatar

Yeh been doing it for 7 years now no more

Jason avatar

Too many late nights now and early mornings

Jason avatar

I literally watch devs throw stuff over the fence to DevOps and then sign off. I’m like, yep, that’s what I want to be doing. Also, I want to create new stuff; I’m sick and tired of creating Kubernetes Infrastructure or VMs. I used to do web apps, and that was okay.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Yup - that’s what we felt at Cloud Posse and why we built our module ecosystem. We were tired of doing the same thing over and over again, having to fix the same things over and over again. No convergence to a solution. Unfortunately, we don’t do much with Azure.

loren avatar

I was able to get this kind of provider-level dependency working by using the terraform_data resource to create a dependency between the provider authentication attributes and the resources that need to be ready, using the input of the terraform_data. On vacation at the moment, but happy to share when I get back

loren avatar

Basically, “don’t initialize even the provider until all the required deploy-time dependencies are ready.” Most approaches initialize the provider as soon as the cluster auth attributes are ready, and that isn’t sufficient

Jason avatar

As in don’t do a terraform init on the provider at the same time as the other stuff?

Jason avatar

I can’t quite grasp what you mean

loren avatar

No. I’ll post code later when I have access to it. One init, one apply. It’s just careful management of the edges in the terraform graph, so it can map out the dependency order correctly

Jason avatar

So, I might be understanding you: build the cluster, then data resource it and use the data resource in the Kubernetes provider setup, not the resource part? But for that to work, I would have to re-initialize, as on the first init the cluster does not exist…

loren avatar

Yes, the provider references the terraform_data resource attributes, rather than directly referencing the cluster resource attributes. But no, terraform handles that provider dependency chain naturally in a single init/apply

loren avatar

I do it with eks, and even alias a couple providers… Build the cluster, link an aliased “before compute” provider, resources that need to exist before node groups use that provider, chain the node groups to those resources, link another aliased “after compute” provider to the node group arns, resources that require compute use that aliased provider…

Jason avatar

Okay, if you can share some code, that would be amazing. I need to turn this in today. I’ve been on this for far too long.

loren avatar

Sorry, on vacation like I said. Won’t have code until Monday.

Jason avatar

Okay, thanks. I’d still be interested to see how you did this when you’re back. Thanks for your help.

loren avatar

I remembered I wrote up half of it on a PR for the eks module we used… This does the “after compute” provider… https://github.com/terraform-aws-modules/terraform-aws-eks/pull/3000#issuecomment-2059599312

Comment on #3000 fix: Forces cluster outputs to wait until access entries are complete

Nifty, yeah this seems to work just fine… For the kubectl provider, just use the module outputs since they will be available before the node group. And for other providers, thread the inputs through terraform_data in a way that links them to the compute option used in the config (or just use all of them, as I did below). Then all resources that depend on the node group will be created after the node group is ready. And if you need some things to kick off before compute and others after compute, for the same provider type, just use two providers with a provider alias.

# Use terraform_data inputs to create resources after compute
provider "kubernetes" {
  host                   = terraform_data.eks_cluster_after_compute.input.cluster_endpoint
  cluster_ca_certificate = base64decode(terraform_data.eks_cluster_after_compute.input.cluster_certificate_authority_data)
  token                  = data.aws_eks_cluster_auth.this.token
}

# Use terraform_data inputs to create resources after compute
provider "helm" {
  kubernetes {
    host                   = terraform_data.eks_cluster_after_compute.input.cluster_endpoint
    cluster_ca_certificate = base64decode(terraform_data.eks_cluster_after_compute.input.cluster_certificate_authority_data)
    token                  = data.aws_eks_cluster_auth.this.token
  }
}

# Use module outputs directly to create resources before compute
provider "kubectl" {
  apply_retry_count      = 10
  host                   = module.eks.cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)
  load_config_file       = false
  token                  = data.aws_eks_cluster_auth.this.token
}

# Force a dependency between the EKS cluster auth inputs, and the compute resources
resource "terraform_data" "eks_cluster_after_compute" {
  input = {
    cluster_endpoint                   = module.eks.cluster_endpoint
    cluster_certificate_authority_data = module.eks.cluster_certificate_authority_data

    fargate_profiles         = module.eks.fargate_profiles
    eks_managed_node_groups  = module.eks.eks_managed_node_groups
    self_managed_node_groups = module.eks.self_managed_node_groups
  }
}

And it does actually work to thread tags back through from the kubectl_manifest to the node group. This will create the graph edge between the kubectl resource and the node group, so terraform can order actions correctly.

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 20.8.5"
  # ...
  eks_managed_node_groups = {
    foo = {
      # ...
      tags = merge(
        { for label, config in kubectl_manifest.eni_config : "eni-config/${label}" => config.id },
        local.tags
      )
    }
  }
}

Though, for the moment, I chose not to thread the tags through that way because of the potential for recursive references, instead just relying on dataplane_wait_duration. If I really wanted to establish this graph edge, I would probably not use the top-level node-group/compute options in the module, and instead separately call the node-group submodule. Makes the dependencies easier to visualize and manipulate.

Jason avatar

Okay, thanks. And in the module, that’s just a normal eks resource, right?

loren avatar

Yes, the only trick there is that some of the module outputs have a dependency on the EKS access resources to ensure cluster authentication is set up before the outputs become available. Check the module code and you’ll see what I mean

Jason avatar

So you work for AWS?

loren avatar

Oh gosh no, not at all

Jason avatar

Ohh, I thought you did, because this is the GitHub for the AWS provider?

loren avatar

No not the AWS provider, just a community-maintained terraform module for eks

Jason avatar

Ohh

Jason avatar

How do you find the time?

Jason avatar

I’m always blooming working or looking after my little girl…

loren avatar

Umm, #newclient was already using the module, and I was brought in to fix/modernize some of their practices, and the module was broken so I fixed it as part of the client engagement

Jason avatar

Ohh nice

Jason avatar

That’s good work. Where I work, I am on this project with a tight deadline, then get moved to another tight deadline, then move on to another project. I never get time to give back, which is something I want to do.

loren avatar

Yeah we try to use/publish open source terraform modules as part of all our engagements. Really helps improve reuse, and teaches our own folks good hygiene

Jason avatar

Excellent, I code in Go and want to do some open-source stuff with Providers. That would be nice.

loren avatar

Follow apparentlymart on GitHub? He’s a terraform-core contributor, and frequently publishes niche providers. Could be a good model to follow for working on your own…

Jason avatar

I need help finding this bit data.aws_eks_cluster_auth.this.token on this repo. The aws_eks_cluster_auth part.

Jason avatar

okay, I’ll go look them up

loren avatar

Oh that’s just a data resource offered by the AWS provider. Check the docs. I don’t have that snippet handy

loren avatar

I think it takes the cluster id or something as input

Jason avatar

okay is it essentially just the eks cluster giving out its token?

loren avatar

Yes

Jason avatar

Cool, I’ve seen something similar in AKS

Jason avatar

right I think I might have this together in my head.

Jason avatar

This guy apparentlymart is a G

Jason avatar

I’m reading through his Go proposal

Jason avatar

totally distracted now lol

loren avatar

Yes, dude is sharp as F, super thorough, writes well, and is crazy humble. Great role model

Jason avatar

Is there Terraform registry documentation for terraform_data, as in what fields you can pass in? I can only find this on it: https://developer.hashicorp.com/terraform/language/resources/terraform-data#example-usage-null_resource-replacement

The terraform_data Managed Resource Type | Terraform | HashiCorp Developer

loren avatar

That’s all there is to it. The input argument takes any object or expression, and its value is available as an attribute

Jason avatar

I couldn’t get it to work with AKS

loren avatar

The general idea should be agnostic to aks vs eks, I think. It’s only relying on terraform core features

loren avatar

I set a reminder for when I’m back to try and provide the equivalent setup for aks, based on your snippet above. I just can’t test it since I don’t have access to an azure environment at the moment

loren avatar

is this what you had?

# Use terraform_data inputs to create resources after compute
provider "helm" {
  kubernetes {
    host                   = terraform_data.aks_cluster_after_compute.input.host
    username               = terraform_data.aks_cluster_after_compute.input.username
    password               = terraform_data.aks_cluster_after_compute.input.password
    client_certificate     = base64decode(terraform_data.aks_cluster_after_compute.input.client_certificate)
    client_key             = base64decode(terraform_data.aks_cluster_after_compute.input.client_key)
    cluster_ca_certificate = base64decode(terraform_data.aks_cluster_after_compute.input.cluster_ca_certificate)
  }
}

# Force a dependency between the AKS cluster auth details, and the compute resources
resource "terraform_data" "aks_cluster_after_compute" {
  input = {
    host                   = azurerm_kubernetes_cluster.r21_new_prod_kubernetes.kube_config[0].host
    username               = azurerm_kubernetes_cluster.r21_new_prod_kubernetes.kube_config[0].username
    password               = azurerm_kubernetes_cluster.r21_new_prod_kubernetes.kube_config[0].password
    client_certificate     = azurerm_kubernetes_cluster.r21_new_prod_kubernetes.kube_config[0].client_certificate
    client_key             = azurerm_kubernetes_cluster.r21_new_prod_kubernetes.kube_config[0].client_key
    cluster_ca_certificate = azurerm_kubernetes_cluster.r21_new_prod_kubernetes.kube_config[0].cluster_ca_certificate
  }
}
Jason avatar

Yeh something like that

Jason avatar

Didn’t work

loren avatar

“didn’t work” isn’t exactly something that can be investigated. what config, what command, what error?

Jason avatar

Sorry, @loren. I was tired when I replied to this. Basically, for the AKS, no matter what, it has to read from the Kubeconfig file, and that does not update on a first run. With EKS, you can get around this by using the Token reference, but for AKS, there is no Token. For username and password, it still goes to the Kubeconfig file to get the Username and Password.

loren avatar

kinda curious, can you close the loop for me? what about this reference is “goes to the Kubeconfig file to get the username and password”? it certainly appears to me like it gets it from the cluster resource attribute, not from a file…

azurerm_kubernetes_cluster.r21_new_prod_kubernetes.kube_config[0].username
Jason avatar

The way I understand it to work is that the kube_config[0] part is reading from the kube_config file

2024-05-23

2024-05-24

theherk avatar
theherk

Not sure what your preferred workflow is, but maybe worth sharing here. I believe since the latest aws provider was released, the route53 alias module is no longer usable. Several teams internally have reported the same issue I had, so I’d be surprised if any implementations are not suffering the same. I therefore opened #53.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Best to post PRs in #pr-reviews

2024-05-25

Dmitry avatar

Hey! I’ve been trying to create a couple of SFTP servers and users with the module https://github.com/cloudposse/terraform-aws-transfer-sftp This is my code:

module "sftp" {
  source  = "cloudposse/transfer-sftp/aws"
  version = "1.3.0"

  for_each = local.app_resources

  domain                 = "S3"
  s3_bucket_name         = lookup(each.value, "bucket_name", null)
  vpc_id                 = module.vpc[local.default_vpc_name].vpc_id
  subnet_ids             = local.private_subnet_ids
  vpc_security_group_ids = [module.s3_sftp_sg.security_group_id]
  domain_name            = "${lookup(each.value, "namespace", null)}-sftp.${local.stage_dns_domain}"
  zone_id                = local.private_route53_id

  sftp_users = {
    for user, config in each.value.users :
    user => {
      user_name  = config.user_name
      public_key = config.public_key
    }
  }

  delimiter = "-"
  context = {
    additional_tag_map  = {}
    attributes          = []
    delimiter           = null
    descriptor_formats  = {}
    enabled             = true
    environment         = null
    id_length_limit     = null
    label_key_case      = null
    label_order         = []
    label_value_case    = null
    labels_as_tags      = []
    name                = "sftp"
    namespace           = null
    regex_replace_chars = null
    stage               = "${each.key}"
    tags = merge(
      local.tags,
      local.app_tag
    )
    tenant = null
  }

  tags = merge(
    local.tags,
    local.app_tag
  )
}

And all works well except the endpoints and DNS names. By default the aws_transfer_server endpoint is s-12345678.server.transfer.REGION.amazonaws.com, and this doesn’t resolve to any IP address; when a DNS CNAME is created pointing to it, it also has no IP address behind it. According to the official doc - https://docs.aws.amazon.com/transfer/latest/userguide/transfer-file.html#openssh - we should use the DNS names from the endpoint, and they are not the same as s-12345678.server.transfer.REGION.amazonaws.com. With the DNS names from the example, everything works. How is it supposed to work with this module? Is there a way to get the proper DNS name of the endpoint?

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)
data "aws_route53_zone" "default" {
  count = local.enabled ? 1 : 0

  name = "${var.stage}.${var.hosted_zone_suffix}"
}

module "sftp" {
  source  = "cloudposse/transfer-sftp/aws"
  version = "1.2.0"

  domain                 = var.domain
  sftp_users             = local.sftp_users
  s3_bucket_name         = data.aws_s3_bucket.default[local.default_global_s3_bucket_name_key].id
  restricted_home        = var.restricted_home
  force_destroy          = var.force_destroy
  address_allocation_ids = var.address_allocation_ids
  security_policy_name   = var.security_policy_name
  domain_name            = var.domain_name
  zone_id                = one(data.aws_route53_zone.default[*].id)
  eip_enabled            = var.eip_enabled

2024-05-28

Narayanaperumal Gurusamy avatar
Narayanaperumal Gurusamy

How can I enable the EBS addon on an EKS cluster?

Narayanaperumal Gurusamy avatar
Narayanaperumal Gurusamy

After adding it I see an issue related to the IAM role: “AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity”

Narayanaperumal Gurusamy avatar
Narayanaperumal Gurusamy

Am I missing anything?

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

@Jeremy G (Cloud Posse)

Jeremy G (Cloud Posse) avatar
Jeremy G (Cloud Posse)

@Narayanaperumal Gurusamy Yes, the EBS addon requires an IAM Role with a specific name and with a specific policy attached. See how Cloud Posse does it here.

Creating the Amazon EBS CSI driver IAM role - Amazon EKS

The Amazon EBS CSI plugin requires IAM permissions to make calls to AWS APIs on your behalf.

resource "aws_iam_role_policy_attachment" "aws_ebs_csi_driver" {
  count = local.ebs_csi_sa_needed ? 1 : 0

  role       = module.aws_ebs_csi_driver_eks_iam_role.service_account_role_name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy"
}

module "aws_ebs_csi_driver_eks_iam_role" {
  source  = "cloudposse/eks-iam-role/aws"
  version = "2.1.1"

  enabled = local.ebs_csi_sa_needed

  eks_cluster_oidc_issuer_url = local.eks_cluster_oidc_issuer_url

  service_account_name      = "ebs-csi-controller-sa"
  service_account_namespace = "kube-system"

  context = module.this.context
}
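
To tie it together, the addon itself is typically enabled with aws_eks_addon, pointing its service account role at the role created above; a sketch assuming the eks-iam-role module’s service_account_role_arn output and a cluster resource named default:

resource "aws_eks_addon" "aws_ebs_csi_driver" {
  cluster_name = aws_eks_cluster.default.name # assumes your cluster resource
  addon_name   = "aws-ebs-csi-driver"

  # Assumption: the eks-iam-role module exposes this output
  service_account_role_arn = module.aws_ebs_csi_driver_eks_iam_role.service_account_role_arn
}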
Narayanaperumal Gurusamy avatar
Narayanaperumal Gurusamy

Thanks @Jeremy G (Cloud Posse), it helped a lot.


2024-05-29

2024-05-30

2024-05-31

Release notes from terraform avatar
Release notes from terraform
12:33:30 PM

v1.9.0-beta1 1.9.0-beta1 (May 31, 2024) NEW FEATURES:

Input variable validation rules can refer to other objects: Previously input variable validation rules could refer only to the variable being validated. Now they are general expressions, similar to those elsewhere in a module, which can refer to other input variables and to other objects such as data resources. templatestring function: a new built-in function which is similar to templatefile but designed to render templates obtained dynamically, such…

Release v1.9.0-beta1 · hashicorp/terraform

1.9.0-beta1 (May 31, 2024) NEW FEATURES:

Input variable validation rules can refer to other objects: Previously input variable validation rules could refer only to the variable being validated. No…

Jason avatar

#terraform people, do you fancy getting your Stack Overflow points up?

Jason avatar

And help a fellow participant out. Take a look at my question here: https://stackoverflow.com/questions/78560433/this-bash-script-will-not-run-in-terraform

Happy to post it here if you’re not interested in growing your points on Stack Overflow

This bash script will not run in terraform

I have this piece of Terraform code to run a cli bash script to configure a cluster for me, but Terraform will not run the script. The script runs fine, with no errors outside of Terraform. The Err…

RB avatar

what does gpt say

Jason avatar

Haha

Jason avatar

I fixed it, if you want to see the answer

1
sheldonh avatar
sheldonh

Did a search through archives, and couldn’t find confirmation….

I pin all versions of resources in most things. However, if you are building a module for your org, do you version pin the provider in the module too? I’m assuming no issues with this as plan would just download both, but I’m rusty having been digging into pulumi in the last year and can’t recall if version pinning in the module itself is a bad practice.

Even though I’m not active here much right now, I still refer all the folks I can to you, because this is still one of the most useful/expert communities I’ve joined. You all rock :wave:

loren avatar

no pinning of provider versions in reusable modules. just set the min version.

in the root module, pin everything and use .terraform.lock.hcl
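
To make the convention concrete, a minimal sketch of both ends (version numbers illustrative):

# versions.tf in a reusable (child) module: set only a floor
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 5.0"
    }
  }
}

# versions.tf in the root module: pin exactly and commit .terraform.lock.hcl
# (kept current by renovate/dependabot)
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "5.47.0"
    }
  }
}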

Nate McCurdy avatar
Nate McCurdy

For child modules, it’s best to specify the minimum required version of a provider: https://developer.hashicorp.com/terraform/language/modules/develop/providers#provider-version-constraints-in-modules

loren avatar

on the other hand, for module references, always pin exact versions

sheldonh avatar
sheldonh

nice. I can do that. Just have to go look at the renovate config settings, and I’ll have our module project set the provider max (major version) and min.

yes, otherwise I always do exact version pinning and use renovate to process updates

sheldonh avatar
sheldonh

just thought I remembered some issue with reusable modules and exact provider versions being a problem, but it’s been so long

loren avatar

yes, any given terraform state can only reference a single version of a provider at a time. so if you have two modules in the same config trying to exact-pin different versions of the same provider, it fails

sheldonh avatar
sheldonh

do y’all normally do just the minimum version instead of also the max range allowed? I see that in Cloud Posse’s repos, but curious if you’ve seen benefit to doing “between min and max versions”, like >= 3, <= 4.0 when it’s on 3? Otherwise I’ll stick with the minimum version and we’ll give up a little testability

loren avatar

I don’t see a ton of value in restricting the max version of the provider in reusable modules, since I’m pinning exact versions in the root config and also using dependabot/renovate against the root config to test updates to provider versions.

loren avatar

sure, the update may fail, but the project is still “healthy” and we can update the module for compatibility at our own pace

sheldonh avatar
sheldonh

ok. I don’t have control downstream over consumers. I’m taking over part of the internal reusable modules and making sure I updated the renovate config to the correct setup, so I’ll look at disabling provider version updates only in my Azure DevOps project.

sheldonh avatar
sheldonh

should be able to use the required_provider depType based on examples on GitHub, and override this one project to make sure it doesn’t try to pin those. Great advice, thank you!

loren avatar

what you can do in your reusable module to get an “early warning” of an issue is write test configs as root modules, pin provider versions there, and use renovate to update the test configs

sheldonh avatar
sheldonh

yes! The structure is aligned to the normal cloudposse layout (module files at the base, plus an examples directory), so I can match this and ignore pinning anything in the root directory, which isn’t typical for root modules anyway.

loren avatar

of course, you need a test harness with credentials that can apply and destroy the test configs

sheldonh avatar
sheldonh

yeah part of that is in place with Nuke and maybe I’ll replace with either the native new test features in terraform or just write it like I did before in Go with some terratest calls. Either way it’s on the list!

loren avatar

depending on the resources in use, localstack can be a reasonable option in tests to avoid real credentials and resources/costs

sheldonh avatar
sheldonh

we have credentials for Azure, but I’ll check. Haven’t seen if LocalStack does Azure.

What I want is to just use Pulumi and move on. I’m using it for all my independent projects, but it’s hard to get interest in it, so I’ll keep grumbling at Terraform

Ryan avatar

Hi all,

This is Ryan (ex Spacelift, now Resourcely).

We are looking to document and test integrating Resourcely with additional CI/CD and other terraform runners.

Resourcely is a configuration engine focusing on paved roads for infrastructure as code.

It’s kind of like an AWS Service Catalog, but 10x better and with support across cloud providers and platforms. You can define safe & sane blueprints, then present developers with a nice quick and clean interface for selecting those things, instead of the naked cloud console. Multi-cloud support out of the gate makes for a pretty compelling story, and users really like the security implications of default-deny when it comes to resources (developers can select from blessed service blueprints, which ensures they aren’t accidentally choosing a $30/hr instance type or an unapproved region or a software version that hasn’t been approved by the compliance folks, etc.).

Our team currently has documented a number of integrations with “terraform runners” / TACOS solutions found here: https://docs.resourcely.io/integrations/terraform-integration

We need to create some documentation and run a few end-to-end tests with more real-world use cases. I’d love to figure out a way to make it beneficial to the community and any individuals who want to assist.

Is there anyone interested in collaborating with us?

Feel free to comment or send me a DM, then we can connect next week.
