#terraform (2024-05)
Discussions related to Terraform or Terraform Modules
Archive: https://archive.sweetops.com/terraform/
2024-05-01
Hi Team, I am trying to create a read replica for DocumentDB with a different instance class than the primary:
module "documentdb_cluster" {
source = "cloudposse/documentdb-cluster/aws"
Since instance_class is a string, I cannot set a different instance class for my read replica. Any suggestions on this? How do I give my replica a different instance class? (edited)
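(One possible workaround, sketched with assumptions: keep the cloudposse module for the primary cluster, and define the differently-sized replicas directly with the AWS provider's aws_docdb_cluster_instance resource. The module output name cluster_name is a guess at the module's interface, not confirmed.)

```hcl
# Hypothetical sketch: replica instances with a different class than the
# module-managed primary. Verify the module's actual output names first.
resource "aws_docdb_cluster_instance" "replica" {
  count              = 1
  identifier         = "docdb-replica-${count.index}"
  cluster_identifier = module.documentdb_cluster.cluster_name # assumed output
  instance_class     = "db.r5.large"                          # differs from the primary
  promotion_tier     = 2 # keep replicas behind the primary in failover order
}
```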
v1.9.0-alpha20240501 (May 1, 2024) ENHANCEMENTS:
terraform console: Now has basic support for multi-line input in interactive mode. (#34822) If an entered line contains opening parentheses/etc. that are not closed, Terraform will await another line of input to complete the expression. This initial implementation is primarily intended…
The console command, when running in interactive mode, will now detect if the input seems to be an incomplete (but valid enough so far) expression, and if so will produce another prompt to accept a…
2024-05-02
Super cool experiment in the 1.9 alpha release, they’re looking for feedback if you want to give it a go… https://discuss.hashicorp.com/t/experiment-feedback-input-variable-validation-can-cross-reference-other-objects/66644
Hi everyone, In yesterday’s Terraform CLI v1.9.0-alpha20240501 there is an experimental implementation of the long-requested feature of allowing input variable validation rules to refer to other values in the same module as the variable declaration. For example, it allows the validation rule of one variable to refer to another: terraform { # This experiment opt-in will be required as long # as this remains experimental. If the experiment # is successful then this won’t be needed in the …
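(For context, a cross-variable validation rule under this experiment could look roughly like the sketch below; the experiment opt-in block mentioned above is elided since its exact keyword is still alpha.)

```hcl
variable "min_size" {
  type = number
}

variable "max_size" {
  type = number

  validation {
    # Previously a validation condition could only reference the variable
    # being validated; the 1.9 experiment allows cross-references like this.
    condition     = var.max_size >= var.min_size
    error_message = "max_size must be greater than or equal to min_size."
  }
}
```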
2024-05-03
:wave: Hello, team!
I am having a slight issue with your terraform-aws-lambda-elasticsearch-cleanup
module. It used to work fine, but since I upgraded the TF AWS provider to 5.47.0
from 4.20.1
and bumped the pinned module version to 0.14.0
from 0.12.3
I am getting the following error.
I am using Terraform version 1.8.2
Error: External Program Execution Failed
│
│ with module.lambda-elasticsearch-cleanup.module.artifact.data.external.curl[0],
│ on .terraform/modules/lambda-elasticsearch-cleanup.artifact/main.tf line 3, in data "external" "curl":
│ 3: program = concat(["curl"], var.curl_arguments, ["--write-out", "{\"success\": \"true\", \"filename_effective\": \"%%{filename_effective}\"}", "-o", local.output_file, local.url])
│
│ The data source received an unexpected error while attempting to execute
│ the program.
│
│ Program: /usr/bin/curl
│ Error Message: curl: (22) The requested URL returned error: 404
│
│ State: exit status 22
Have I missed an upgrade step somewhere or is there an issue with the file?
Hrmmm… I have a theory. We recently rolled out some new workflows, maybe the artifact wasn’t produced.
@Igor Rodionov can you take a look?
Essentially, what the module does is a clever hack: it downloads the artifact from S3 based on the commit SHA of the module version you are pulling.
For some reason, the commit SHA corresponding to that module version does not exist, which leads me to believe there's a problem with the artifact and something wrong with the pipeline
Thanks for getting back to me. Would this not affect all users then?
Yep
….all users using it at that version
For now, try an older version
ok… standby
The issue with an older version of the module is that AWS TF provider 5 has deprecated some calls:
"source_json"
and
override_json
I gotcha… so this is a problem, especially if you manage terraform-aws-lambda-elasticsearch-cleanup
in the same lifecycle as, say, your Elasticsearch cluster. While that's not unreasonable, and probably what most people are doing, it's an example of why we like to break root modules out by lifecycle, reducing the tight coupling and dependencies on provider versions. That said, I totally get why this is a problem; just explaining why we (Cloud Posse) are less affected by these types of changes.
Components are opinionated building blocks of infrastructure as code that solve one specific problem or use-case.
Makes sense. I just grouped together things that went together, ending up in this situation.
Thanks for understanding… We’ll get this fixed, just cannot commit to when that will be.
No issues, this is just on my upgrade TF branch and not on master. So I am good for now
2024-05-04
Announcement: In support of using OpenTofu, starting with Geodesic v2.11.0, we are pre-installing package repos to allow you to easily install OpenTofu in your Dockerfile.
ARG OPEN_TOFU_VERSION=1.6.2
RUN apt-get update && apt-get install -y tofu=${OPEN_TOFU_VERSION}
2024-05-05
Guys, is this normal behavior? In AWS EKS I upgraded my nodes from t3.medium to t3.large. Before confirming "yes" I saw that Terraform would destroy the old nodes in order to proceed with the upgrade, but I didn't expect it to delete the volumes as well. Good thing it only happened in our testing environment. My question is: is this normal behavior when upgrading instance_types? I was hoping to be able to upgrade without affecting my persistent volumes.
This is really more of a #kubernetes question, but I will take a crack at it here.
It seems to me you are confusing the EBS volumes associated with EC2 instances as root volumes, providing ephemeral storage (e.g. emptyDir) for Kubernetes, with EBS volumes associated with PersistentVolumes. The former have lifecycles tied to the instances: when new instances are created (e.g. when the Auto Scaling Group scales up), new EBS volumes are created, and when the instances are deleted, so are the EBS volumes.
Kubernetes PersistentVolumes, which may be implemented as EBS volumes or something else, should persist until their PersistentVolumeClaims are deleted, and then only if the reclaim policy is set to “delete”.
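(To make the reclaim-policy point concrete, a sketch of a StorageClass that keeps EBS-backed PersistentVolumes around even after their claims are deleted; assumes the EBS CSI driver and a hypothetical class name.)

```yaml
# Sketch: PVs provisioned from this class survive PVC deletion ("Retain"),
# so node replacement or accidental claim deletion does not destroy data.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-retain            # hypothetical name
provisioner: ebs.csi.aws.com  # assumes the EBS CSI driver is installed
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
```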
Thanks @Jeremy G (Cloud Posse), though I'm using a StatefulSet for my deployment (I have the EBS CSI driver set up as well), so I thought it should be using EBS volumes that, from what I understood, are independent of my EC2 lifecycle?
In my StatefulSet deployment I have defined volumeClaimTemplates,
which from what I understood should be using EBS volumes? Thank you for the answer, though. Should I post this to #kubernetes (my bad for posting here, since I was using Terraform to maintain our infra) and continue the discussion there? :o
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: authentication-postgres
  labels:
    app: authentication-postgres-app
  namespace: postgres
spec:
  serviceName: authentication-postgres-svc
  replicas: 1
  selector:
    matchLabels:
      app: authentication-postgres-app
  template:
    metadata:
      labels:
        app: authentication-postgres-app
    spec:
      containers:
        - name: authentication-postgres-container
          image: postgres:16.2-bullseye
          ports:
            - containerPort: 5432
          env:
            - name: POSTGRES_DB
              value: authentication_db
            - name: POSTGRES_USER
              value: postgres
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: authentication-postgres-secret
                  key: postgres-password
          volumeMounts:
            - name: data
              mountPath: /mnt/authentication-postgres-data
      imagePullSecrets:
        - name: docker-reg-cred
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes:
          - "ReadWriteOnce"
        resources:
          requests:
            storage: "5Gi"
What I’m saying is that on upgrade, some EBS volumes will get deleted and some will not. What leads you to believe that your PersistentVolumes, with active PersistentVolumeClaims, are the ones being deleted?
Ohhh I see! I came to this conclusion because the PostgreSQL database lost its data after I upgraded the nodes, which means it was one of those EBS volumes that unfortunately got cleared? Is there a way for me to avoid that, so that the EBS volumes are safe when I upgrade nodes?
I don’t know how you deployed PostgreSQL. It seems like you deployed it to use ephemeral storage rather than dedicated PersistentVolumes, despite having a volumeClaimTemplate, but this gets into the details of your PostgreSQL deployment, and maybe Helm chart. Which is why I directed you to #kubernetes
2024-05-06
2024-05-07
any luck on this ?
Hi Team, I am trying to create a read replica for DocumentDB with a different instance class than the primary:
module "documentdb_cluster" {
source = "cloudposse/documentdb-cluster/aws"
Since instance_class is a string, I cannot set a different instance class for my read replica. Any suggestions on this? How do I give my replica a different instance class? (edited)
Hello all,
This is my first message here in Slack! I found a little bug in the memcached module. An issue is open: https://github.com/cloudposse/terraform-aws-elasticache-memcached/issues/78 Can someone check it and help me send a PR? My changes are ready locally. Thanks!
If you are able to open a PR, post it in #pr-reviews and someone will review it promptly
I don’t see one from you in #pr-reviews
yep, because I was a little bit sleepy last night. Approvals are welcome.
Looks like https://github.com/cloudposse/terraform-aws-elasticache-memcached/pull/79 was already merged
what
• If we pass elasticache_subnet_group_name, the aws_elasticache_subnet_group.default[0] won’t be created anymore
why
• No one needs a new elasticache_subnet_group when one was already created before and we just want to pass its name
references
• Check issue #78
2024-05-08
v1.8.3 (May 8, 2024) BUG FIXES:
terraform test: Providers configured within an overridden module could panic. (#35110) core: Fix crash when a provider incorrectly plans a nested object when the configuration is null (https://github.com/hashicorp/terraform/issues/35090)…
While we don’t normally encounter providers within modules, they are technically still supported, and could exist within a module which has been overridden for testing. Since the module is not bein…
When descending into structural attributes, don’t try to extract attributes from null objects. Unlike with blocks, nested attributes allow the possibility of assigning null values which could be ov…
2024-05-10
Hi, not sure if this is an issue but I’m having cycles every time I try to destroy a service and after a long work, I discovered that it’s related to the security groups. If I manually remove the service SG and rules, the cycles are gone. This is related to the ecs alb service module
I have that problem a lot when using a security group created in another backend.
ie:
Backend A: contains a security group
Backend B: uses the security group
One pattern I’ve been using to avoid problems is:
Backend A: contains a security group
Backend B: attaches rules to the security group and uses those.
https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/security_group_rule
Don’t know if that helps ¯\_(ツ)_/¯
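(For illustration, a minimal sketch of that second pattern with hypothetical names; Backend B attaches its own standalone rule to the shared group instead of managing inline rules, which avoids cross-backend destroy cycles. The remote-state output name sg_id is assumed.)

```hcl
# Backend B: attach a rule to the security group owned by Backend A.
resource "aws_security_group_rule" "app_ingress" {
  type              = "ingress"
  from_port         = 443
  to_port           = 443
  protocol          = "tcp"
  cidr_blocks       = ["10.0.0.0/16"] # hypothetical CIDR
  security_group_id = data.terraform_remote_state.backend_a.outputs.sg_id # assumed output
}
```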
thanks! I will look into it. I manage all the resources in the same backend but I have little control on the creation/destruction as the cloudposse modules are the ones managing the resources
2024-05-13
This might have been talked about in an earlier thread already, but is anyone else seeing some weird behavior in their editor within the terragrunt-cache folder for the terraform-null-label module download? VSCode is throwing an error in the cache folder; when I tunnel down, it takes me to /examples/autoscalinggroup/main.tf line 28:
# terraform-null-label example used here: Set tags on everything that can be tagged
tag_specifications {
for_each = ["instance", "volume", "elastic-gpu", "spot-instance-request"]
with the error message “Unexpected attribute: An attribute named “for_each” is not expected here. Terraform”
maybe post a link to this message in #terragrunt
ok, just to cross-post? or am i in the wrong channel?
2024-05-14
Hi experts,
I would like to learn IaC (Terraform with Terragrunt), but I have no experience with it. If possible, please help me figure out the next steps to take.
Hi @Veerapandian M welcome to the community!
Definitely feel free to ask pointed questions as you continue your journey.
Hello team, I am a beginner with Terraform. I want to set up the environment specified in https://github.com/cloudposse/terraform-datadog-platform. Can someone point me to the documentation? I know it's a basic question.
Kindly provide me with the basic flow, installation, and setup. I am looking at this solution so that I can customize it to read a swagger.json file and convert it into synthetic tests automatically; that's the end goal. I want to build a solution for that.
Terraform module to configure and provision Datadog monitors, custom RBAC roles with permissions, Datadog synthetic tests, Datadog child organizations, and other Datadog resources from a YAML configuration, complete with automated tests.
@Jeremy White (Cloud Posse)
If you just want to get synthetics up and running, I think you can just copy the synthetics example and adjust it to use your own datadog api endpoints. After that, just start creating synthetics similar to what you see in the synthetics catalog within the same example
2024-05-15
Hello everyone! I hope you’re all doing well. I’m currently facing an issue creating a simple infrastructure using Terragrunt as a wrapper for Terraform. The goal is to create 2 subnetworks in 2 different zones and create 3 VMs in each subnetwork. The subnetworks are created without issues; the problem arises when I try to create 3 VMs in each subnetwork.
Project has the following structure
├── environmentsLive
│   └── dev
│       ├── net
│       │   └── terragrunt.hcl
│       ├── vms
│       │   └── terragrunt.hcl
│       └── terragrunt.hcl
└── modules
    ├── network
    │   ├── main.tf
    │   ├── outputs.tf
    │   ├── variables.tf
    │   └── versions.tf
    └── vm
        ├── main.tf
        ├── outputs.tf
        ├── variables.tf
        └── versions.tf
I am running terragrunt run-all apply inside the dev folder in order to have a state file for each module specified in the dev folder, and it works. The problem is that for the "vms" module I need to iterate over the "subnet_id" output variable of the "net" module, which is:
subnet_id = [
"projects/playground-s-11-59f50f2a/regions/us-central1/subnetworks/dev-subnet-us-central1",
"projects/playground-s-11-59f50f2a/regions/us-east1/subnetworks/dev-subnet-us-east1",
]
But in the inputs {} block of the vms module's terragrunt.hcl file, only one value per variable is expected.
The content of “terragrunt.hcl” file of vms module is:
include "root" {
path = find_in_parent_folders()
}
terraform {
source = "/home/app/terr/Terraform/src/ModulesBySeperateState/modules/vm"
}
dependency "vpc" {
config_path = "/home/app/terr/Terraform/src/ModulesBySeperateState/environmentsLive/dev/net"
}
inputs = {
subnet_id = dependency.vpc.outputs.subnet_id
first_zone_per_region = dependency.vpc.outputs.first_zone_per_region
regions = dependency.vpc.outputs.regions
}
The main.tf for vm module looks like this:
resource "google_compute_instance" "vm" {
for_each = var.names
name = "${each.value.name}-${var.environment}-${var.first_zone_per_region[var.regions]}"
machine_type = each.value.type
zone = var.first_zone_per_region[var.regions]
network_interface {
subnetwork = var.subnet_id
}
}
I’ve tried to create a wrapper module for vms to iterate over subnet_id and provide output for the VMs.
module "wrapvms" {
source = "./emptyVmModuleForWrap"
environment = var.environment
count = length(var.subnet_id)
region = var.regions[count.index]
subnet_id = subnet_id[count.index]
first_zone_per_region = var.first_zone_per_region
names = var.names
}
But due to lack of my experience it doesn’t work. Could someone please offer some assistance or guidance? Any help would be greatly appreciated. Thank you in advance!
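(Not the author's answer, just a sketch of one way to do this entirely inside the vm module with for_each, so no wrapper module is needed. It assumes var.regions and var.subnet_id are index-aligned lists, var.first_zone_per_region is a map of region to zone, and var.names is a map of objects with name and type fields; the boot image is a placeholder.)

```hcl
locals {
  # Pair each region with its subnetwork; assumes the lists are index-aligned.
  subnets_by_region = zipmap(var.regions, var.subnet_id)

  # One entry per (vm, region) pair, so every subnetwork gets every VM.
  vm_instances = {
    for pair in setproduct(keys(var.names), var.regions) :
    "${pair[0]}-${pair[1]}" => {
      spec   = var.names[pair[0]]
      region = pair[1]
    }
  }
}

resource "google_compute_instance" "vm" {
  for_each = local.vm_instances

  name         = "${each.value.spec.name}-${var.environment}-${var.first_zone_per_region[each.value.region]}"
  machine_type = each.value.spec.type
  zone         = var.first_zone_per_region[each.value.region]

  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-12" # placeholder image
    }
  }

  network_interface {
    subnetwork = local.subnets_by_region[each.value.region]
  }
}
```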
Please use #terragrunt
2024-05-16
v1.9.0-alpha20240516 (May 16, 2024) ENHANCEMENTS:
terraform console: Now has basic support for multi-line input in interactive mode. (#34822) If an entered line contains opening parentheses/etc that are not closed, Terraform will await another line of input to complete the expression. This initial implementation is primarily…
The console command, when running in interactive mode, will now detect if the input seems to be an incomplete (but valid enough so far) expression, and if so will produce another prompt to accept a…
2024-05-20
Has anyone ever come across an error like this? I'm trying to update some security group rules. I looked at the link in the error; that issue was merged and closed back in 2015, and reading through the issue notes I'm not even sure what the issue is. I found another similar issue that is still open, but it's also old as dirt and doesn't have any clear "fix". I'm on TF 1.5.5. I've seen "taint" suggested as a possible fix, but that's not in the TF version we are on. Any ideas here??
Error: [WARN] A duplicate Security Group rule was found on (sg-0ef73123456700cc). This may be
│ a side effect of a now-fixed Terraform issue causing two security groups with
│ identical attributes but different source_security_group_ids to overwrite each
│ other in the state. See <https://github.com/hashicorp/terraform/pull/2376> for more
│ information and instructions for recovery. Error: InvalidPermission.Duplicate: the specified rule "peer: 10.243.16.0/23, UDP, from port: 8301, to port: 8301, ALLOW" already exists
│ status code: 400, request id: d3725f91-da05-450c-a2e3-b3380653f637
│
It’s a long story. Does this help?
Describe the Bug
This module creates Security Group Rules using create_before_destroy = true.
This causes Terraform to fail when adding or removing CIDRs to an existing rule where an existing CIDR is retained, due to an issue with the Terraform AWS provider.
See hashicorp/terraform-provider-aws#25173 for details and examples.
See also hashicorp/terraform#31316 for proposed solutions.
Hmmm… Sounds like the best workaround is to just delete all the rules for an SG and then run apply.
@setheryops Upgrade to cloudposse/terraform-aws-security-group v2 and that should take care of it for you.
Terraform module to provision an AWS Security Group
2024-05-21
Does anyone have a way to update an existing role's trust policy via Terraform, with the permission policy etc. remaining unchanged? Initial reading suggests it's not as simple as expected.
Are you using a resource or module?
resource through a local module
FYI, for anyone who needs it: I had to resort to the AWS CLI, unless someone has a better way.
resource "null_resource" "this" {
provisioner "local-exec" {
command = "aws iam update-assume-role-policy --role-name OrganizationAccountAccessRole --policy-document '${data.aws_iam_policy_document.update_assume_role.json}'"
}
triggers = {
always_run = timestamp()
}
}
hmm perhaps not exactly the same. You may want to create a new issue for your use-case
Hi CloudPosse!
We’re using cloudposse/terraform-aws-elasticache-redis at work, and we are interested in setting the maxmemory-policy, so that we can change the eviction behavior [1, 2]. But I’m not sure it’s possible with this module.
Could someone confirm my suspicion? Or, if I’m wrong, point me at how we can set these policies?
Cheers, Luke
1: https://docs.aws.amazon.com/whitepapers/latest/database-caching-strategies-using-redis/evictions.html 2: https://docs.aws.amazon.com/AmazonElastiCache/latest/red-ug/ParameterGroups.Redis.html#ParameterGroups.Redis.4-0-10
Ah, on further reading, I think I found my answer: https://github.com/cloudposse/terraform-aws-elasticache-redis?tab=readme-ov-file#input_parameter
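(For anyone finding this later, setting it via the module's parameter input would look roughly like this sketch; all other module inputs are elided, and the eviction policy value is just an example.)

```hcl
module "redis" {
  source = "cloudposse/elasticache-redis/aws"
  # ... other required inputs elided ...

  # Creates a custom ElastiCache parameter group with the desired
  # eviction behavior (here: evict any key, least recently used first).
  parameter = [
    {
      name  = "maxmemory-policy"
      value = "allkeys-lru"
    }
  ]
}
```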
2024-05-22
v1.8.4 (May 22, 2024) BUG FIXES:
core: Fix exponential slowdown in some cases when modules are using depends_on. (#35157) import blocks: Fix bug where resources with nested, computed, and optional id attributes would fail to generate configuration. (https://github.com/hashicorp/terraform/issues/35220)…
The use of depends_on in modules can cause large numbers of nodes to become connected to all of the same dependencies. When using Ancestors to walk the graph and find all of these dependencies, the…
The legacy SDK introduces an id attribute at the root level of resources. This caused invalid config to be generated for the legacy SDK. To avoid this we introduced a filter that removed this attri…
Has anyone tried deploying Helm charts in the same run that creates an AKS cluster? I don't want to use a kubeconfig file, and my machine or a CI/CD pipeline would not have one on the initial Terraform run.
The way I’m currently trying to authenticate to deploy helm is as follows:
provider "helm" {
kubernetes {
host = azurerm_kubernetes_cluster.r21_new_prod_kubernetes.kube_config[0].host
username = azurerm_kubernetes_cluster.r21_new_prod_kubernetes.kube_config[0].username
password = azurerm_kubernetes_cluster.r21_new_prod_kubernetes.kube_config[0].password
client_certificate = base64decode(azurerm_kubernetes_cluster.r21_new_prod_kubernetes.kube_config[0].client_certificate)
client_key = base64decode(azurerm_kubernetes_cluster.r21_new_prod_kubernetes.kube_config[0].client_key)
cluster_ca_certificate = base64decode(azurerm_kubernetes_cluster.r21_new_prod_kubernetes.kube_config[0].cluster_ca_certificate)
}
}
Not AKS, but with EKS you typically need to wait for the control plane to come online and be ready before deploying Helm.
Also, now you have a requirement for Terraform to have direct connectivity to the control plane
my helm release just sits there for ages going round for like 10mins and then comes back failed
I have three helm charts that are all doing the same thing
Yes, that sounds like control plane is either a) not reachable for network reasons b) not ready yet and timing out
hmm
thanks for the helm
i mean help lol
Here’s what we used to do.
default = "if test -n \"$ENDPOINT\"; then curl --silent --fail --retry 30 --retry-delay 10 --retry-connrefused --max-time 11 --insecure --output /dev/null $ENDPOINT/healthz; fi"
Literally called curl until we had a successful response from the cluster, hitting the /healthz endpoint.
I did not know you could do that
then how did you pass in the variable?
did you do depends on default?
resource "null_resource" "wait_for_cluster" {
count = local.enabled && var.apply_config_map_aws_auth ? 1 : 0
depends_on = [aws_eks_cluster.default]
provisioner "local-exec" {
command = var.wait_for_cluster_command
interpreter = var.local_exec_interpreter
environment = {
ENDPOINT = local.cluster_endpoint_data
}
}
}
nice
I consider this a dirty hack. But there was no other way.
I think your message from earlier has put me in the right direction.
The pub IP im assigning to the cluster is not responding to pings
That’s it probably
With Terraform there are only dirty hacks
Im trying to move more and more away from it
I want to be a dev im sick of devops
Hahaha! Yes, I empathize with that. But to quote Mike Rowe, “It’s a dirty job, but someone’s gotta do it”
Yeh been doing it for 7 years now no more
Too many late nights now and early mornings
I literally watch devs throw stuff over the fence to DevOps and then sign off. I’m like, yep, that’s what I want to be doing. Also, I want to create new stuff; I’m sick and tired of creating Kubernetes Infrastructure or VMs. I used to do web apps, and that was okay.
Yup - that’s what we felt at Cloud Posse and why we built our module ecosystem. We were tired of doing the same thing over and over again, having to fix the same things over and over again. No convergence to a solution. Unfortunately, we don’t do much with Azure.
I was able to get this kind of provider-level dependency working by using the terraform_data resource to create a dependency between the provider authentication attributes and the resources that need to be ready, using the input of the terraform_data. On vacation at the moment, but happy to share when I get back
Basically, “don’t initialize even the provider until all the required deploy-time dependencies are ready.” Most approaches initialize the provider as soon as the cluster auth attributes are ready, and that isn’t sufficient
As in dont do a terraform init on the provider at the same time as the other stuff?
I cant quite grasp what you mean
No. I’ll post code later when I have access to it. One init, one apply. It’s just careful management of the edges in the terraform graph, so it can map out the dependency order correctly
So, I might be understanding you, build the cluster, then data resource it and use the data resource in the Kubernetes provider setup, not the resource part? But for that to work, I would have to re-initialize as on the first init, the cluster does not exist…
Yes, the provider references the terraform_data resource attributes, rather than directly referencing the cluster resource attributes. But no, terraform handles that provider dependency chain naturally in a single init/apply
I do it with eks, and even alias a couple providers… Build the cluster, link an aliased “before compute” provider, resources that need to exist before node groups use that provider, chain the node groups to those resources, link another aliased “after compute” provider to the node group arns, resources that require compute use that aliased provider…
Okay, if you can share some code, that would be amazing. I need to turn this in today. I’ve been on this for far too long.
Sorry, on vacation like I said. Won’t have code until Monday.
Okay, thanks. I’d still be interested to see how you did this when you’re back. Thanks for your help.
I remembered I wrote up half of it on a PR for the eks module we used… This does the “after compute” provider… https://github.com/terraform-aws-modules/terraform-aws-eks/pull/3000#issuecomment-2059599312
Nifty, yeah this seems to work just fine… For the kubectl provider, just use the module outputs since they will be available before the node group. And for other providers, thread the inputs through terraform_data in a way that links them to the compute option used in the config (or just use all of them, as I did below). Then all resources that depend on the node group will be created after the node group is ready. And if you need some things to kick off before compute and others after compute, for the same provider type, just use two providers with a provider alias.
# Use terraform_data inputs to create resources after compute
provider "kubernetes" {
host = terraform_data.eks_cluster_after_compute.input.cluster_endpoint
cluster_ca_certificate = base64decode(terraform_data.eks_cluster_after_compute.input.cluster_certificate_authority_data)
token = data.aws_eks_cluster_auth.this.token
}
# Use terraform_data inputs to create resources after compute
provider "helm" {
kubernetes {
host = terraform_data.eks_cluster_after_compute.input.cluster_endpoint
cluster_ca_certificate = base64decode(terraform_data.eks_cluster_after_compute.input.cluster_certificate_authority_data)
token = data.aws_eks_cluster_auth.this.token
}
}
# Use module outputs directly to create resources before compute
provider "kubectl" {
apply_retry_count = 10
host = module.eks.cluster_endpoint
cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)
load_config_file = false
token = data.aws_eks_cluster_auth.this.token
}
# Force a dependency between the EKS cluster auth inputs, and the compute resources
resource "terraform_data" "eks_cluster_after_compute" {
input = {
cluster_endpoint = module.eks.cluster_endpoint
cluster_certificate_authority_data = module.eks.cluster_certificate_authority_data
fargate_profiles = module.eks.fargate_profiles
eks_managed_node_groups = module.eks.eks_managed_node_groups
self_managed_node_groups = module.eks.self_managed_node_groups
}
}
And it does actually work to thread tags back through from the kubectl_manifest to the node group. This will create the graph edge between the kubectl resource and the node group, so terraform can order actions correctly.
module "eks" {
source = "terraform-aws-modules/eks/aws"
version = "~> 20.8.5"
# ...
eks_managed_node_groups = {
foo = {
# ...
tags = merge(
{ for label, config in kubectl_manifest.eni_config : "eni-config/${label}" => config.id },
local.tags
)
}
}
}
Though, for the moment, I chose not to thread the tags through that way because of the potential for recursive references, instead just relying on dataplane_wait_duration. If I really wanted to establish this graph edge, I would probably not use the top-level node-group/compute options in the module, and instead separately call the node-group submodule. That makes the dependencies easier to visualize and manipulate.
Okay thanks and in the module thats just a normal eks resource right?
Yes, only trick there is that some of the module outputs have a dependency on the eks access resources to ensure cluster authentication is setup before the outputs become available. Check the module code and you’ll see what I mean
So you work for AWS?
Oh gosh no, not at all
Ohh I thought you did because this the Github for AWS Provider?
No not the AWS provider, just a community-maintained terraform module for eks
Ohh
How do you find the time?
I'm always blooming working or looking after my little girl…
Umm, #newclient was already using the module, and I was brought in to fix/modernize some of their practices, and the module was broken so I fixed it as part of the client engagement
Ohh nice
That's good work. Where I work, I am on this project with a tight deadline, then get moved to another tight deadline, then move on to another project. I never get time to give back, which is something I want to do.
Yeah we try to use/publish open source terraform modules as part of all our engagements. Really helps improve reuse, and teaches our own folks good hygiene
Excellent, I code in Go and want to do some open-source stuff with Providers. That would be nice.
Follow apparentlymart on GitHub? He’s a terraform-core contributor, and frequently publishes niche providers. could be a good model to follow for working on your own…
I need help finding this bit data.aws_eks_cluster_auth.this.token in this repo. The aws_eks_cluster_auth part.
okay ill go look them up
Oh that’s just a data resource offered by the AWS provider. Check the docs. I don’t have that snippet handy
I think it takes the cluster id or something as input
okay is it essentially just the eks cluster giving out its token?
Yes
Cool ive seen something similar in AKS
This guy apparentlymart is a G
Im reading through his Go proposal
totally distracted now lol
Yes, dude is sharp as F, super thorough, writes well, and is crazy humble. Great role model
Is there a Terraform registry page for terraform_data, as in what fields you can pass in? I can only find this on it: https://developer.hashicorp.com/terraform/language/resources/terraform-data#example-usage-null_resource-replacement
That’s all there is to it. The input argument takes any object or expression, and its value is available as an attribute.
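(A minimal sketch of that behavior; the values here are hypothetical. After apply, whatever was passed to input is mirrored on the output attribute.)

```hcl
resource "terraform_data" "example" {
  input = {
    endpoint = "https://example.internal" # hypothetical value
  }
}

output "endpoint" {
  # `output` mirrors `input` once the resource is applied.
  value = terraform_data.example.output.endpoint
}
```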
I couldnt get it to work with an AKS
The general idea should be agnostic to aks vs eks, I think. It’s only relying on terraform core features
I set a reminder for when I’m back to try and provide the equivalent setup for aks, based on your snippet above. I just can’t test it since I don’t have access to an azure environment at the moment
is this what you had?
# Use terraform_data inputs to create resources after compute
provider "helm" {
  kubernetes {
    host                   = terraform_data.aks_cluster_after_compute.input.host
    username               = terraform_data.aks_cluster_after_compute.input.username
    password               = terraform_data.aks_cluster_after_compute.input.password
    client_certificate     = base64decode(terraform_data.aks_cluster_after_compute.input.client_certificate)
    client_key             = base64decode(terraform_data.aks_cluster_after_compute.input.client_key)
    cluster_ca_certificate = base64decode(terraform_data.aks_cluster_after_compute.input.cluster_ca_certificate)
  }
}

# Force a dependency between the AKS cluster auth details and the compute resources
resource "terraform_data" "aks_cluster_after_compute" {
  input = {
    host                   = azurerm_kubernetes_cluster.r21_new_prod_kubernetes.kube_config[0].host
    username               = azurerm_kubernetes_cluster.r21_new_prod_kubernetes.kube_config[0].username
    password               = azurerm_kubernetes_cluster.r21_new_prod_kubernetes.kube_config[0].password
    client_certificate     = azurerm_kubernetes_cluster.r21_new_prod_kubernetes.kube_config[0].client_certificate
    client_key             = azurerm_kubernetes_cluster.r21_new_prod_kubernetes.kube_config[0].client_key
    cluster_ca_certificate = azurerm_kubernetes_cluster.r21_new_prod_kubernetes.kube_config[0].cluster_ca_certificate
  }
}
Yeh something like that
Didn’t work
“didn’t work” isn’t exactly something that can be investigated. what config, what command, what error?
Sorry, @loren. I was tired when I replied to this. Basically, for the AKS, no matter what, it has to read from the Kubeconfig file, and that does not update on a first run. With EKS, you can get around this by using the Token reference, but for AKS, there is no Token. For username and password, it still goes to the Kubeconfig file to get the Username and Password.
kinda curious, can you close the loop for me? what about this reference is “goes to the Kubeconfig file to get the username and password”? it certainly appears to me like it gets it from the cluster resource attribute, not from a file…
azurerm_kubernetes_cluster.r21_new_prod_kubernetes.kube_config[0].username
The way I understand it, the kube_config[0] part means it's reading from the kubeconfig file
2024-05-23
2024-05-24
Not sure what your preferred workflow is, but maybe worth sharing here. I believe since the latest aws provider was released, the route53 alias module is no longer usable. Several teams internally have reported the same issue I had, so I’d be surprised if any implementations are not suffering the same, so I opened #53.
Best to post PRs in #pr-reviews
2024-05-25
Hey! Been trying to create a couple of SFTP servers and users with the module https://github.com/cloudposse/terraform-aws-transfer-sftp This is my code:
module "sftp" {
  source   = "cloudposse/transfer-sftp/aws"
  version  = "1.3.0"
  for_each = local.app_resources

  domain                 = "S3"
  s3_bucket_name         = lookup(each.value, "bucket_name", null)
  vpc_id                 = module.vpc[local.default_vpc_name].vpc_id
  subnet_ids             = local.private_subnet_ids
  vpc_security_group_ids = [module.s3_sftp_sg.security_group_id]
  domain_name            = "${lookup(each.value, "namespace", null)}-sftp.${local.stage_dns_domain}"
  zone_id                = local.private_route53_id

  sftp_users = {
    for user, config in each.value.users :
    user => {
      user_name  = config.user_name
      public_key = config.public_key
    }
  }

  delimiter = "-"

  context = {
    additional_tag_map  = {}
    attributes          = []
    delimiter           = null
    descriptor_formats  = {}
    enabled             = true
    environment         = null
    id_length_limit     = null
    label_key_case      = null
    label_order         = []
    label_value_case    = null
    labels_as_tags      = []
    name                = "sftp"
    namespace           = null
    regex_replace_chars = null
    stage               = "${each.key}"
    tags = merge(
      local.tags,
      local.app_tag
    )
    tenant = null
  }

  tags = merge(
    local.tags,
    local.app_tag
  )
}
And all works well except the endpoints and DNS names.
By default the aws_transfer_server endpoint is s-12345678.server.transfer.REGION.amazonaws.com
And this doesn’t resolve to any IP address, and when there is a DNS CNAME created to it, it also has no IP address behind it.
According to the official doc - https://docs.aws.amazon.com/transfer/latest/userguide/transfer-file.html#openssh - we should use the DNS names from the Endpoint, and they are not the same as s-12345678.server.transfer.REGION.amazonaws.com
. And with these DNS names from the example everything works. How is it supposed to work with this module? Is there a way to get the proper DNS name of the endpoint?
Our opinionated refarch implementation of an SFTP root module is here: https://github.com/cloudposse/terraform-aws-components/blob/main/modules/sftp/main.tf#L69-L88
data "aws_route53_zone" "default" {
  count = local.enabled ? 1 : 0
  name  = "${var.stage}.${var.hosted_zone_suffix}"
}

module "sftp" {
  source  = "cloudposse/transfer-sftp/aws"
  version = "1.2.0"

  domain                 = var.domain
  sftp_users             = local.sftp_users
  s3_bucket_name         = data.aws_s3_bucket.default[local.default_global_s3_bucket_name_key].id
  restricted_home        = var.restricted_home
  force_destroy          = var.force_destroy
  address_allocation_ids = var.address_allocation_ids
  security_policy_name   = var.security_policy_name
  domain_name            = var.domain_name
  zone_id                = one(data.aws_route53_zone.default[*].id)
  eip_enabled            = var.eip_enabled
}
2024-05-28
How can I enable the EBS addon on an EKS cluster?
After adding it, I see an issue related to the IAM role: “AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity”
am i missing anything
@Jeremy G (Cloud Posse)
@Narayanaperumal Gurusamy Yes, the EBS addon requires an IAM Role with a specific name and with a specific policy attached. See how Cloud Posse does it here.
The Amazon EBS CSI plugin requires IAM permissions to make calls to AWS APIs on your behalf.
resource "aws_iam_role_policy_attachment" "aws_ebs_csi_driver" {
  count      = local.ebs_csi_sa_needed ? 1 : 0
  role       = module.aws_ebs_csi_driver_eks_iam_role.service_account_role_name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy"
}

module "aws_ebs_csi_driver_eks_iam_role" {
  source  = "cloudposse/eks-iam-role/aws"
  version = "2.1.1"

  enabled = local.ebs_csi_sa_needed

  eks_cluster_oidc_issuer_url = local.eks_cluster_oidc_issuer_url
  service_account_name        = "ebs-csi-controller-sa"
  service_account_namespace   = "kube-system"

  context = module.this.context
}
2024-05-29
2024-05-30
2024-05-31
v1.9.0-beta1 1.9.0-beta1 (May 31, 2024) NEW FEATURES:
Input variable validation rules can refer to other objects: Previously input variable validation rules could refer only to the variable being validated. Now they are general expressions, similar to those elsewhere in a module, which can refer to other input variables and to other objects such as data resources. templatestring function: a new built-in function which is similar to templatefile but designed to render templates obtained dynamically, such…
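A hedged sketch of the cross-referencing validation feature (requires the 1.9 beta; variable names and the condition are illustrative):

```hcl
variable "environment" {
  type = string
}

variable "instance_type" {
  type = string

  validation {
    # With Terraform 1.9, a validation condition may reference
    # other variables, not just the one being validated.
    condition     = var.environment != "prod" || startswith(var.instance_type, "m5")
    error_message = "Production environments must use m5-family instance types."
  }
}
```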
New aws tf provider (using cloud control api which auto generates its resources using the api) just went ga
https://registry.terraform.io/providers/hashicorp/awscc/latest/docs
#terraform people do you fancy getting your stack overflow points to go up
And help a fellow participant out. Take a look at my question here: https://stackoverflow.com/questions/78560433/this-bash-script-will-not-run-in-terraform
Happy to post here if you’re not interested in growing your points on Stack Overflow
I have this piece of Terraform code to run a cli bash script to configure a cluster for me, but Terraform will not run the script. The script runs fine, with no errors outside of Terraform. The Err…
what does gpt say
Haha
Did a search through archives, and couldn’t find confirmation….
I pin all versions of resources in most things. However, if you are building a module for your org, do you version pin the provider in the module too? I’m assuming no issues with this as plan would just download both, but I’m rusty having been digging into pulumi in the last year and can’t recall if version pinning in the module itself is a bad practice.
Even though I’m not active here much right now I still refer all folks to you I can cause still one of the most useful/expert communities I’ve joined. You all rock
no pinning of provider versions in reusable modules. just set the min version.
in the root module, pin everything and use .terraform.lock.hcl
For child modules, it’s best to specify the minimum required version of a provider: https://developer.hashicorp.com/terraform/language/modules/develop/providers#provider-version-constraints-in-modules
on the other hand, for module references, always pin exact versions
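To summarize the convention as one sketch (versions and module names are illustrative; the two `terraform` blocks belong in separate configurations):

```hcl
# versions.tf in a reusable child module: set only a minimum provider version.
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 5.0"
    }
  }
}

# versions.tf in the root module: pin providers exactly
# (and commit .terraform.lock.hcl).
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "5.50.0"
    }
  }
}

# In the root module, also pin module references to exact versions.
module "vpc" {
  source  = "cloudposse/vpc/aws" # illustrative module
  version = "2.1.0"              # exact pin
}
```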
nice. I can do that. Just have to go look at the renovate config settings and i’ll have our module project look at provider max (main version) and min being set.
yes, otherwise I always do exact version pinning and use renovate to process
just thought i remembered some issue with resuable modules and provider versions being exact being a problem but it’s been so long
yes, any given terraform state can only reference a single version of a provider at a time. so if you have two modules in the same config trying to exact-pin different versions of the same provider, it fails
do you’ll normally do just the minimum version instead of also the max range allowed? I see that in Cloudposse’s repos, but curious if you’ve seen benefit to do “between min and max versions” like >=3 , <= 4.0 when it’s on 3? Otherwise I’ll stick with minimum version and we’ll give up a little testability
i don’t see a ton of value in restricting the max version of the provider in reusable modules, since i’m pinning exact versions in the root config and also using dependabot/renovate against the root config to test updates to provider versions.
sure, the update may fail, but the project is still “healthy” and we can update the module for compatibility on our own pace
ok. I don’t have control downstream for consumers. Taking over part of the internal reusable modules and making sure I updated the renovate config to the correct setup, so I’ll look at disabling provider version updates for only that one Azure DevOps project.
should be able to use required_provider depType based on examples in github and override this one project to make sure it doesn’t try to pin those. great advice thank you!
what you can do in your reusable module to get an “early warning” of an issue, is write test configs as root modules and pin provider versions there, and use renovate to update the test configs
yes! Structure is aligned to normal cloudposse files in base, examples directory, so I can match this and ignore pinning anything in root directory which isn’t typical for root modules
of course, you need a test harness with credentials that can apply and destroy the test configs
yeah part of that is in place with Nuke and maybe I’ll replace with either the native new test features in terraform or just write it like I did before in Go with some terratest calls. Either way it’s on the list!
depending on the resources in use, localstack can be a reasonable option in tests to avoid real credentials and resources/costs
we have credentials for azure, but i’ll check. haven’t seen if localstack does azure.
What I want is to just use Pulumi and move on. I’m using it for all my independent projects, but it’s hard to get interest in it, so I’ll keep grumbling at Terraform
Hi all,
This is Ryan (ex Spacelift, now Resourcely).
We are looking to document and test integrating Resourcely with additional CICD and other terraform runners.
Resourcely is a configuration engine focusing on paved roads for infrastructure as code.
It’s kind of like an AWS Service Catalog, but 10x better and with support across cloud providers and platforms. You can define safe & sane blueprints, then present developers with a nice quick and clean interface for selecting those things, instead of the naked cloud console. Multi-cloud support out of the gate makes for a pretty compelling story, and users really like the security implications of default-deny when it comes to resources (developers can select from blessed service blueprints, which ensures they aren’t accidentally choosing a $30/hr instance type or an unapproved region or a software version that hasn’t been approved by the compliance folks, etc.).
Our team currently has documented a number of integrations with “terraform runners” / TACOS solutions found here: https://docs.resourcely.io/integrations/terraform-integration
We need to create some documentation and run a few end to end tests with more real world use cases tested. I’d love to figure out a way to make it beneficial to the community and any individuals who want to assist.
Is there anyone interested in collaborating with us?
Feel free to comment or send me a DM, then we can connect next week.