#refarch (2024-07)
Cloud Posse Reference Architecture
2024-07-02
Hi, how would I use assume_role_conditions in the iam-role module to set a condition to require an STS external ID for role assumption? https://github.com/cloudposse/terraform-aws-components/tree/main/modules/iam-role#input_assume_role_conditions
@Dan Miller (Cloud Posse)
Take a look at the module here: https://github.com/cloudposse/terraform-aws-iam-role/blob/main/variables.tf#L64-L78
Then you can list the conditions with test, variable, and values, following the Terraform resource documentation here:
https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/iam_policy_document#source_policy_documents
All conditions given in var.assume_role_conditions will be included in the IAM policy document, alongside the allowed actions defined by var.assume_role_actions:
variable "assume_role_actions" {
type = list(string)
default = ["sts:AssumeRole", "sts:TagSession"]
description = "The IAM action to be granted by the AssumeRole policy"
}
variable "assume_role_conditions" {
type = list(object({
test = string
variable = string
values = list(string)
}))
description = "List of conditions for the assume role policy"
default = []
}
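For example, to require an STS external ID, a minimal sketch of the module inputs might look like this (the external ID value, version pin, and other inputs are placeholders, not from this thread):

module "example_role" {
  source  = "cloudposse/iam-role/aws"
  version = "x.x.x" # pin to whichever version you are using

  # Require callers to present a matching external ID when assuming the role
  assume_role_conditions = [
    {
      test     = "StringEquals"
      variable = "sts:ExternalId"
      values   = ["example-external-id"] # placeholder external ID
    }
  ]

  # ... other inputs (principals, policy documents, context, etc.)
}

That should render a StringEquals condition on sts:ExternalId in the generated assume-role (trust) policy, alongside the actions from var.assume_role_actions.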
2024-07-03
@Marat Bakeev following up from office hours today
can you please summarize the issue with the webhook secret? I will rope in our SME on actions runner controller
cc @Jeremy G (Cloud Posse)
Sure. So, the issue is - we’re using actions-runner-controller. Our webhook logs state that there is no webhook secret configured:
2024-07-03T2022Z INFO -github-webhook-secret-token and GITHUB_WEBHOOK_SECRET_TOKEN are missing or empty. Create one following https://docs.github.com/en/developers/webhooks-and-events/securing-your-webhooks and specify it via the flag or the envvar
Yes, sorry, that is a known issue, too, should be fixed this week.
We have created the webhook and its secret in GitHub, and also placed the secret into SSM, and it's available there. But the controller-manager secret does not have the key for this webhook secret
ah, okay, no worries. thanks
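For reference, here is a rough sketch of copying an SSM-stored webhook secret into a Kubernetes secret for the ARC webhook server to read. All names, namespaces, and key names below are assumptions for illustration, not taken from this thread or from the component:

# Sketch only, not the Cloud Posse fix: copy the webhook secret from SSM
# into the Kubernetes secret the ARC webhook server is configured to read.
data "aws_ssm_parameter" "github_webhook_secret" {
  name = "/arc/github-webhook-secret-token" # assumed SSM parameter path
}

resource "kubernetes_secret" "arc_webhook" {
  metadata {
    name      = "controller-manager"    # assumed secret name referenced by the chart
    namespace = "actions-runner-system" # assumed namespace
  }

  data = {
    # assumed key name the webhook server is configured to read
    github_webhook_secret_token = data.aws_ssm_parameter.github_webhook_secret.value
  }
}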
2024-07-18
Do you guys have plans to update the ArgoCD version? I think the one you have enabled (2.5.9) has a security issue, plus later versions add support for ApplicationSet Progressive Syncs.
All our updates are (financially) sponsored by customers or open source contributors
We are open to updating everything :-)
@Michael maybe something interesting for you?
This is a great recommendation! I’ll take a look into it today!
@Marat Bakeev Just to confirm, this is in reference to the CloudPosse packages repositories?
(same ones you’re using, I think)
Created a GitHub issue so I can track the work on this! https://github.com/cloudposse/terraform-aws-components/issues/1079
Describe the Feature
Argo CD versions 0.1.0 through 2.10.0-rc1, 2.9.3, 2.8.7, and 2.7.15 are affected by CVE-2024-22424, a CSRF vulnerability that is exploitable when the attacker has the ability to write HTML to a page on the same parent domain as Argo CD.
Expected Behavior
Propose that we update the default value for Argo's chart from argo/argo-cd 5.19.12 (Argo CD v2.5.9) to an unaffected version (patched in 2.10-rc2, 2.9.4, 2.8.8, and 2.7.16).
Use Case
N/A
Describe Ideal Solution
Update default value for:
variable "chart_version" {
  type        = string
  description = "Specify the exact chart version to install. If this is not specified, the latest version is installed."
  default     = "5.19.12"
}
And validate it works as intended
Alternatives Considered
No response
Additional Context
No response
And here is the PR! https://github.com/cloudposse/terraform-aws-components/pull/1081
what and why
• Argo CD versions 0.1.0 through 2.10.0-rc1, 2.9.3, 2.8.7, and 2.7.15 are affected by CVE-2024-22424, a CSRF vulnerability that is exploitable when the attacker has the ability to write HTML to a page on the same parent domain as Argo CD.
• Propose that we update the default value for Argo's chart from argo/argo-cd 5.19.12 (Argo CD v2.5.9) to an unaffected version (patched in 2.10-rc2, 2.9.4, 2.8.8, and 2.7.16).
notable changes
• Argo CD 2.10 upgraded kubectl from 1.24 to 1.26. This upgrade introduced a change where client-side-applied labels and annotations are no longer preserved when using a server-side kubectl apply
• Note that bundled Helm version has been upgraded from 3.13.2 to 3.14.3
• Starting with Argo CD 2.10.11, the NetworkPolicy for the argocd-redis and argocd-redis-ha-haproxy dropped Egress restrictions. This change was made to allow access to the Kubernetes API to create a secret to secure Redis access
testing
• This version has been tested and verified to work with the existing component configuration
references
2024-07-25
@Marat Bakeev was this fully answered? https://github.com/orgs/cloudposse/discussions/12 Anything we can mark as the answer?
/github subscribe cloudposse/community discussions
:white_check_mark: Subscribed to cloudposse/community. This channel will receive notifications for issues, pulls, commits, releases, deployments, discussions
/github unsubscribe cloudposse/community pulls commits releases deployments issues
Spinning my wheels a bit on this one so figured I’d ask.
In the baseline steps there is a note:
The IAM User for SuperAdmin will be granted access to Terraform State by principal ARN. This ARN is passed to the tfstate-backend stack catalog under allowed_principal_arns. Verify that this ARN is correct now. You may need to update the root account ID.
And possibly related:
With the addition of support for dynamic Terraform roles, our baseline cold start refarch layer now depends on/requires that we have aws-teams and aws-team-roles stacks configured. This is because account-map uses those stacks to determine which IAM role to assume when performing Terraform in the account, and almost every other component uses account-map (indirectly) to choose the role to assume.
However, none of the steps in the baseline seem to provision the roles for accessing the tfstate. Tracing through, it looks like -var=access_roles_enabled=false prevents these roles from being created in the baseline tfstate backend workflow. The full deploy/tfstate workflow isn't run until later in the identity phase.
The result is that the atmos workflow deploy/accounts -f accounts workflow cannot run and does not create the account-map due to an error:
Error: error configuring S3 Backend: IAM Role (arn:aws:iam::1234567890:role/xxx-core-gbl-root-tfstate) cannot be assumed.
The role doesn’t exist so the error is clear, however, passing access_roles_enabled=true doesn’t work since the account-map needs to be created.
@Jeremy White (Cloud Posse)
@Erik Osterman (Cloud Posse) I don’t think we encountered this, no. But we didn’t start with a clean env, maybe it was already created for us when we were trying to reverse-engineer refarch ourselves
2024-07-26
I have deployed an AWS X-Ray DaemonSet on our cluster without any node selector or tolerations, but it's deployed only on a few nodes, not every node in the cluster. I also don't see any pods in a Pending state, as I would expect if resources were insufficient. Wanted to know your thoughts on what the issue could be. (screenshot)
@Jeremy G (Cloud Posse)
When a DaemonSet is first deployed, it is only deployed to Nodes that have enough resources for it, unless you set up a PriorityClass with preemption and assign the DaemonSet to that class. See https://kubernetes.io/docs/concepts/scheduling-eviction/pod-priority-preemption.
FEATURE STATE: Kubernetes v1.14 [stable] Pods can have priority. Priority indicates the importance of a Pod relative to other Pods. If a Pod cannot be scheduled, the scheduler tries to preempt (evict) lower priority Pods to make scheduling of the pending Pod possible. Warning: In a cluster where not all users are trusted, a malicious user could create Pods at the highest possible priorities, causing other Pods to be evicted/not get scheduled.
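As a rough sketch, assuming the Terraform kubernetes provider is in use, a PriorityClass like the following could be created; the name and priority value are illustrative, and the DaemonSet's pod spec would still need priorityClassName pointed at it (e.g. via the chart's values):

# Illustrative only: a PriorityClass that lets DaemonSet pods preempt
# lower-priority Pods so they can still be scheduled on already-full nodes.
resource "kubernetes_priority_class" "daemonset_priority" {
  metadata {
    name = "daemonset-priority" # assumed name
  }

  value       = 1000000 # well above the default priority of 0 for regular workloads
  description = "Priority class for DaemonSets that should preempt regular workloads"
  # Preemption defaults to PreemptLowerPriority, which is what allows eviction.
}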
2024-07-29
After completing nearly all of the identity steps in the refarch, I'm hitting an issue where atmos doesn't seem to be switching from the planner IAM role to the terraform IAM role in some workflows.
For example this fails with a few permission denials since the planner role cannot actually create resources (KMS keys, etc.):
terraform deploy cloudtrail -s core-gbl-root
I've pulled the above out of the deploy workflow from baseline.
Example error:
Error: creating KMS Key: operation error KMS: CreateKey, https response error StatusCode: 400, RequestID: x, api error AccessDeniedException: User: arn:aws:sts::x:assumed-role/x-core-gbl-root-planner/aws-go-sdk-1722266462231447631 is not authorized to perform: kms:TagResource because no identity-based policy allows the kms:TagResource action
The account-map seems to have the correct roles set for plan vs. apply.
No AWS Teams should have access to apply Terraform in the core-root account.
I see now that the managers Team does have terraform access in core-root. Do you know which AWS Team you have assumed before running Terraform?
Within your infra geodesic shell, run this to check:
√ . [foo-identity] (HOST) infrastructure ⨠ aws sts get-caller-identity
{
  "UserId": "ABCD1234:foo-identity",
  "Account": "1234567890",
  "Arn": "arn:aws:sts::1234567890:assumed-role/foo-core-gbl-identity-devops/foo-identity"
}
For example, here I am using the devops team, so I would only have planner access in core-root.
2024-07-30
Is there a way in the refarch to set up S3 event notifications to go to SNS/SQS? Can’t find anything in the documentation about it
@Dan Miller (Cloud Posse)
I don't believe we have anything existing for it, but it looks like it shouldn't be too hard to set up like this:
https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/s3_bucket_notification
I'd recommend creating a new component that pulls the aws_sns_topic.topic.arn and aws_s3_bucket.bucket.id by remote state from the s3-bucket / sns-topic components and then adds the notifications (rough sketch below). But of course there are many ways to do it.
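A minimal sketch of such a component, assuming Cloud Posse's remote-state module and typical s3-bucket / sns-topic component names; the module version and the bucket_id / sns_topic_arn output names are assumptions, so check the actual component outputs:

# Sketch only: a new component that subscribes an existing SNS topic to S3 events.
# Component names, module version, and output names below are assumptions.
module "s3_bucket" {
  source  = "cloudposse/stack-config/yaml//modules/remote-state"
  version = "1.5.0" # pin to the version used elsewhere in your stacks

  component = "s3-bucket"
  context   = module.this.context
}

module "sns_topic" {
  source  = "cloudposse/stack-config/yaml//modules/remote-state"
  version = "1.5.0"

  component = "sns-topic"
  context   = module.this.context
}

resource "aws_s3_bucket_notification" "this" {
  bucket = module.s3_bucket.outputs.bucket_id # assumed output name

  topic {
    topic_arn = module.sns_topic.outputs.sns_topic_arn # assumed output name
    events    = ["s3:ObjectCreated:*"]
  }
}

Note that the SNS topic's access policy also has to allow S3 (s3.amazonaws.com) to publish to it, which may or may not already be handled by the sns-topic component.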