#aws (2022-04)
Discussion related to Amazon Web Services (AWS)
Archive: https://archive.sweetops.com/aws/
2022-04-01
Hey all, how does everyone manage the initial IAM Role creation at your organizations?
Specifically, we’ve got a multi-account structure and are looking to set up some Organization Access Roles that we can use to create other IAM Roles with more restrictive permissions. It seems like this will need to be run from a developer/administrator laptop using IAM User credentials, but I’m curious how others approach initial IAM Role setup to enable developers in new AWS Accounts.
lambda that triggers on the CreateAccount event, assumes the role created by Organizations in the new account, and creates an initial role
AWS Organizations + SSO
Initially we need to create the roles, but after that it’s automatic through a third-party integration
@loren where do you run that Lambda? For us, I think we had a one-time setup done with a partner who helped us design our AWS org. They set this kind of stuff up initially from their laptop, and then tore down the over-privileged users. We still have the IaC for it, but it’s not at a point where it could run in a pipeline. The problem for us now is figuring out how we maintain some of these more foundational pieces. We don’t make changes very often, but things do change.
it runs in the Organizations account
That’s what I thought just didn’t want to assume
it kinda has to, because the role that Organizations creates in the new account only trusts the Organizations account…
Right — it’s a good idea - do you follow this approach yourself or is it just an idea you had to address the question?
we actually use it https://github.com/plus3it/terraform-aws-org-new-account-iam-role
we also have a simple “prereq” iam role config in terraform where we follow up and import the role that gets created that way, to maintain positive control over it
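For reference, a minimal Terraform sketch of the trigger side of that pattern (resource names and the Lambda itself are illustrative; the plus3it module linked above is a complete implementation):

resource "aws_cloudwatch_event_rule" "new_account" {
  name        = "org-create-account"
  description = "Fire when Organizations creates or invites an account"

  event_pattern = jsonencode({
    source      = ["aws.organizations"]
    detail-type = ["AWS API Call via CloudTrail"]
    detail = {
      eventSource = ["organizations.amazonaws.com"]
      eventName   = ["CreateAccount", "InviteAccountToOrganization"]
    }
  })
}

resource "aws_cloudwatch_event_target" "new_account" {
  rule = aws_cloudwatch_event_rule.new_account.name
  # Hypothetical Lambda that assumes the role Organizations created in the new
  # account and creates the initial IAM role there.
  arn  = aws_lambda_function.create_initial_role.arn
}

resource "aws_lambda_permission" "allow_eventbridge" {
  statement_id  = "AllowEventBridgeInvoke"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.create_initial_role.function_name
  principal     = "events.amazonaws.com"
  source_arn    = aws_cloudwatch_event_rule.new_account.arn
}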
2022-04-04
2022-04-05
Follow-up question to the above. We also use AWS Org + SSO. When we create a new account, we obviously get a root user in that new account. To be CIS compliant we need to enable MFA (ideally hardware MFA) for that account. Has anyone managed to automate that, ideally with Terraform?
just silence that CIS alert
Does this module meet your requirements? (fair warning I’ve never used it) https://github.com/terraform-module/terraform-aws-enforce-mfa
Enforce MFA policy creation and enforcing on groups.
I haven’t seen any module or automation that does this (but haven’t looked far). That said… Doesn’t look like it would be impossible though?
https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_virtual_mfa_device
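For what it’s worth, declaring the device itself is straightforward; the hard part is that enabling it for the root user requires entering two consecutive TOTP codes, which Terraform can’t do. A minimal sketch, assuming the aws_iam_virtual_mfa_device resource (device name is illustrative):

resource "aws_iam_virtual_mfa_device" "root" {
  virtual_mfa_device_name = "root-account-mfa" # illustrative name
  # The resource exports base_32_string_seed and qr_code_png, but the
  # EnableMFADevice step (associating the device with the root user) still
  # has to be done manually or scripted outside of Terraform.
}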
cc @matt as he would likely know and I’m sure he’s run into this before.
Thanks
@Soren Jensen if you come up with a solution using Terraform — I’d love to hear about it!
@Erik Osterman (Cloud Posse) Do you have any experience with this?
I believe we use our terraform-provider-awsutils to silence the hardware requirement since it’s not our best practice. Our best practice is putting the MFA into 1Password for business continuity.
package securityhub
import (
"fmt"
"log"
"github.com/aws/aws-sdk-go/aws"
"github.com/aws/aws-sdk-go/service/securityhub"
"github.com/cloudposse/terraform-provider-awsutils/internal/conns"
"github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema"
)
func ResourceSecurityHubControlDisablement() *schema.Resource {
return &schema.Resource{
Description: `Disables a Security Hub control in the configured region.
It can be useful to turn off security checks for controls that are not relevant to your environment. For example, you
might use a single Amazon S3 bucket to log your CloudTrail logs. If so, you can turn off controls related to CloudTrail
logging in all accounts and Regions except for the account and Region where the centralized S3 bucket is located.
Disabling irrelevant controls reduces the number of irrelevant findings. It also removes the failed check from the
readiness score for the associated standard.`,
Create: resourceAwsSecurityHubControlDisablementCreate,
Read: resourceAwsSecurityHubControlDisablementRead,
Update: resourceAwsSecurityHubControlDisablementUpdate,
Delete: resourceAwsSecurityHubControlDisablementDelete,
SchemaVersion: 1,
Schema: map[string]*schema.Schema{
"id": {
Description: "The ID of this resource.",
Type: schema.TypeString,
Computed: true,
},
"control_arn": {
Description: "The ARN of the Security Hub Standards Control to disable.",
Type: schema.TypeString,
ForceNew: true,
Required: true,
},
"reason": {
Description: "The reason the control is being disabled.",
Type: schema.TypeString,
Optional: true,
Default: "",
},
},
}
}
func resourceAwsSecurityHubControlDisablementCreate(d *schema.ResourceData, meta interface{}) error {
conn := meta.(*conns.AWSClient).SecurityHubConn
controlArn := d.Get("control_arn").(string)
reason := d.Get("reason").(string)
input := &securityhub.UpdateStandardsControlInput{
StandardsControlArn: &controlArn,
ControlStatus: aws.String("DISABLED"),
}
if reason != "" {
input.DisabledReason = &reason
}
if _, err := conn.UpdateStandardsControl(input); err != nil {
return fmt.Errorf("error disabling security hub control %s: %s", controlArn, err)
}
d.SetId(controlArn)
return resourceAwsSecurityHubControlDisablementRead(d, meta)
}
func resourceAwsSecurityHubControlDisablementRead(d *schema.ResourceData, meta interface{}) error {
conn := meta.(*conns.AWSClient).SecurityHubConn
controlArn := d.Get("control_arn").(string)
control, err := FindSecurityHubControl(conn, controlArn)
if err != nil {
return fmt.Errorf("error reading security hub control %s: %s", controlArn, err)
}
log.Printf("[DEBUG] Received Security Hub Control: %s", control)
if !d.IsNewResource() && *control.ControlStatus != "DISABLED" {
log.Printf("[WARN] Security Hub Control (%s) no longer disabled, removing from state", d.Id())
d.SetId("")
return nil
}
if err := d.Set("reason", control.DisabledReason); err != nil {
return err
}
return nil
}
func resourceAwsSecurityHubControlDisablementUpdate(d *schema.ResourceData, meta interface{}) error {
if d.HasChanges("reason") {
conn := meta.(*conns.AWSClient).SecurityHubConn
controlArn := d.Get("control_arn").(string)
_, new := d.GetChange("reason")
reason := new.(string)
input := &securityhub.UpdateStandardsControlInput{
StandardsControlArn: &controlArn,
ControlStatus: aws.String("DISABLED"),
DisabledReason: aws.String(reason),
}
if _, err := conn.UpdateStandardsControl(input); err != nil {
return fmt.Errorf("error disabling security hub control %s: %s", controlArn, err)
}
}
return nil
}
func resourceAwsSecurityHubControlDisablementDelete(d *schema.ResourceData, meta interface{}) error {
conn := meta.(*conns.AWSClient).SecurityHubConn
controlArn := d.Get("control_arn").(string)
input := &securityhub.UpdateStandardsControlInput{
StandardsControlArn: &controlArn,
ControlStatus: aws.String("ENABLED"),
DisabledReason: nil,
}
if _, err := conn.UpdateStandardsControl(input); err != nil {
return fmt.Errorf("error updating security hub control %s: %s", controlArn, err)
}
return nil
}
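If I’m reading the schema above correctly, usage from Terraform looks roughly like this (the control ARN is illustrative, and the resource name should be double-checked against the provider docs):

resource "awsutils_security_hub_control_disablement" "cis_1_14" {
  # Example CIS control ARN; substitute the control you want to silence.
  control_arn = "arn:aws:securityhub:us-east-1:111111111111:control/cis-aws-foundations-benchmark/v/1.2.0/1.14"
  reason      = "Root MFA is a virtual device stored in 1Password for business continuity"
}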
Very good, thanks a million.. Happy to know it’s not a problem unique to us, and that we are following best practice with 1Password.
Hi everyone! I just joined your Slack. I am using your module and am new to Terraform. Can someone please give me a hint: can I use a rate/cron expression with the module you provide? https://github.com/cloudposse/terraform-aws-cloudwatch-events/tree/0.5.0 Thank you very much
I’d post this over in #terraform and try to explain your problem more, e.g. what are you trying to accomplish? Where do you want to use rate, etc.
Has anyone had any issues trying to EXEC into a Fargate instance? I’m getting the following error and our team is pretty stumped with this one.
An error occurred (TargetNotConnectedException) when calling the ExecuteCommand operation: The execute command failed due to an internal error. Try again later.
We can exec into 6 of the 7 ECS clusters we have. The only material difference with this cluster is that we are using an NLB instead of an ALB … This is a recent issue for us, without any changes to our infrastructure
I’ve never seen that. I would imagine it’s an issue with the Fargate host — Old version or something along those lines.
Do you have business support? I’d open a ticket with AWS.
I’ve dealt with this problem with Systems Manager Session Manager; it happens to me whenever the instance (EC2 in my case) is not ready yet, e.g. still in an initializing state
2022-04-06
Hello, team!
Howdy y’all… question. I’m using the terraform-aws-ec2-instance module and want to add some lifecycle –> ignore_changes options so boxes don’t rebuild.
I went and created these options and was going to do a pull request and found that you cannot add variables inside the lifecycle stanza. So… how are people getting around this?
creating two resources with count - one with ignore_changes enabled, the other with ignore_changes disabled
That’s ugly. sigh
yes, not fun
been open for 2 1/2 years too!
Terraform Version
Terraform v0.12.6
Terraform Configuration Files
locals {
test = true
}
resource "null_resource" "res" {
lifecycle {
prevent_destroy = locals.test
}
}
terraform {
required_version = "~> 0.12.6"
}
Steps to Reproduce
terraform init
Description
The documentation notes that
[…] only literal values can be used because the processing happens too early for arbitrary expression evaluation.
so while I’m bummed that this doesn’t work, I understand that I shouldn’t expect it to.
However, we discovered this behavior because running terraform init
failed where it had once worked. And indeed, if you comment out the variable reference in the snippet above, and replace it with prevent_destroy = false
, it works - and if you then change it back it keeps working.
Is that intended behavior? And will it, if I do this workaround, keep working?
Debug Output
λ terraform init
2019/08/21 15:48:54 [INFO] Terraform version: 0.12.6
2019/08/21 15:48:54 [INFO] Go runtime version: go1.12.4
2019/08/21 15:48:54 [INFO] CLI args: []string{"C:\\Users\\Tomas Aschan\\scoop\\apps\\terraform\\current\\terraform.exe", "init"}
2019/08/21 15:48:54 [DEBUG] Attempting to open CLI config file: C:\Users\Tomas Aschan\AppData\Roaming\terraform.rc
2019/08/21 15:48:54 [DEBUG] File doesn't exist, but doesn't need to. Ignoring.
2019/08/21 15:48:54 [INFO] CLI command args: []string{"init"}
There are some problems with the configuration, described below.
The Terraform configuration must be valid before initialization so that
Terraform can determine which modules and providers need to be installed.
Error: Variables not allowed
on main.tf line 7, in resource "null_resource" "res":
7: prevent_destroy = locals.test
Variables may not be used here.
Error: Unsuitable value type
on main.tf line 7, in resource "null_resource" "res":
7: prevent_destroy = locals.test
Unsuitable value: value must be known
Thank you.
this is an example of how we did it
# Because create_before_destroy is such a dramatic change, we want to make it optional.
# WARNING TO MAINTAINERS: both node groups should be kept exactly in sync
see the comments
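A minimal sketch of the count-toggle pattern for the ignore_changes case discussed above (resource and variables are illustrative; as the warning says, both branches must be kept exactly in sync):

variable "ignore_ami_changes" {
  type    = bool
  default = true
}

# Exactly one of these two otherwise-identical resources is created.
resource "aws_instance" "default" {
  count         = var.ignore_ami_changes ? 0 : 1
  ami           = var.ami_id        # hypothetical inputs
  instance_type = var.instance_type
}

resource "aws_instance" "ignore_ami" {
  count         = var.ignore_ami_changes ? 1 : 0
  ami           = var.ami_id
  instance_type = var.instance_type

  lifecycle {
    # The only difference between the two blocks: lifecycle takes literals
    # only, so it can't be driven by a variable directly.
    ignore_changes = [ami]
  }
}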
good technique, i’ll have to stir on it a bit
appreciate the quick and detailed response
Can anyone help me set up Airflow on EKS using Terraform?
Organizations are adopting microservices architectures to build resilient and scalable applications using AWS Lambda. These applications are composed of multiple serverless functions that implement the business logic. Each function is mapped to API endpoints, methods, and resources using services such as Amazon API Gateway and Application Load Balancer. But sometimes all you need is a […]
excited
loving it
2022-04-07
Dotnet Lambda runtime version error. Hi all, we are running a serverless Lambda application on the dotnetcore2.1 runtime (deployed through dotnet lambda deploy-serverless). A couple of days back, a person on our team accidentally tried to delete the CloudFormation stack used to deploy this application. As he did not have the required permissions, the stack status changed to DELETE_FAILED, and with the stack in DELETE_FAILED we were no longer able to update the application through CloudFormation, so our deployment failed. We deleted the stack manually and redeployed using dotnet lambda deploy-serverless, but got the following error: Resource handler returned message: “The runtime parameter of dotnetcore2.1 is no longer supported for creating or updating AWS Lambda functions. We recommend you use the new runtime (dotnet6) while creating or updating functions.”
As AWS was not allowing us to create a Lambda with dotnetcore2.1, we changed the code to dotnetcore3.1 in a separate git branch and deployed again, which worked. There were still some bugs in the 3.1 code, so we were still debugging. But yesterday one of our developers deployed from a different branch which still had the stack runtime as 2.1, and the runtime changed from 3.1 back to 2.1. This confused us: AWS says that after end of support it is not possible to create, update, or roll back to an unsupported runtime (https://docs.aws.amazon.com/lambda/latest/dg/runtime-support-policy.html). Yet while CloudFormation did not allow us to create the application with runtime dotnetcore2.1, it did allow us to change the running dotnetcore3.1 application back to dotnetcore2.1. We tested the same with Python: we deployed an application running python3.9 and then changed the runtime to the unsupported version 2.7. It did not allow us to create in 2.7, but it did allow changing the runtime from 3.9 to 2.7.
Our question: since we are back to running our application on dotnetcore2.1 and code updates still work, is it fine to continue with it for some time, or will AWS one day suddenly stop allowing code updates, meaning it is better to move our application to dotnetcore3.1 now?
Learn how Lambda deprecates runtimes and which runtimes are reaching end of support.
Hey, question.. Are you using custom NAT instances for private subnets? We have quite big spending on outbound traffic via NAT gateways, so I’m wondering if custom NAT instances could do the work too. Of course it’s possible, but is there any “cloud-native” way? Routing via Squid deployed in Kubernetes with observability support, or some VyOS auto scaling group? Ideas? Thanks
This is promising… https://github.com/AndrewGuenther/fck-nat/issues/8
If you’re interested in high availability support for fck-nat, follow this issue. All PRs and subtasks will be linked here. This issue will not be closed until the 1.2 release of fck-nat launches.
Nat instance would do just fine… I’m using it in my apps, so far no problem
Our subnet modules support a feature flag which can toggle between gateways/instances
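If it helps, the toggle on cloudposse/dynamic-subnets/aws looks roughly like this (variable names from memory; verify against the module’s README for your pinned version):

module "subnets" {
  source = "cloudposse/dynamic-subnets/aws"
  # version = "x.y.z"  # pin as appropriate

  availability_zones = ["us-east-1a", "us-east-1b"] # illustrative
  vpc_id             = module.vpc.vpc_id
  igw_id             = module.vpc.igw_id
  cidr_block         = module.vpc.vpc_cidr_block

  nat_gateway_enabled  = false      # no managed NAT gateways
  nat_instance_enabled = true       # run NAT instances instead
  nat_instance_type    = "t3.micro" # size to your outbound traffic
}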
2022-04-10
2022-04-11
2022-04-13
“Scaling containers on AWS in 2022” is out and may be of interest to y’all!
Comparing how fast containers scale up in 2022 using different orchestrators on AWS
This is a must read!
Comparing how fast containers scale up in 2022 using different orchestrators on AWS
The Lambda scaling up to 3k only seems like it hits the account limit for burst concurrency: https://docs.aws.amazon.com/lambda/latest/dg/invocation-scaling.html
If you pay enough, this limit can be changed
nice write up @Vlad Ionescu (he/him)
As requested, here’s the link to the Containers from the Couch livestream I’ll be doing later: https://twitter.com/rothgar/status/1514627176804470787
Today on #ContainerFromTheCouch @realadamjkeller and I will be talking with @iamvlaaaaaaad about his awesome post about scaling containers
1200PT/1500ET
http://twitch.tv/aws or https://youtu.be/WmdauESk5JA
https://www.vladionescu.me/posts/scaling-containers-on-aws-in-2022
2022-04-14
EventBridge vs SQS vs SNS: can someone please tell me why EventBridge is "better" than SQS from the architectural point of view, and maybe from a development/code point of view? I got asked some questions yesterday about details like messages/events per second etc.
I knew I should have saved that flowchart I saw within the past day that covered this… I’m trying to find it now
Took a bit of hunting but found it.
is that from a blog? something I can read the reasons why?
I had found it posted on LinkedIn within the AWS certification community
I looked back and the post didn’t link to a blog post, it was just the image shared by another Cloud/DevOps Engineer
no problem
I had just come across it last night and thought it was interesting, then you asked about it… kismet
What do folks use as an observability platform when you are in a situation with multiple arch patterns?
• containers running in a single EC2
• serverless pattern ( high number of invocations)
• rds / AMQ / Elasticache / DDB
We have tried New Relic but that was too expensive. We are now on Datadog and it is okay for now (it fixed a huge problem with devs having access to prod for troubleshooting etc.), especially profiling, but it is getting very expensive: each EC2 + container hours + logs ingested (not too bad) + logs indexed (scream! also because we have garbage and log every single line).
Going in-house with maybe AMP/AMG could be an option, but then we’d have 2 different UIs and have to correlate things…
I feel like any SaaS solution, be it honeycomb.io / Dynatrace / logz.io etc., won’t cut it unless we sort out the garbage going in…
i know @Vlad Ionescu (he/him) has experience with honeycomb and might be able to speak to the pricing some…
Does Datadog allow you to drop logs before they get indexed? I know new relic and papertrail (I think logz.io too) allow you to do that for free so you don’t have to update your apps but you can still drop useless stuff without being charged for it.
Does Datadog allow you to drop logs before they get indexed?
yes, and that is what we do today. However, to be successful with that, IMO you need consistent data coming in - i.e. good-quality log severity - and it’s getting hard to manage in the long run
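For what it’s worth, those exclusion filters can also live in Terraform via the Datadog provider, which makes the “drop before indexing” rules reviewable alongside everything else. A sketch, assuming the datadog_logs_index resource (index name and queries are illustrative):

resource "datadog_logs_index" "main" {
  name = "main"

  filter {
    query = "*"
  }

  # Exclude noisy debug logs from indexing (they are still ingested/archived).
  exclusion_filter {
    name       = "drop-debug"
    is_enabled = true

    filter {
      query       = "status:debug"
      sample_rate = 1.0 # exclude 100% of matching logs
    }
  }
}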
There are 2 providers that ~don’t shit on their customers~ actively help customers send them less data: Honeycomb and Lightstep. That’s it! It’s… a strong signal.
@Vlad Ionescu (he/him) thanks for taking the time to chime in.
Honeycomb
~do you have any exp with them? I’ve looked and while it looks impressive it doesn’t look like I can get the basic infra monitoring out - i.e. disk usage / CPU / mem of the host (be that K8s or just a vanilla EC2) etc~ With the OTEL collector it looks like I can achieve the above
Lightstep
I have tried them before ServiceNow bought them; my understanding is that SNOW will try to integrate it and sell it as part of the SNOW platform, in which case it won’t work for us, as we use Jira (it’s cheaper and good enough)
Hey there, I am using the CloudPosse terraform modules cloudposse/vpc/aws and cloudposse/multi-az-subnets/aws. I have two CIDR ranges in the VPC: 10.20.0.0/22 and 10.21.0.0/18. The /22 is for public subnets in the VPC and the /18 is for private subnets. When I run the terraform, the private subnets fail to create. I am able to create them manually in AWS, however. What is the limitation here?
Can you post your config?
@David Spedzia - You need to create subnets from your VPC CIDR range 10.21.0.0/18. Assigning the CIDR ranges 10.21.0.0/18, 10.21.64.0/18, 10.21.128.0/18 and 10.21.192.0/18 to AWS subnets will fail, as they are not subnets of 10.21.0.0/18 - they are subnets of 10.21.0.0/16. I think what you want is /20 CIDR ranges for the four AWS subnets: 10.21.0.0/20, 10.21.16.0/20, 10.21.32.0/20 and 10.21.48.0/20.
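The arithmetic is easier to see with cidrsubnet(): carving a /18 into four /20s means adding 2 bits, e.g.:

locals {
  private_cidr = "10.21.0.0/18"

  # cidrsubnet(prefix, newbits, netnum): /18 + 2 new bits = /20
  private_subnet_cidrs = [for i in range(4) : cidrsubnet(local.private_cidr, 2, i)]
  # => ["10.21.0.0/20", "10.21.16.0/20", "10.21.32.0/20", "10.21.48.0/20"]
}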
Thanks @tim.j.birkett, we use two CIDR ranges in the VPC as the module allows for that, and it works in a test deploy, so it looks like an issue with my code. I have to refactor the whole thing anyway because we are changing how we deploy this resource.
2022-04-15
2022-04-16
Hi All. Does anyone know an easy way to make an S3 bucket policy that allows access to the bucket from any account in the AWS Organisation but no one outside the Org? As far as I can see it’s only possible by listing all the account IDs.
AWS Identity and Access Management (IAM) now makes it easier for you to control access to your AWS resources by using the AWS organization of IAM principals (users and roles). For some services, you grant permissions using resource-based policies to specify the accounts and principals that can access the resource and what actions they can […]
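With that condition key the policy stays small, roughly (org ID, bucket, and actions are placeholders):

resource "aws_s3_bucket_policy" "org_only" {
  bucket = aws_s3_bucket.shared.id # hypothetical bucket

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Sid       = "AllowOrgAccountsOnly"
      Effect    = "Allow"
      Principal = "*"
      Action    = ["s3:GetObject", "s3:ListBucket"]
      Resource  = [aws_s3_bucket.shared.arn, "${aws_s3_bucket.shared.arn}/*"]
      Condition = {
        StringEquals = {
          "aws:PrincipalOrgID" = "o-xxxxxxxxxx" # your Organization ID
        }
      }
    }]
  })
}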
Fantastic, thanks a million.. Not sure what I did wrong in my google search not to find that on my own..
Okay, so just a quick follow-up, as I learnt a lot over the last few days. First of all, it’s not possible to use a central S3 bucket for S3 server access logs across accounts - every account needs its own logging bucket. I opened a ticket with AWS Support and spoke with some S3 SMEs; the only way to do centralised logging is to log within each account and replicate that logging bucket to the central logging bucket.
I opened a feature request, so maybe some day it will be possible.
ooooh true statement. i ran into that problem before once. good times.
It sure was an interesting experience..
encrypting buckets with centralized kms keys is another fun one. cloudtrail, config, guardduty, etc, etc. all slightly different configs. such a lovely experience.
Oh yes..
2022-04-18
2022-04-21
question.. we are using spot instances and sometimes the termination handler cordons a node incorrectly, with the warning WRN All retries failed, unable to complete the uncordon after reboot workflow error=”timed out waiting for the condition”
If I SSH to the node I can get the metadata… what could the issue be?
It is possible that it was rate limited
Because your instance metadata is available from your running instance, you do not need to use the Amazon EC2 console or the AWS CLI. This can be helpful when you’re writing scripts to run from your instance. For example, you can access the local IP address of your instance from instance metadata to manage a connection to an external application.
(IMDS rate limit)
oh. thanks
2022-04-22
2022-04-23
Does anyone else here use AWS’s managed Prometheus offering? I currently have it setup (along with Grafana) to just run on my nodes, but have been wondering if it’s worth moving over from a cost-maintenance ROI perspective.
I use it, managed Prometheus with Grafana. I have this question too; I haven’t researched it enough to tell you if it’s worth it. Bump.
It depends on how you use it. Some of our workload is on AMP and a large part isn’t due to cost (and it’s totally our fault)
Sorry for resurrecting an old thread. @msharma24 and I have been battling trying to get this to work for weeks now. We’ve spun up AMP. We have Grafana running in one of our own pods, and can see the metrics in AMP just fine. However, when we try to use the alert manager in AMP, nothing ever triggers. Even the deadman event isn’t emitted.
Has anyone successfully set this up or have any sage advice on what could be wrong? Testing the pipeline from injecting messages to SNS works just fine.
Attached the definition and rules files for reference
2022-04-25
Hi, is there any functional Terraform example for creating an Amazon Managed Workflows for Apache Airflow (MWAA) environment?
Terraform module to provision Amazon Managed Workflows for Apache Airflow (MWAA)
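If you’d rather use the provider directly, the core resource is aws_mwaa_environment; a bare-bones sketch (names, bucket, role, and subnets are placeholders; the module above also wires up the S3 and IAM pieces):

resource "aws_mwaa_environment" "this" {
  name               = "example-airflow"          # placeholder
  airflow_version    = "2.2.2"                    # pick a supported version
  dag_s3_path        = "dags/"
  source_bucket_arn  = aws_s3_bucket.airflow.arn  # hypothetical bucket holding DAGs
  execution_role_arn = aws_iam_role.mwaa.arn      # hypothetical execution role

  network_configuration {
    security_group_ids = [aws_security_group.mwaa.id]
    subnet_ids         = slice(module.subnets.private_subnet_ids, 0, 2) # MWAA needs exactly two private subnets
  }
}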
thank you @RB
2022-04-27
GM folks, I am wondering if anyone has thoughts on attaching multiple services to a single ALB using host-based routing vs. an ALB per app or service. Also, what do you do more often than not that works best? TIA
Hi - what might the app or service be? EC2 instances, Fargate? Something else?
They are all fargate
I’ve implemented the same with path-based routing and it works pretty well; host-based should work just as well
Yeah, I was looking at that lol.
Host / path based over ALB per service unless you’ve got a really good reason. ALBs are not cheap if you stack them, so keeping to one if you can is the right call.
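For anyone going the single-ALB route, host routing is just a listener rule per service pointing at that service’s target group, e.g. (hostnames, listener, and target groups are illustrative):

resource "aws_lb_listener_rule" "service_a" {
  listener_arn = aws_lb_listener.https.arn # hypothetical shared HTTPS listener
  priority     = 10                        # must be unique per rule

  condition {
    host_header {
      values = ["service-a.example.com"]
    }
  }

  action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.service_a.arn # Fargate service A's target group
  }
}

# Repeat with a different priority and host for each additional Fargate service.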
Perfect! Thank you guys so much!
2022-04-28
Do you have issues with RDS? We are currently facing connection issues with Aurora Serverless v1.
2022-04-29
Anyone got any thoughts on this module? It has about a person-year of effort from two AWS engineers. It seems reasonably flexible and well thought through based on the video below, and is AWS’ response to the hell of managing EKS clusters. I’m aware that AWS’ Terraform modules have generally had breaking changes, but it seems like they’re invested in maintaining this one.
https://www.youtube.com/watch?v=TXa-y-Uwh2w https://github.com/aws-ia/terraform-aws-eks-blueprints
Configure and deploy complete EKS clusters.
interesting. it uses Anton Babenko’s terraform-aws-modules/eks/aws module
Configure and deploy complete EKS clusters.
some interesting patterns here.. thanks for sharing
2022-04-30
I’m trying to add multiple rules to a cloudposse security group with a rules block that looks like this:
rules = [
{
key = "HTTP"
type = "ingress"
from_port = 5050
to_port = 5050
protocol = "tcp"
cidr_blocks = module.subnets.public_subnet_cidrs
self = null
description = "Allow HTTP from IPs in our public subnets (which includes the ALB)"
},
{
key = "SSH"
type = "ingress"
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
self = null
description = "Allow SSH from all IPs"
}
]
This is failing with:
Error: Invalid value for module argument. The given value is not suitable for child module variable “rules” defined at .terraform/modules/project_module.sg/variables.tf:60,1-17: element types must all match for conversion to list.
And the problem is the cidr_blocks. If I replace the first one with ["0.0.0.0/0"] it works. I see that the output from the aws-dynamic-subnets module is aws_subnet.public.*.cidr_block. The current value of the cidr_blocks variable in the resource is ["172.16.96.0/19", "172.16.128.0/19"], which sure looks like a list of strings to me. When I open terraform console and ask for the type of public_subnet_cidrs, I just get dynamic. I’ve tried wrapping the output in tolist() and adding an empty string to the cidr_blocks array in the second ingress rule, but neither changes the error.
Anybody have any idea what I’m doing wrong here?
I’m trying to add multiple rules to a cloudposse security group. Here is the relevant code: module "subnets" { source = "cloudposse/dynamic-subnets/aws" version = "0.39.8&