#aws (2024-06)
Discussion related to Amazon Web Services (AWS)
Archive: https://archive.sweetops.com/aws/
2024-06-03
Hi guys, I’m preparing a slide for a training session for some mentees. I want to present this info to explain the AWS Developer Tools in a single chart, along with this summary:
• CodeStar <- Just an interface to manage several pipelines
• Cloud9 <- Just an IDE, like VSCode but in the cloud
• CodeBuild <- Similar to Github Actions
• CodePipeline <- A group of codebuilds/codedeploys
• CodeDeploy <- To deploy your code, usually to move your code from S3 to EC2
• CodeCommit <- Like GitHub
My question is: does this make sense to you? Can I make it clearer? What would you change?
i feel like CodeBuild is actually pretty strong, though certainly feels different than github actions. CodeCommit is pretty awful/weak though
i have loads of use cases where i prefer codebuild over github actions, particularly anything that requires credentials or network-level access to vpcs
yeah I think CodeBuild is a lot more powerful than GitHub Actions…it’s just that you have to build out everything. So in turn, GitHub Actions is easier to get started with and covers most scenarios for most teams and therefore is “better” in most cases.
there’s even a slick new feature that blurs the lines, where your github action runs the codebuild job directly
Ok, we can probably say that CodeBuild is harder, but not weaker
i mean, is it hard to run a shell command? that’s basically what a buildspec is doing. just a bunch of shell commands
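For illustration, a minimal sketch of a Terraform-managed CodeBuild project whose buildspec really is just a couple of shell commands (the project name, role reference, and commands here are hypothetical):

    resource "aws_codebuild_project" "example" {
      name         = "example-build"             # hypothetical
      service_role = aws_iam_role.codebuild.arn  # assumed to exist elsewhere

      artifacts {
        type = "NO_ARTIFACTS"
      }

      environment {
        compute_type = "BUILD_GENERAL1_SMALL"
        image        = "aws/codebuild/standard:7.0"
        type         = "LINUX_CONTAINER"
      }

      # a vpc_config block is where the VPC-level network access mentioned above would go

      source {
        type      = "NO_SOURCE"
        buildspec = <<-YAML
          version: 0.2
          phases:
            build:
              commands:
                - echo "just a bunch of shell commands"
                - ./run-tests.sh
        YAML
      }
    }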
Ok, so we can probably remove that comment, to avoid bias
i suppose if you think of github actions as a library of published “modules” that are on their own running some defined set of shell commands, then maybe so. but then of course you run into the situation where the action doesn’t support your use case. and now you have to fix the upstream, or fork it and run your own, or fall back to shell commands anyway
yeah that could be a relevant thing
just wanted to say thanks for sharing your thoughts. i never used any aws ci/cd tools, i always thought “why would i besides github actions”, but your comments made me very curious.
The secret is that most of these tools have parity in one way or another, they’re just aligned and opinionated to a sub-set of use cases. Except Jenkins, that’s fairly universally disliked.
For me the biggest advantage of GH Actions is the reusability of actions and workflows (let’s not talk about security right now) and also that someone else is managing the infrastructure .
hah, you just hit on the two biggest reasons that teams choose not to use GitHub Actions.
GHA marketplace is a significant supply chain attack risk. The best mitigation is to enforce version pinning (to the hash, not just the version tag), but brand-jacking and typo-squatting are possible too. People rarely actually check the code of the action they’re using. People assume nobody would put malicious code into OSS, <insert obligatory xz reference>.
Managed infrastructure is great…until you realize how much cheaper it can be at scale to run “your own”. I had one pipeline that was going to be like $4k/month on GHA (it needed the largest workers they offered); I set up self-hosted runners with an EC2 Spot fleet and it was like $200/month. That and BYO runners can immensely speed up pipeline run times on occasion.
That’s why I was jokingly saying let’s not talk about security . But I didn’t necessarily mean marketplace actions. I just meant the ability to easily reuse actions and workflows. The cost is different for different orgs. In my case it’s much cheaper than your anecdote.
I get it, I was just expanding for the lurkers
2024-06-04
Hello! How can I add custom Kafka server configs for the msk-apache-kafka-cluster Terraform module? e.g.: kafka_configuration_properties = { "auto.create.topics.enable": true }
Error: Unsupported argument

  on main.tf line 91, in module "kafka":
  91: kafka_configuration_properties = {

An argument named "kafka_configuration_properties" is not expected here.
which version of the module is used?
source  = "cloudposse/msk-apache-kafka-cluster/aws"
version = "2.4.0"
yeah, kafka_configuration_properties is not a variable for this module. Are you following a tutorial?
I am not using any tutorial. In order to make the cluster publicly accessible, the following setting is required:
allow.everyone.if.no.acl.found = false
How do I achieve this?
this variable may be what you’re looking for, https://github.com/cloudposse/terraform-aws-msk-apache-kafka-cluster/blob/2.4.0/variables.tf#L204
variable "properties" {
yeah, confirmed it is
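In other words, something along these lines (a minimal sketch showing only the arguments relevant to the thread; the remaining required cluster arguments are omitted):

    module "kafka" {
      source  = "cloudposse/msk-apache-kafka-cluster/aws"
      version = "2.4.0"

      # Kafka server settings go into the module's "properties" variable,
      # not "kafka_configuration_properties"
      properties = {
        "auto.create.topics.enable"      = "true"
        "allow.everyone.if.no.acl.found" = "false"
      }

      # ... other required cluster arguments omitted
    }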
lovely, thanks a lot
np
2024-06-05
Has anyone ever been confronted with “Parameter: SpotFleetRequestConfig.IamFleetRole is invalid.” when doing a spot request? The Role, Trust Policy, and Policy all look fine to me. Some Redditor had the same unsolved question. It works in one region but not in the other, so it does not look like a policy issue to me.
I’m trying to create a spot EC2 instance with an IAM user account.
I got this error message and I can’t go further
Parameter: SpotFleetRequestConfig.IamFleetRole is invalid. It seems like administrator
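For anyone landing here later: this error usually points at the fleet role itself. A hedged sketch of what that role typically looks like in Terraform (the role name is hypothetical; AmazonEC2SpotFleetTaggingRole is the AWS-managed policy); whether this explains the one-region-only behavior above isn’t confirmed in the thread:

    resource "aws_iam_role" "spot_fleet" {
      name = "spot-fleet-role"  # hypothetical name

      # The fleet role must be assumable by the Spot Fleet service
      assume_role_policy = jsonencode({
        Version = "2012-10-17"
        Statement = [{
          Effect    = "Allow"
          Principal = { Service = "spotfleet.amazonaws.com" }
          Action    = "sts:AssumeRole"
        }]
      })
    }

    resource "aws_iam_role_policy_attachment" "spot_fleet" {
      role       = aws_iam_role.spot_fleet.name
      policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonEC2SpotFleetTaggingRole"
    }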
2024-06-06
2024-06-07
Would appreciate any help on this insane AWS Amplify Hosting issue: https://github.com/aws-amplify/amplify-hosting/issues/2563
Before opening, please confirm:
☑︎ I have checked to see if my question is addressed in the FAQ. ☑︎ I have searched for duplicate or closed issues. ☑︎ I have read the guide for submitting bug reports. ☑︎ I have done my best to include a minimal, self-contained set of instructions for consistently reproducing the issue.
App Id
d2joh8jz57nkvq
Region
ap-northeast-2
Amplify Console feature
Performance
Describe the bug
Hello.
I installed Next.js following the AWS guide and hosted it on Amplify.
However, every time I make a request to my service, the response is very slow. Looking at the response headers, the x-cache: Miss from cloudfront header is always present in the browser. So I followed the instructions and enabled performance mode on my branch in Amplify, but I’m still having the same problem.
The curious thing is that if you look at the x-cache header with the curl command, it is a hit.
curl -X HEAD -i https://v.place.hitit.xyz/store/80e0a902-490f-4d96-b18d-988c852b2975 -s | grep -Fi x-cache
x-cache: Hit from cloudfront
I suspect this is region related. Could you please check this as well?
lambdaEdge : us-east-1
lambda : us-east-1
s3 : us-east-1
amplify : ap-northeast-2
Expected behavior
I don’t have any customHttp.yml. Is that the problem?
Reproduction steps
just enter my website.
https://v.place.hitit.xyz/
https://v.place.hitit.xyz/store/80e0a902-490f-4d96-b18d-988c852b2975
Build Settings
No response
Additional information
No response
cc @mike186
Thank you!
2024-06-10
Issue: Application performance
Explanation: We have deployed all our microservices on AWS EKS. Some are backend services that communicate internally (around 50 services), and our main API service, “loco,” handles logging and other functions. The main API service is accessed through the following flow: AWS API Gateway -> Nginx Ingress Controller -> Service. In the ingress, we use path-based routing, and we have added six services to the ingress, each with a corresponding resource in a single API gateway. Our Angular static application is deployed on S3 and accessed through CloudFront. The complete flow is as follows: CloudFront -> Static S3 (frontend) -> AWS API Gateway -> VPC Link -> Ingress (Nginx Ingress Controller with path-based routing) -> Services -> Container.
Problem: Occasionally, the login process takes around 6-10 seconds, while at other times it only takes 1 second. The resource usage of my API services is within the limit. Below are the screenshots from Datadog traces of my API service:
• Screenshot of the API service when it took only 1 second
• Screenshot of the API service when it took 6-10 seconds
Request for Help: How should I troubleshoot this issue to identify where the slowness is occurring?
@Jeremy White (Cloud Posse)
I take it you have reviewed the flame graph of the login service and done a profile of it to rule out the login service itself being the bottleneck?
I only ask because on your second image there are 4 times as many spans being indexed, so it is making me wonder whether between the two screenshots something has invalidated a cache your app relies on and it is having to rebuild that? Maybe a new pod of that service has been spun up from a scaling event and your containers don’t come with the cache prewarmed?
@Jeremy White (Cloud Posse) bumping this up
Also, as per the screenshot, live-locobuzz-api-sql-server is taking almost 5x the response time. Did you have a chance to check which query is expensive in the later one?
Yeah, to sort of condense what’s mentioned above, APM allows you to instrument code, effectively setting timers at different points of execution which resolve when calls return.
You’ll want to study spans captured to see which ones have the majority of time, and then further dig into (instrument) those spans
If you cannot instrument any deeper into the call (i.e. the span leaves to another service), then you’ll need to see if you can instrument that resource/dependency
if you post more about the calling functions and their dependencies, we can advise how to proceed
Hi, I’m trying to understand why the ecs cluster module is trying to recreate the policy attachments every time I add more than one module instance via a for_each. The plan shows the ARN will change, but it’s an AWS managed policy, so it won’t change:
policy_arn: "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore" -> (known after apply) # forces replacement
the resource address is:
module.ecs_clusters["xxx"].module.ecs_cluster.aws_iam_role_policy_attachment.default["AmazonSSMManagedInstanceCore"]
@Jeremy White (Cloud Posse)
probably best to use #terraform for terraform questions
@Erik Osterman (Cloud Posse) sorry, I thought this was the terraform aws channel. Will post in terraform then
I’m getting an error when creating an ec2-instance while trying to reference a private subnet from the dynamic_subnets module I created. Any ideas how to pass the private subnet ID into the EC2 instance?
subnet = module.dynamic_subnets.private_subnet_id
That was the wrong screenshot, please see below.
well, this was one short-term fix: subnet = element(module.dynamic_subnets.private_subnet_ids, 0)
I guess it randomly chooses which AZ it goes into?
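For reference, a minimal sketch of that wiring (the instance module and its name here are hypothetical; private_subnet_ids is the list output of the dynamic-subnets module, as used in the thread). And to answer the AZ question: index 0 isn’t random, it deterministically picks the first subnet in the list, which normally corresponds to the first AZ:

    module "ec2_instance" {
      source = "cloudposse/ec2-instance/aws"  # hypothetical choice of instance module

      # private_subnet_ids is a list, so select one element;
      # [0] / element(..., 0) always picks the first subnet (first AZ), not a random one
      subnet = module.dynamic_subnets.private_subnet_ids[0]

      # ... remaining instance arguments omitted
    }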
2024-06-11
We have this policy to enforce Multi-AZ on ElastiCache clusters. Do we have a similar policy to enforce Multi-AZ in RDS Aurora and Elasticsearch?
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Deny",
"Action": [
"elasticache:CreateCacheCluster",
"elasticache:CreateReplicationGroup"
],
"Resource": [
"arn:aws:elasticache:us-east-1:4852:replicationgroup*",
"arn:aws:elasticache:us-east-1:4852:cluster*"
],
"Condition": {
"StringNotEqualsIgnoreCase": {
"elasticache:MultiAZEnabled": true
}
}
}
]
}
We are creating AWS resources using Terraform, which uses a Terraform role. We do not want to create datastores if Multi-AZ is not enabled on them. So if we had such a policy for RDS Aurora and Elasticsearch, that would be great!
RDS supports this condition in IAM policies https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonrds.html#amazonrds-rds_MultiAz
Lists all of the available service-specific resources, actions, and condition keys that can be used in IAM policies to control access to Amazon RDS.
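For plain (non-Aurora) RDS, a hedged sketch of using that rds:MultiAz condition key, mirroring the pattern of the ElastiCache policy above but expressed in Terraform (the policy name is hypothetical; as discussed below, this key does not help with Aurora):

    data "aws_iam_policy_document" "require_rds_multi_az" {
      statement {
        sid       = "DenySingleAzRdsInstances"
        effect    = "Deny"
        actions   = ["rds:CreateDBInstance"]
        resources = ["*"]

        # Same pattern as the ElastiCache policy: deny unless MultiAZ is requested
        condition {
          test     = "StringNotEqualsIgnoreCase"
          variable = "rds:MultiAz"
          values   = ["true"]
        }
      }
    }

    resource "aws_iam_policy" "require_rds_multi_az" {
      name   = "require-rds-multi-az"  # hypothetical
      policy = data.aws_iam_policy_document.require_rds_multi_az.json
    }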
@Andriy Knysh (Cloud Posse) But this doesn’t work in RDS Aurora
i just searched for that. You can search for IAM policy keys in RDS Aurora
I couldn’t find any such parameters there. Do we have some other way to enforce such rule?
looks like there is no such policy conditions for Aurora (see https://stackoverflow.com/questions/73164178/iam-policy-to-force-enable-aurora-read-replica), although the comment was from 2 years ago
I’d like to enforce IAM user when create Aurora Postgres cluster, they have to stick "Create an Aurora Replica or Reader node in a different AZ" in Multi-AZ deployment option. So I create…
When launching an Aurora instance I have the option of “Multi-AZ Deployment”, which it describes as “Specifies if the DB Instance should have a standby deployed in another Availability Zone.”
Howe…
I am thinking of going with Sentinel policies
Or, if you have an idea about Open Policy Agent, which one would be better?
where are you thinking of running the OPA agent?
in tf cloud
@Mehak do you need further assistance here?
@Gabriela Campana (Cloud Posse) yes
@Andriy Knysh (Cloud Posse)
i’m not familiar with TF cloud sentinel policies. There are many docs about that, e.g. https://developer.hashicorp.com/terraform/cloud-docs/policy-enforcement/sentinel. Maybe other people can help here
Learn how to use Sentinel policy language to create policies, including imports to define rules, useful functions, and more.
I could not find anything related to TF cloud sentinel policies in Cloud Posse projects history. @Erik Osterman (Cloud Posse) to confirm if we have any SME on TF cloud sentinel policies
2024-06-12
Can someone help me with a Sentinel policy to enforce Multi-AZ on RDS Aurora and Elasticsearch clusters? I will create the policy in TF Cloud.
2024-06-13
2024-06-17
I don’t think the updated cert chain will be added to this npm module before August 22. https://github.com/mysqljs/mysql/blob/master/lib/protocol/constants/ssl_profiles.js
It’ll be the kick that some need to get on the mysql2 module.
2024-06-19
2024-06-20
2024-06-21
Hi Folks, Have you ever wanted a generative AI assistant that could go through S3, Redis, RDS, Confluence, or an internet web crawler and answer questions about your product using generative AI? If you’re building this from scratch, think again. Check out Amazon Q, and see how First Orion optimized their workflow with Amazon Q. Check the link above for a detailed post describing the architecture and other aspects. Feel free to comment and share your views.
2024-06-30
Terrascan isn’t properly identifying any of the cloudposse modules for compliance. Is there a scanner that works with cloudposse modules?
The problem is that compliance is based on the parameters you pass, and standards can be contradictory, for example requiring different retention periods.
We ensure our modules are sufficiently parameterized, but the end user needs to pass the parameters
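To make that concrete, a hedged example: the scanner only sees the values the caller passes, so the compliance-relevant settings are module inputs the end user has to set. The module shown is Cloud Posse’s s3-bucket; the version placeholder and the choice of variable are illustrative and should be checked against the module version you actually use:

    module "audit_logs" {
      source  = "cloudposse/s3-bucket/aws"
      version = "x.y.z"  # placeholder

      # Compliance depends on what the caller passes; one standard may require
      # versioning while another dictates a specific retention period.
      versioning_enabled = true

      # ... other inputs (encryption, lifecycle/retention, access logging) per your standard
    }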