#aws (2024-09)

aws Discussion related to Amazon Web Services (AWS)

Archive: https://archive.sweetops.com/aws/

2024-09-02

Emmanuel O avatar
Emmanuel O

Hello, I’m currently facing an issue with an AWS load balancer. I have an ECS Fargate cluster with about five tasks. However, I noticed that these instances don’t scale past 10 users during a load test. While debugging, I had to SSH into each of these instances and run htop to see the CPU and memory utilization of the five tasks. I noticed that one of the tasks had 100% CPU utilization and the rest had none. This drives the CPU utilization very high, makes the ECS tasks unhealthy, and leaves them unable to receive more traffic. This image shows the ECS CPU utilization for one instance. How can I ensure traffic is evenly distributed to all tasks in the ECS service? Upon checking my load balancer access logs, I also noticed that a lot of requests came from one IP address. I tried changing the load balancer’s traffic distribution style to round robin, but it still doesn’t distribute traffic evenly across all my tasks. What can I do to ensure scalability of my application? Has anyone faced this?

Gabriela Campana (Cloud Posse) avatar
Gabriela Campana (Cloud Posse)

@Jeremy White (Cloud Posse)

Michael Galey avatar
Michael Galey

you likely have session stickiness enabled on the load balancer, which means each unique session goes to the same instance, to allow for things like safe deploys (once you hit app version 2.0, you no longer hit 1.0). Your load tester is 1 session, so it always hits 1 server. You could temporarily turn off session stickiness during the load test, or else you might have to look up how it determines that stickiness and randomize that on the load tester side. The AWS load balancer sets a cookie for that stickiness, so you’d need to clear cookies. The cookie name is AWSALB, at least on mine.
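
To make the stickiness point concrete, here is a toy simulation in Python. It is only a sketch under stated assumptions: the `ToyLoadBalancer` class and the task names are hypothetical, and this is not how the AWS load balancer is implemented. It only shows why a single load-test client that keeps its cookie lands on one task, while a client that drops cookies gets spread by round robin.

```python
import itertools


class ToyLoadBalancer:
    """Round-robin balancer with optional cookie stickiness (illustrative only)."""

    COOKIE = "AWSALB"  # cookie name mentioned in the thread

    def __init__(self, targets, sticky):
        self.targets = list(targets)
        self.sticky = sticky
        self._round_robin = itertools.cycle(self.targets)

    def route(self, cookies):
        """Return (target, cookies_to_set) for one request."""
        pinned = cookies.get(self.COOKIE)
        if self.sticky and pinned in self.targets:
            # The session is already pinned: keep sending it to the same target.
            return pinned, {}
        target = next(self._round_robin)
        return target, ({self.COOKIE: target} if self.sticky else {})


def run_load_test(lb, requests, keep_cookies):
    """Send `requests` requests from one client and count hits per target."""
    hits = {target: 0 for target in lb.targets}
    jar = {}
    for _ in range(requests):
        target, set_cookies = lb.route(jar)
        hits[target] += 1
        if keep_cookies:
            jar.update(set_cookies)
    return hits


tasks = ["task-1", "task-2", "task-3", "task-4", "task-5"]  # hypothetical names
pinned_hits = run_load_test(ToyLoadBalancer(tasks, sticky=True), 100, keep_cookies=True)
spread_hits = run_load_test(ToyLoadBalancer(tasks, sticky=True), 100, keep_cookies=False)
```

With the cookie kept, all 100 requests hit `task-1`; with cookies cleared on every request, each of the five tasks receives 20.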

Emmanuel O avatar
Emmanuel O

Thanks @Michael Galey. Currently stickiness is turned off. The problem is that one ECS task takes 87% CPU while the remaining four tasks have 0% utilization, so this has an impact on the scalability of the system. Do you know how I can resolve this?

Jeremy White (Cloud Posse) avatar
Jeremy White (Cloud Posse)

usually I think about python applications as needing something to share requests. There are a few ways to do this, but a common couple to try first are gunicorn and uwsgi . Do you have any application that’s sharing the listener port with your server threads/PIDs?

Jeremy White (Cloud Posse) avatar
Jeremy White (Cloud Posse)

if you are using one of those tools already, could you share your config? Doesn’t have to be all the gory details, but at least some notion of how it decides to spawn your application on a request

Michael Galey avatar
Michael Galey

the above would be per server, he’s already at full utilization on one server. His load balancer is not balancing the load, at least not from a single source.

Michael Galey avatar
Michael Galey

if you do 2-3 load tests of smaller size from a few diff ips, does it go to that number of ecs tasks? I’d suggest trying least-request if that’s an option for the load balancer, otherwise maybe just start simple, follow some tutorial for a basic load balancer + hello world, and compare the load balancer config / target group config against yours. I don’t see how it’s the app’s fault here, I think it’s the load balancer config + single origin ip
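
As a rough illustration of the "least-request" idea (the ALB option is called least outstanding requests), here is a minimal Python sketch. The task names and in-flight counts are made up, and the tie-breaking rule is an assumption of this sketch rather than documented balancer behavior.

```python
def pick_least_outstanding(in_flight):
    """Return the target with the fewest in-flight requests (ties: first listed)."""
    return min(in_flight, key=in_flight.get)


# Hypothetical starting point: task-1 is already saturated.
in_flight = {"task-1": 9, "task-2": 0, "task-3": 2}

# Route 10 new requests, none of which complete during the loop.
for _ in range(10):
    chosen = pick_least_outstanding(in_flight)
    in_flight[chosen] += 1
```

In this run the busy task receives no new requests and the other two end up with 6 each.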

Jeremy White (Cloud Posse) avatar
Jeremy White (Cloud Posse)

I better understand now. I’m not sure what’s up, but are you using target groups? Do all the tasks show as healthy?

2024-09-03

2024-09-04

2024-09-05

Veerapandian M avatar
Veerapandian M

Hi, Team. I am looking for help with Azure DevOps repository + AWS Amplify deployment.

bradym avatar

You’re more likely to get help if you ask questions.

Zing avatar

https://github.com/aws/containers-roadmap/issues/474 hey there, how are you all working around aws’ silly limitation on EKS access entries not supporting wildcards? it’s a nightmare for permission set arns, since they have that random string at the end of the permission set role

#474 [EKS] [request]: EKS authentication rolearn wildcard support aka improved support for AWS Identity Center SSO

Tell us about your request
Support basic glob wildcard rolearn matching for aws-auth configmap that controls iam role eks auth.

Which service(s) is this request for?
EKS

Tell us about the problem you’re trying to solve. What are you trying to do, and why is it hard?
Trying to avoid hardcoding lots of IAM role arns into the aws-auth configmap. It would be useful if basic glob wildcard matching worked in the rolearn field of each role mapping:

apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    - groups: [AcmeCorp]
      rolearn: arn:aws:iam::111122223333:role/teams/*
      username: AcmeCorp

Are you currently working around this issue?
Individually specifying each rolearn and updating the configmap everytime these roles change:

apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    - groups: [AcmeCorp]
      rolearn: arn:aws:iam::111122223333:role/teams/SomeTeam
      username: SomeTeam
    - groups: [AcmeCorp]
      rolearn: arn:aws:iam::111122223333:role/teams/AnotherTeam
      username: AnotherTeam

Additional context
I tried using a * on a working rolearn field and the role became unable to authenticate with the api server. EKS version (Im not sure what component handles this auth delegation, so I dont know of another relevant version to check for that):

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.1", GitCommit:"4485c6f18cee9a5d3c3b4e523bd27972b1b53892", GitTreeState:"clean", BuildDate:"2019-07-18T14:25:20Z", GoVersion:"go1.12.7", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"13+", GitVersion:"v1.13.10-eks-5ac0f1", GitCommit:"5ac0f1d9ab2c254ea2b0ce3534fd72932094c6e1", GitTreeState:"clean", BuildDate:"2019-08-20T22:39:46Z", GoVersion:"go1.11.13", Compiler:"gc", Platform:"linux/amd64"}

RB avatar

Codify each arn in aws auth config

RB avatar

Nowadays the aws-auth config file is deprecated

RB avatar

You can’t use wildcards/globs in aws auth config or in eks access entries

RB avatar

One way to do it, if you’re using Terraform, whether with the aws-auth ConfigMap or access entries: use the aws_iam_roles data source, specify a wildcard to retrieve all the ARNs, and then populate the ARNs in your ConfigMap or access entries

RB avatar

https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/eks_access_entry

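data "aws_iam_roles" "default" {
  # Hypothetical filter: adjust the regex to match the roles you want to grant access to.
  name_regex = "AWSReservedSSO_.*"
}

resource "aws_eks_access_entry" "default" {
  # The aws_iam_roles data source exposes the matching role ARNs as the set `arns`.
  for_each = data.aws_iam_roles.default.arns

  cluster_name      = aws_eks_cluster.default.name
  principal_arn     = each.value
  kubernetes_groups = ["group-1", "group-2"]
  type              = "STANDARD"
}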
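
The same workaround can be sketched outside Terraform: expand the wildcard yourself against a list of known role ARNs and emit one access entry per match. The Python below is a minimal illustration, assuming the role list has already been fetched somehow. The function names, the cluster name, and the `ci/Deployer` role are hypothetical; the other ARNs are the examples from the linked issue.

```python
from fnmatch import fnmatchcase


def expand_role_pattern(pattern, role_arns):
    """Return the role ARNs that match a glob pattern, in sorted order."""
    return sorted(arn for arn in role_arns if fnmatchcase(arn, pattern))


def access_entries(cluster_name, pattern, role_arns, groups):
    """Build one access-entry description per matching role ARN."""
    return [
        {
            "cluster_name": cluster_name,
            "principal_arn": arn,
            "kubernetes_groups": list(groups),
            "type": "STANDARD",
        }
        for arn in expand_role_pattern(pattern, role_arns)
    ]


known_roles = [
    "arn:aws:iam::111122223333:role/teams/SomeTeam",
    "arn:aws:iam::111122223333:role/teams/AnotherTeam",
    "arn:aws:iam::111122223333:role/ci/Deployer",  # hypothetical non-matching role
]

entries = access_entries(
    "example-cluster",  # hypothetical cluster name
    "arn:aws:iam::111122223333:role/teams/*",
    known_roles,
    ["group-1", "group-2"],
)
```

The catch is the one raised in the thread: the expansion happens when the code runs, so a role created later is not covered until the expansion is run again.
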
Zing avatar

yeah, i saw that workaround in the thread

Zing avatar

but it’s so hacky

Zing avatar

i also can see it breaking since we use terragrunt, and it’s hard to pass in data calls into module inputs

Zing avatar

so we’d have to just rely on “in-module” access entries, and ignore the terragrunt layer, i think (maybe not - haven’t thought abt it enough)

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Yes, we’ve run into this as well. It’s one of the reasons we also implement the aws-teams and aws-team-roles architecture in our reference architecture and allow permission sets to assume them. This allows us to have consistent roles for both programmatic/machine access (e.g. GitHub OIDC) as well as for developers.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

We talk more about our approach here https://docs.cloudposse.com/layers/identity/

Identity and Authentication | The Cloud Posse Reference Architecture

Setup fine-grained access control for an entire organization

Zing avatar

thanks! i’ve been considering going with this approach, but i’m hesitant because of the extra assume role hop (for human users)

Zing avatar

especially non-technical users

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Yes, the extra assume role hop is really for technical users

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

But sounds like you’re using EKS. You have technical users.

Zing avatar

yeah, but we have a fair amount of non-technical users (clickops in the console), and i’m not sure what we’d do for them. i think the extra assume role in the console would make them go nuts, but i guess a hybrid approach could work too…

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

@Jeremy G (Cloud Posse) might have some updated ideas on this we haven’t yet tried. He’s OoO though.

2024-09-06

2024-09-07

Mark avatar

Hey everyone, we recently made a change to our ECS infrastructure. We’ve transitioned to using AWS service discovery and have configured our containers to use HTTPS on their hostnames. After resolving various issues with appsettings and Dockerfiles, the HTTPS port is now open. Previously, we used an ALB for each service in the ECS cluster. With the move to HTTPS and service discovery, we need to set HTTPS as the port for service health checks. The challenge we’re facing is that target groups don’t allow us to define a hostname for service discovery. You might wonder why we switched to HTTPS. The decision was driven by difficulties we encountered with service discovery, which we found were best addressed by using HTTPS. I’ve attached the task definition file for one of the services and the appsettings file. These should help illustrate the issue with the target group’s inability to accept a hostname. Just a note: I’m fairly new to DevOps (only been in this field for two months) and I’m really enjoying the learning process!

andrey.a.devyatkin avatar
andrey.a.devyatkin

here is the text version with links https://fivexl.io/blog/ecs-service-connect-encryption/

Keeping your data secure in transit with ECS Service Connect

Deep-dive into AWS ECS Service Connect. How startup can enable encryption in transit with ECS Service Connect and ECS Fargate deployment

Fizz avatar

Why don’t you let the ECS service manage registration with the target group for you? It’s allowed to use both service discovery AND allow the ECS service to register task IPs with the target group

Mark avatar

@andrey.a.devyatkin Thank you so much for the video! @Fizz Yup, this is what I chose to go with. During the creation process, when defining the ECS service, I chose the load balancer and was then prompted to create the target group and add it to the listener. All the ports I use are on HTTPS now and it’s working perfectly. I’m just adding a Route 53 entry and then we will go public!

Rishav avatar

This is super neat to see, and do share your experience during and after the process! I’m keen to implement ECS Service Connect within Fargate target groups using Terraform provisioning, as soon as I’m able to wrap my mind around it.

andrey.a.devyatkin avatar
andrey.a.devyatkin

@Rishav check out the blog post and video above; they go into the details of the ECS Service Connect implementation for ECS/Fargate

Veerapandian M avatar
Veerapandian M

Hi team, I am looking for help with the yml pipeline from Azure DevOps to the Azure Static Web Apps service for a Next.js application.

Hao Wang avatar
Hao Wang

I worked on Azure for a while, hope you’ve worked it out already

Veerapandian M avatar
Veerapandian M

Thank you for your response; I have resolved the pipeline issues; however, the deployment is taking time; I am working on skipping a few items.

Hao Wang avatar
Hao Wang

cool

Veerapandian M avatar
Veerapandian M

Hi

Hao Wang avatar
Hao Wang

Hi there, how is it going?

2024-09-08

2024-09-09

Dexter Cariño avatar
Dexter Cariño

Hello, how do I deploy docker compose on AWS Fargate? I searched a bit, but what I found is outdated/retired.

Darren Cunningham avatar
Darren Cunningham

I think you’re looking for https://github.com/aws/amazon-ecs-cli

Dexter Cariño avatar
Dexter Cariño

will check on this.

Dexter Cariño avatar
Dexter Cariño

thank you

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Yea, unfortunately they deprecated docker compose deployments to ECS in the docker-compose CLI

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

We were really bummed about that

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

wow, @Darren Cunningham i didn’t know about this new ECS cli

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

And this amazon-ecs-cli is different from the other ECS cli by AWS https://aws.github.io/copilot-cli/

AWS Copilot CLI

Develop, Release and Operate Container Apps on AWS.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)
Comment on #5987 Is copilot-cli still maintained?

Turns out there is a discussion about this aws/copilot-cli#5925. I have no idea when the repo doesn’t state the status in the readme

managedkaos avatar
managedkaos

Why can’t AWS keep an ECS CLI around? I’ve had better luck writing bash scripts that use the native AWS CLI

2024-09-10

2024-09-11

Zing avatar

https://github.com/aws/containers-roadmap/issues/2411

Can we get some traction on this

Support for custom eks access entry policies

2024-09-12

2024-09-16

RB avatar

I came across this OWASP project recently that implements an open source version of AWS PrivateCA without the costs of PrivateCA

https://serverlessca.com/

Terraform module for serverless CA on AWS

Serverless CA in AWS with FIPS 140-2 level 3 CA key storage and cost typically under $5 per month

jose.amengual avatar
jose.amengual

well that is a HUGE difference in price

kevcube avatar
kevcube

i think more terraform needs to move toward this. fully packaged applications. as an infra guy, sure i can set up the VPC, ASG, ECS yada yada

but I think OSS devs/communities could benefit a ton from saying “just run this one auditable, fully configurable command in your AWS account and you get the application running.”

of course someone has to write that terraform, select plenty of opinionated defaults when doing it, but it’s far more collaboratively-approachable than a cloudformation template. also lends itself to cross-cloud translation.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

(Discussed on office hours)

2024-09-17

2024-09-18

2024-09-23

2024-09-26

Sean Turner avatar
Sean Turner

Going deep on renovate lately in a move from cluster-branch ArgoCD Applications to ApplicationSets…

AWS just released m8g instances. How do you all go about upgrading your Karpenter manifests to pull in the newest instance type? Do you declaratively express family + version (e.g. m8g)? Or perhaps just family (e.g. mg)?

Gabriela Campana (Cloud Posse) avatar
Gabriela Campana (Cloud Posse)

@Yonatan Koren @Jeremy G (Cloud Posse) @Jeremy White (Cloud Posse)

Jeremy G (Cloud Posse) avatar
Jeremy G (Cloud Posse)

@Sean for Karpenter, unless we have to conform to SCP restrictions, we generally limit by instance generation (Gt 5), architectures (amd64), and vCPUs (Gt 2, Lt 32), and also require:

              - key: "karpenter.k8s.aws/instance-encryption-in-transit-supported"
                operator: "In"
                values: ["true"]
              # Requiring Nitro is redundant with Encryption in Transit, but we keep it for now.
              - key: "karpenter.k8s.aws/instance-hypervisor"
                operator: In
                values: ["nitro"]

then we get access to all the instances, and let Karpenter/AWS decide which is the best fit for our needs.
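
To illustrate how constraint-based selection picks up new instance types without naming them, here is a toy filter in Python. It is only a sketch: Karpenter evaluates these requirements itself, the `InstanceType` class and `matches` function are hypothetical, and the catalog is a tiny hand-written sample rather than real API output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceType:
    name: str
    generation: int
    arch: str
    vcpus: int


def matches(it, generation_gt=5, archs=("amd64",), vcpu_gt=2, vcpu_lt=32):
    """Apply the limits from the message above: generation Gt 5, arch In, vCPU Gt 2 and Lt 32."""
    return (
        it.generation > generation_gt
        and it.arch in archs
        and vcpu_gt < it.vcpus < vcpu_lt
    )


# Hand-written sample catalog (assumption, not fetched from AWS).
catalog = [
    InstanceType("m5.large", 5, "amd64", 2),
    InstanceType("m6i.xlarge", 6, "amd64", 4),
    InstanceType("m7i.4xlarge", 7, "amd64", 16),
    InstanceType("m7i.8xlarge", 7, "amd64", 32),
    InstanceType("m8g.xlarge", 8, "arm64", 4),  # m8g is Graviton, so arm64
]

selected = [it.name for it in catalog if matches(it)]
selected_with_arm = [
    it.name for it in catalog if matches(it, archs=("amd64", "arm64"))
]
```

Under these assumptions a newer generation is eligible as soon as it satisfies the other limits, with no manifest change. An arm64 family such as m8g would additionally need arm64 in the architecture list.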

Sean Turner avatar
Sean Turner

Great, tyvm!

2024-09-27

Adarsh avatar

Has anyone worked with the SRV record type? I have a private hosted zone with type A DNS records for my services deployed in ECS, used for inter-service communication. I had to change the record type for one of the services from A to SRV to expose one of its routes publicly via API Gateway. When I created the SRV record, it automatically created a type A record too, so I now have:

SRV record: svc1.accept.com
A record: 678521378612382091734.svc1.accept.com

Running dig on svc1.accept.com shows it pointing to 678521378612382091734.svc1.accept.com. The service is exposed through API Gateway, but the other services in the cluster now fail to connect to it. I tried replacing the URL in the other services’ env files with each of the following:

678521378612382091734.svc1.accept.com -> connection refused
svc1.accept.com -> cannot resolve
http://678521378612382091734.svc1.accept.com:8080 -> connection refused

I cannot change it back to an A record because API Gateway needs the SRV type.

Scott Kaminski avatar
Scott Kaminski

Is your VPC private DNS setting correct? Can you manually verify the endpoints with dig?

IE: dig 678521378612382091734.svc1.accept.com SRV
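
An SRV answer carries a priority, weight, port, and target host, so a client that needs a URL has to combine the target with the port from the record. Here is a minimal Python sketch of that step, assuming an answer line in dig's usual presentation format. The sample line is made up for illustration (the TTL, priority, and weight are assumptions, and the port 8080 is taken from the URL tried above).

```python
def parse_srv_answer(line):
    """Parse one SRV answer line: name, TTL, class, type, priority, weight, port, target."""
    fields = line.split()
    if len(fields) != 8 or fields[3] != "SRV":
        raise ValueError(f"not an SRV answer line: {line!r}")
    name, _ttl, _cls, _type, priority, weight, port, target = fields
    return {
        "name": name.rstrip("."),
        "priority": int(priority),
        "weight": int(weight),
        "port": int(port),
        "target": target.rstrip("."),
    }


def endpoint_url(record, scheme="http"):
    """Build a URL from the SRV target and port."""
    return f"{scheme}://{record['target']}:{record['port']}"


# Hypothetical answer line for the names used in this thread.
sample = "svc1.accept.com. 60 IN SRV 1 1 8080 678521378612382091734.svc1.accept.com."
record = parse_srv_answer(sample)
url = endpoint_url(record)
```

If the port in the real SRV answer differs from the one the other services are configured with, that alone could explain a "connection refused", though the thread does not confirm the cause.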

2024-09-30

Shirisha Sudhakar Rao avatar
Shirisha Sudhakar Rao

Is it possible to use Cloudposse’s VPC module to also create the database and intra subnets (similar to the terraform-aws-modules/vpc/aws component)?

Gabriela Campana (Cloud Posse) avatar
Gabriela Campana (Cloud Posse)

@Jeremy G (Cloud Posse)

Jeremy G (Cloud Posse) avatar
Jeremy G (Cloud Posse)

@Andriy Knysh (Cloud Posse)

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

subnets are created by https://github.com/cloudposse/terraform-aws-dynamic-subnets

the component https://github.com/cloudposse/terraform-aws-components/tree/main/modules/vpc uses both the cloudposse/vpc/aws module to create a VPC, and the cloudposse/dynamic-subnets/aws module to create the subnets

see this example on how to create multiple (named) subnets per AZ https://github.com/cloudposse/terraform-aws-dynamic-subnets/tree/main/examples/multiple-subnets-per-az

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

@Shirisha Sudhakar Rao

Jeremy G (Cloud Posse) avatar
Jeremy G (Cloud Posse)

So, to add to what Andriy explained, the VPC root module (which we call a “component”) and the dynamic-subnets module can create multiple named subnets. The current limitation on both is that there is only one flag for creating public subnets, so either all the subnets have both public and private allocations in each AZ or all the subnets are only private.

If you want to create some subnets that are both public and private, and some that are only private, you cannot easily use the VPC component because it assigns a CIDR range to the VPC and then divides it up among all the subnets it creates. You would use the component to create all the subnets that are both public and private and they would take up the entire primary CIDR block of the VPC. You would specify, to the VPC component, ipv4_additional_cidr_block_associations, and then separately use dynamic-subnets to allocate private-only subnets covering one of the additional CIDR blocks.
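
The CIDR arithmetic behind that layout can be sketched with Python's `ipaddress` module. This is a simplified illustration, not the dynamic-subnets module's actual allocation logic: the `split_cidr` helper, both CIDR ranges, and the count of three AZs are assumptions made for the example.

```python
import ipaddress


def split_cidr(cidr, count):
    """Split `cidr` into equal power-of-two subnets and return the first `count`."""
    network = ipaddress.ip_network(cidr)
    extra_bits = (count - 1).bit_length()  # bits needed to address `count` subnets
    return list(network.subnets(prefixlen_diff=extra_bits))[:count]


# Primary VPC CIDR (hypothetical): public + private subnets in each of 3 AZs.
primary = split_cidr("10.0.0.0/16", 6)

# Additional CIDR association (hypothetical): private-only subnets, one per AZ.
private_only = split_cidr("10.1.0.0/16", 3)

primary_cidrs = [str(net) for net in primary]
private_only_cidrs = [str(net) for net in private_only]
```

Here the primary block is carved into /19 ranges for the public and private subnets, while the private-only subnets come out of the separate additional block as /18 ranges, so the two sets never compete for the same address space.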

cloudposse/terraform-aws-dynamic-subnets

Terraform module for public and private subnets provisioning in existing VPC

Shirisha Sudhakar Rao avatar
Shirisha Sudhakar Rao

@Jeremy G (Cloud Posse) @Andriy Knysh (Cloud Posse) Thank you. I was able to setup the subnets.
