#aws (2023-06)

aws Discussion related to Amazon Web Services (AWS)

Archive: https://archive.sweetops.com/aws/

2023-06-01

Adnan avatar

“A pull through cache is a way to cache images you use from an upstream repository. Container images are copied and kept up-to-date without giving you a direct dependency on the external registry. If the upstream registry or container image becomes unavailable, then your cached copy can still be used.” https://aws.amazon.com/blogs/containers/announcing-pull-through-cache-for-registry-k8s-io-in-amazon-elastic-container-registry/

Announcing pull through cache for registry.k8s.io in Amazon Elastic Container Registry | Amazon Web Servicesattachment image

Introduction Container images are stored in registries and pulled into environments where they run. There are many different types of registries from private, self-run registries to public, unauthenticated registries. The registry you use is a direct dependency that can have an impact on how fast you can scale, the security of the software you run, […]

1

2023-06-05

vicentemanzano6 avatar
vicentemanzano6

Hello! I'm trying to create a 301 redirect with S3 and CloudFront. The main idea is to redirect traffic from support.example.com to our Atlassian service desk portal URL. CloudFront's behavior is to redirect HTTP to HTTPS, and there's a valid ACM certificate for *.example.com attached to CloudFront. However, I get this error from Chrome when testing the redirect: NET::ERR_CERT_COMMON_NAME_INVALID. Any ideas what could be wrong?

1
managedkaos avatar
managedkaos

Check the name of the bucket. IIRC, the name of the bucket has to match the domain name pretty much exactly. It's been a while and I haven't looked at the docs yet, but that's one thing to consider.

Adnan avatar

For those who are using AWS Identity Center (SSO)

• Are you using primarily permission sets for cross-account access?

• Are you using primarily self-managed roles for cross-account access? Why do you use one over the other?

Phil Hadviger avatar
Phil Hadviger

I hope I understand the question correctly, but I use permission sets for cross-account access that is focused around users and groups. When it comes to services and automation, I use self-managed roles.

1
Phil Hadviger avatar
Phil Hadviger

With permission sets, it's primarily a mix of inline policies and AWS managed policies. So far I haven't really used the customer managed policies option of permission sets.

1
Adnan avatar

Yes, that’s what I am interested in. Do you run EKS clusters?

Phil Hadviger avatar
Phil Hadviger

I do not, the org I’m working in is primarily ECS.

Adnan avatar

With permission sets, AWS sets up those SSO roles with random suffixes in the ARNs across accounts. Do you use those anywhere to configure access or allow something else to assume the role?

Sergei avatar

We use SSO roles with random suffixes in Terraform plans. The random strings are appended to the role names, so the names can still be matched by applying some basic regex
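The pattern Sergei describes can be sketched as follows (the exact role-name format and the 16-hex-character suffix are assumptions based on how IAM Identity Center typically names provisioned roles):

```python
import re

# IAM Identity Center provisions roles named like:
#   AWSReservedSSO_<PermissionSetName>_<random-suffix>
# The random suffix differs per account, so automation matches on the
# stable prefix instead of hard-coding the full role name.
SSO_ROLE_RE = re.compile(r"^AWSReservedSSO_(.+)_[0-9a-f]{16}$")

def permission_set_name(role_name):
    """Return the permission-set name embedded in an SSO role name, or None."""
    m = SSO_ROLE_RE.match(role_name)
    return m.group(1) if m else None

print(permission_set_name("AWSReservedSSO_AdministratorAccess_9a3f1c2d4e5b6a70"))
# -> AdministratorAccess
```

The same idea works in IAM policy ARNs as a wildcard, e.g. matching `role/aws-reserved/sso.amazonaws.com/AWSReservedSSO_AdministratorAccess_*`.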

2023-06-08

SlackBot avatar
SlackBot
08:20:50 PM

This message was deleted.

1

2023-06-09

M Asif avatar

Hi. I want to import an existing ElastiCache cluster into Terraform. Not sure what the replication group is about. Should I create a replication group and import the cluster into it, or what?

jimp avatar

Your best bet is to get the specific spec using describe-replication-groups and then write the Terraform resource to match. Then import it, yes. Do a terraform plan, and if you notice any deviations, fix the resource definition to match.

1
M Asif avatar

Thanks @jimp, let me do that

M Asif avatar

I described the cluster and got two clusters. Then I tried to import both individually, but they did not succeed at the terraform plan stage.

I’ll now follow what you have suggested.

Hao Wang avatar
Hao Wang

May Terraformer help?

1
M Asif avatar

Thanks let me check that.

1

2023-06-12

Oleh Kopyl avatar
Oleh Kopyl

Guys, how can I redeploy tasks after I push another Docker image to ECR?

When I press “Update service” and tick “force deploy”, it does not do anything.

So tired of manually deleting a service -> waiting until the deletion finishes (otherwise my tasks fail, probably due to the quota limit) -> creating another service from scratch…

Nat Williams avatar
Nat Williams

Are you updating the Task Definition before pressing “update service”?

Nat Williams avatar
Nat Williams

to reference the new docker image

Oleh Kopyl avatar
Oleh Kopyl

@Nat Williams no, I don't. But I have set a tag, like …us-east-1.amazonaws.com/test-classifier-fargate:arm-fp16

Oleh Kopyl avatar
Oleh Kopyl

the tag is always the same

Nat Williams avatar
Nat Williams

hmm. It sounds like the default ECS behaviour should be to check the repo for a newer version every time

Nat Williams avatar
Nat Williams

you’re on ECS, right?

Nat Williams avatar
Nat Williams

I just sort of assumed that from the “Update service” and “force deploy” verbiage

Oleh Kopyl avatar
Oleh Kopyl

@Nat Williams ECS Fargate

Hao Wang avatar
Hao Wang

normally need to use a different image tag

Nat Williams avatar
Nat Williams

Yeah, ideally I guess you’d be updating the task definition with a specific version each time

Oleh Kopyl avatar
Oleh Kopyl

@Hao Wang why? :(

Is there any way to use the same without having to set up awful CodeDeploy?

Nat Williams avatar
Nat Williams

Sure, just create a new revision of the Task Definition and update the service to use it

Oleh Kopyl avatar
Oleh Kopyl

@Nat Williams so it’s either i set up CodeDeploy or manually create another task revision? (or maybe via CLI somehow so it just duplicates it)

Hao Wang avatar
Hao Wang

Using different image tags is a best practice especially in prod env

Nat Williams avatar
Nat Williams

I mean, you’re already manually forcing the deploy, so it’s not that big a change

Oleh Kopyl avatar
Oleh Kopyl

@Hao Wang but i’d still need to update the task definition, right?

Hao Wang avatar
Hao Wang

hmm ECS should update image if forcing deploy

1
Nat Williams avatar
Nat Williams

yeah, it should. There is something weird going on

1
Hao Wang avatar
Hao Wang

yeah, if using a different tag

Hao Wang avatar
Hao Wang

AWS support would know more

Hao Wang avatar
Hao Wang

have run into some other issues before, and AWS support finally worked them out; it is a black box for us

Oleh Kopyl avatar
Oleh Kopyl

@Hao Wang thank you very much

Jut reached out to support with this issue :)

1
Michael Galey avatar
Michael Galey

EC2-hosted ECS might use the locally cached image of that tag to avoid the network cost. Fargate would pull fresh on a redeploy of the same tag. You might be able to disable image caching if using EC2, dunno how.

Oleh Kopyl avatar
Oleh Kopyl

@Michael Galey not using ECS + EC2, but ECS + Fargate

Michael Galey avatar
Michael Galey

then it'd work. Does your ECR repo have tag immutability checked? If so, your 2nd push wouldn't overwrite the first one. Or there's something else that's more user-side, e.g. your force-deployed tasks failed to pass health checks. So check the start date of the tasks, and view the logs of any recently stopped tasks

Michael Galey avatar
Michael Galey

but in general I'd highly recommend using unique IDs for deploys to solve possible confusion issues like this

Michael Galey avatar
Michael Galey

you could also pull the image locally and do docker inspect or connect to it to see if your recent change is there

Oleh Kopyl avatar
Oleh Kopyl

@Michael Galey could you please tell me where i can find that immutability tag? :)

Oleh Kopyl avatar
Oleh Kopyl

and how to use those unique ids :(

Michael Galey avatar
Michael Galey

how do you build images? codepipeline?

Oleh Kopyl avatar
Oleh Kopyl

@Michael Galey nope. I build locally, push locally and then Fargate takes my image from ECR and deploys

Michael Galey avatar
Michael Galey

add something like this to your deploy script? It assumes code is committed

    COMMIT_HASH=$(git rev-parse --short HEAD)
    IMAGE_TAG_VER=v-${COMMIT_HASH:=latest}
    docker build ... -t <repo_url>:$IMAGE_TAG_VER
    docker push <repo_url>:$IMAGE_TAG_VER
    <deploy command> <repo_url>:$IMAGE_TAG_VER
Oleh Kopyl avatar
Oleh Kopyl

@Michael Galey without <deploy command> <repo_url>:$IMAGE_TAG_VER

Oleh Kopyl avatar
Oleh Kopyl

have no idea what the deploy command should be

Michael Galey avatar
Michael Galey

whatever it is now; you'd just be looking for a parameter for the image. I haven't used this stuff, but a quick Google shows things like https://github.com/awslabs/fargatecli

awslabs/fargatecli

CLI for AWS Fargate

Oleh Kopyl avatar
Oleh Kopyl

@Michael Galey but if it doesn't work in the UI, how can it even work in the CLI, or in some third-party apps?

Oleh Kopyl avatar
Oleh Kopyl

I tried deploying with AWS cli. Does not work

Oleh Kopyl avatar
Oleh Kopyl

Something like

aws ecs update-service --cluster ${{ inputs.ecs_cluster_name }} --service ${{ inputs.ecs_service_name }} --force-new-deployment
Michael Galey avatar
Michael Galey

did you check the actual tasks are successfully starting?

Oleh Kopyl avatar
Oleh Kopyl

@Michael Galey they are up and running now

Michael Galey avatar
Michael Galey

“Fargate does not cache images, and therefore the whole image is pulled from the registry when a task runs.”

Oleh Kopyl avatar
Oleh Kopyl

everything works, LB handles requests to them just fine

Michael Galey avatar
Michael Galey

did you inspect the latest image?

Oleh Kopyl avatar
Oleh Kopyl

@Michael Galey locally?

Michael Galey avatar
Michael Galey

pull the image from ecr, and see if your intended change is in there

Michael Galey avatar
Michael Galey

and see if theres a date via docker inspect

Oleh Kopyl avatar
Oleh Kopyl

it’s there

Michael Galey avatar
Michael Galey

you’d have to clear your local cache

Michael Galey avatar
Michael Galey

and the code isn’t working in the deployed version?

Oleh Kopyl avatar
Oleh Kopyl

code is working. AWS is not redeploying on “force redeploy”

Michael Galey avatar
Michael Galey

do the tasks' start dates line up with the redeploy command?

Oleh Kopyl avatar
Oleh Kopyl

@Michael Galey Nope. Old date. They do not redeploy

Michael Galey avatar
Michael Galey

oh ok, not a cache thing then, not sure

Michael Galey avatar
Michael Galey

command looks right to me

Oleh Kopyl avatar
Oleh Kopyl

@Michael Galey thanks anyways

1
Oleh Kopyl avatar
Oleh Kopyl

@Michael Galey which command?

Michael Galey avatar
Michael Galey

aws ecs update-service --cluster <<cluster-name>> --service <<service-name>> --force-new-deployment --region <<region>>

Oleh Kopyl avatar
Oleh Kopyl

@Michael Galey i would rather make sure it works in UI, then try to get the command working

Oleh Kopyl avatar
Oleh Kopyl

AWS support is horrible

support: All restrictions on your account have been lifted. me: What were the restrictions? support: https://i.imgur.com/TuC0If7.jpg me: You said “All restrictions on your account have been lifted.”. So what were the restrictions? support: https://i.imgur.com/imFbdjL.jpg

support: “I understand”

are they on crack?

Hao Wang avatar
Hao Wang

oh, it is common to have such restrictions; AWS should have sent an email to the root account

Oleh Kopyl avatar
Oleh Kopyl

@Hao Wang but they’re refusing to tell me the restrictions…

Hao Wang avatar
Hao Wang

hard to know the details for this case which is not related to security

jonjitsu avatar
jonjitsu

Any thoughts/opinions on orgformation as an alternative to Control Tower/Landing Zones?

Hao Wang avatar
Hao Wang

great to know this project; just took a look, it is a CFN wrapper with JS/TS

Hao Wang avatar
Hao Wang

feeling much better than CT for some insights into the black box of AWS

1
Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)
account | The Cloud Posse Developer Hub

This component is responsible for provisioning the full account hierarchy along with Organizational Units (OUs). It includes the ability to associate Service Control Policies (SCPs) to the Organization, each Organizational Unit and account.

3

2023-06-13

Patrick McDonald avatar
Patrick McDonald

anyone impacted from the current aws us-east-1 outage?

1
Wayne Jessen avatar
Wayne Jessen

Yup. We are all waiting around and can’t do anything about it.

mike avatar

us-east-1 with Lambda and API Gateway both being down is gonna be a bad time for a large swath of AWS

Patrick McDonald avatar
Patrick McDonald

CloudFormation and Lambda are affected. I wonder if this only impacts provisioning managed services

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)
Status overviewattachment image

Realtime overview of issues and outages with all kinds of services. Having issues? We help you find out what is wrong.

2023-06-14

jose.amengual avatar
jose.amengual

Has anyone implemented a TCP/UDP proxy for instances in AWS? Pure forwarding of ports to different instances, like port1 -> instance1, port2 -> instance2, etc.? I wonder if there is a container with nginx or something else that has this built in, to make it easier instead of cooking my own image. I could use NLBs, but NLBs are layer 4 and do not support SGs, so once it is public I need to use NACLs to close off access, and I would like to avoid that

Michael Galey avatar
Michael Galey

I use socat to make some private things available to different clouds and it's worked well so far; it's been running for 6 months or so with no issues.

jose.amengual avatar
jose.amengual

mmm I think I have the same issue as with the nginx container: I can't pass multiple lines to the entrypoint to create multiple configurations
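For what it's worth, the stock nginx image can do this kind of pure layer-4 forwarding with its stream module, and the config can be bind-mounted over the image's default config instead of passed through the entrypoint. A minimal sketch (listener ports and instance addresses are placeholders):

```nginx
# nginx.conf - pure TCP/UDP port forwarding via the stream module
worker_processes 1;
events { worker_connections 1024; }

stream {
    server {
        listen 10001;              # port 1 -> instance 1
        proxy_pass 10.0.1.10:22;
    }
    server {
        listen 10002;              # port 2 -> instance 2
        proxy_pass 10.0.1.11:22;
    }
    server {
        listen 10053 udp;          # UDP forwarding works the same way
        proxy_pass 10.0.1.12:53;
    }
}
```

Running it as e.g. `docker run -v $PWD/nginx.conf:/etc/nginx/nginx.conf:ro -p 10001:10001 -p 10002:10002 nginx` sidesteps the multi-line-entrypoint problem entirely.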

loren avatar

would ssm port forwarding work for your use case?

1
jose.amengual avatar
jose.amengual

no, these users do not have AWS access; that was my first option

1
1
Alex Jurkiewicz avatar
Alex Jurkiewicz

use NLB. NACL management is less than management of an entire proxy

jose.amengual avatar
jose.amengual

that is true

2023-06-16

Oleh Kopyl avatar
Oleh Kopyl

Is there any easy way to launch a Docker container on AWS from ECR without a complex cluster + task + service setup on ECS?

If there is such a complex setup for just playing around with one server, there is no point in ECR – better to set up EC2 manually…

Fizz avatar
What is AWS App Runner? - AWS App Runner

Use the AWS App Runner service to go directly from an existing container image or a source code repository to a running web service in the AWS Cloud with CI/CD capability.

1
Fizz avatar

You can also deploy docker images to lambda

tommy avatar

Take a look at Copilot; it still creates an ECS cluster and other resources, but in a simpler way.

tommy avatar

You no longer need to provision them yourself.

Oleh Kopyl avatar
Oleh Kopyl

Lambda? I don't want to configure API Gateway just so I can make requests

Oleh Kopyl avatar
Oleh Kopyl

@tommy thanks

Oleh Kopyl avatar
Oleh Kopyl

@Fizz thanks

Fizz avatar

Lambda can be accessed via http now

Fizz avatar
Lambda function URLs - AWS Lambda

Add a dedicated HTTP(S) endpoint to your Lambda function using a function URL.

Oleh Kopyl avatar
Oleh Kopyl

@Fizz it's still not the best option. Lambda imposes specific requirements on the Docker image, whereas EC2 does not

Oleh Kopyl avatar
Oleh Kopyl

@Fizz App Runner costs $0.064 per hour? That's crazy. EC2 costs around $10 a month, whereas this would cost almost $50.

Oleh Kopyl avatar
Oleh Kopyl

@tommy

Copilot is such an awful tool.

It takes a lot of time to deploy just one small Docker image.

It takes less time to deploy a Fargate task manually.

Stoor avatar

You could use AWS Lightsail for this. $7 per month for the nano size (0.25 vCPU, 0.5 GB RAM)

Oleh Kopyl avatar
Oleh Kopyl

@Stoor awful tool as well. Don’t remember why tho

Oleh Kopyl avatar
Oleh Kopyl

@Stoor but thank you very much

Stoor avatar

Awful tool? What do you mean? It’s doing exactly what it needs to be doing.

Oleh Kopyl avatar
Oleh Kopyl

@Stoor the UX is awful in the first place…

2
John Bedalov avatar
John Bedalov

this elasticache module doesn’t seem to conveniently handle global clusters https://github.com/cloudposse/terraform-aws-elasticache-redis - is that correct?

cloudposse/terraform-aws-elasticache-redis

Terraform module to provision an ElastiCache Redis Cluster

John Bedalov avatar
John Bedalov

right now I’m using this module to create 1 cluster then bring in the global cluster resources with ordinary aws terraform resources

jose.amengual avatar
jose.amengual

hi John, my guess is people do not use global clusters in Redis much, like it was with the RDS module a while ago

jose.amengual avatar
jose.amengual

but that can be implemented

jose.amengual avatar
jose.amengual

PRs are welcome

John Bedalov avatar
John Bedalov

Just making sure I’m going down the right path with using the module and resources for the rest. The secondary inherits most of the primary’s settings so it basically works.

jose.amengual avatar
jose.amengual

I think taking the approach of the RDS module, doing a for_each for the replication group (or different count logic), and adding

resource "aws_elasticache_global_replication_group" "example" {
  global_replication_group_id_suffix = "example"
  primary_replication_group_id       = aws_elasticache_replication_group.primary.id
}

should be sufficient

John Bedalov avatar
John Bedalov

Thanks Pepe!

1
Oleh Kopyl avatar
Oleh Kopyl

Does anybody have experience with AWS Lightsail?

Is there any way to make the launch script work? If I execute these commands in a shell script by running ./install.sh, I can then find that my packages are installed, whereas with this – when I SSH into the instance – there are no such packages…

Oleh Kopyl avatar
Oleh Kopyl

How to select arm/x86 for a lightsail container service?

Oleh Kopyl avatar
Oleh Kopyl

Do you know why my Lightsail deployment fails?
No launch command, environment variables, or open ports specified.
It does have a CMD in my Dockerfile

Oleh Kopyl avatar
Oleh Kopyl

One of the worst development experiences with Lightsail too

Oleh Kopyl avatar
Oleh Kopyl

And why does a deployment on Lightsail take so much time? Crazy…

Oleh Kopyl avatar
Oleh Kopyl

Deployment on Lightsail takes way more time than to Fargate…

Oleh Kopyl avatar
Oleh Kopyl

Seeing this in the Lightsail service deployment logs. Why? So this Docker image is good for Fargate but not for Lightsail, correct?

Oleh Kopyl avatar
Oleh Kopyl

It’s maybe because the image is built for ARM but it deploys on x86.

Amazing, what an awful UX, Amazon.

Alex Jurkiewicz avatar
Alex Jurkiewicz

Lightsail tho

Alex Jurkiewicz avatar
Alex Jurkiewicz

Blaming incorrect architecture on the platform is something though. Maybe try running a windows executable

2023-06-17

Oleh Kopyl avatar
Oleh Kopyl

Yo. Please help me. I created a new task definition with .5 vCPU instead of .25 vCPU.

How can I update all the current tasks in a service to .5 vCPU?

I pressed the “Update service” button, checked “Force new deployment”, then pressed “Update”, got a “Service updated” message, and my tasks are still .25 vCPU. Why?

Oleh Kopyl avatar
Oleh Kopyl

I checked the “Deployments” tab and the last deployment has been “In progress” for an eternity now…

It's faster to remove the service and create a new one to update tasks… What the hell…

Oleh Kopyl avatar
Oleh Kopyl

Guys, could you please recommend any Fargate autoscaling tutorials?

Don’t recommend AWS docs please.

I set it up and it does not work…

Hao Wang avatar
Hao Wang

I've got many clients in the same situation; it is frustrating. Let us calm down first; I think the cause is some basic thing being overlooked, like the image architecture, as Alex mentioned

Oleh Kopyl avatar
Oleh Kopyl

Cause of what exactly? :)

Hao Wang avatar
Hao Wang

cannot know the cause for now with the shared information; do you have source code or a doc?

Oleh Kopyl avatar
Oleh Kopyl

@Hao Wang source code of what?)

Hao Wang avatar
Hao Wang

source codes or instructions you followed

Oleh Kopyl avatar
Oleh Kopyl

I did not have any instructions :(

cool-doge1
1
Hao Wang avatar
Hao Wang

I was frustrated when I started using Docker in 2014, version 0.9

Hao Wang avatar
Hao Wang

You are at the psychological turning point of learning tech. Don't give up, but try different platforms; AWS may not be a good fit for you

Hao Wang avatar
Hao Wang

The other platforms will have similar issues, since this does not seem to be an AWS issue

Hao Wang avatar
Hao Wang

So back to the first point: some basic stuff was overlooked… Do you have a writeup or source code?

Oleh Kopyl avatar
Oleh Kopyl

Please tell me how to test a Docker Lambda locally.

I found this https://docs.aws.amazon.com/lambda/latest/dg/images-test.html

Ran a docker image like this docker run -p 9000:8080 ...

I got: 18 Jun 2023 01:34:24,542 [INFO] (rapid) exec '/var/runtime/bootstrap' (cwd=/var/task, handler=)

Went to http://127.0.0.1:9000/2023-06-18/functions/function/invocations , got nothing

Oleh Kopyl avatar
Oleh Kopyl

figured it out. You have to use this exact URL:

http://localhost:9000/2015-03-31/functions/function/invocations

1
Oleh Kopyl avatar
Oleh Kopyl

Why the hell does the local Lambda have a different event structure from the deployed function's event?

Oleh Kopyl avatar
Oleh Kopyl

Here is my Lambda's handler:

def handler(event, context):
    image_url = event.get("image_url")
    token = event.get("headers", {}).get("Authorization")
    return {
        "headers": {"Content-Type": "application/json"},
        "statusCode": "status_code",
        "body": image_url,
        "token": token,
        "event": event,
    }

Here is my Python’s request:

import requests


payload = {
    "image_url": 'https://...',
    "return_version": True,
}
headers = {
    "Authorization": "tes",
}
res = requests.post("https://...", json=payload, headers=headers)
print(res.json())

Here is what I get in Python from the deployed Lambda invocation:

None

Here is what I get from the local Lambda invocation:

{'headers': {'Content-Type': 'application/json'}, 'statusCode': 'status_code', 'body': 'https://...', 'token': None, 'event': {'image_url': 'https://...', 'return_version': True}}

Here is what i get when i press “Test” in AWS console: https://i.imgur.com/Vmuy01T.png

Why the hell is it the same function but 3 different outputs? How am I supposed to test it locally if it gives a different result in the deployed state?

Oleh Kopyl avatar
Oleh Kopyl

And what is even this None???
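The likely explanation (an inference, not something confirmed in the thread): a Function URL delivers the HTTP request as a proxy-style event, where the POSTed JSON arrives as a string under event["body"] and header names are lowercased, while the console “Test” button and a local RIE invoke pass the test payload directly as the event. event.get("image_url") is then None for the Function URL case. A sketch of a handler that accepts both shapes:

```python
import json

def handler(event, context=None):
    # Function URL / API Gateway proxy events carry the HTTP body as a
    # JSON *string* under "body" and lowercase the header names.
    # Direct invocations (console "Test", local RIE) pass the payload as-is.
    if "body" in event:
        payload = json.loads(event["body"] or "{}")
    else:
        payload = event
    headers = event.get("headers") or {}
    token = headers.get("authorization") or headers.get("Authorization")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"image_url": payload.get("image_url"), "token": token}),
    }
```

Note also that a Function URL only returns the "body" field to the HTTP caller, so returning raw dict keys like "token" at the top level never reaches the client.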

Darren Cunningham avatar
Darren Cunningham
02:40:33 AM

woosha

Darren Cunningham avatar
Darren Cunningham

I think you're suffering from the same thing that I suffered from as I was getting started: slow down and approach the problem a little more pragmatically. If you're attempting to implement something that has been done before and it's not working, the problem is not the language/framework/platform; it's your understanding thereof. That's not a judgement, but an encouragement to slow down and break the problem down into digestible bits.

Darren Cunningham avatar
Darren Cunningham

I’m not trying to be crude/discouraging but just trying to help break things down in a meaningful way. If the technology was truly as lacking or ass backwards as you’re making it out to be, then AWS wouldn’t be nearly as successful as they are.

Oleh Kopyl avatar
Oleh Kopyl

@Darren Cunningham don't you agree that having different dev and prod environments is such a stupid thing about Lambda?

1
Oleh Kopyl avatar
Oleh Kopyl

Is there any way to get faster Lambda responses? My Lambda processes requests for too long compared to a regular EC2…

Darren Cunningham avatar
Darren Cunningham

are you able to provide some more detail?

• “too long” — what’s the limit you want to stay under?

• what is the response time Lambda vs EC2?

• what runtime? ◦ what memory allocation are you giving it?

• what is the access pattern? meaning, how are you invoking the Lambda vs EC2 ◦ is this a local invocation or remote?

• what does the lambda do? aka, does it access resources within the VPC? if so, are you deploying it into the VPC?

Darren Cunningham avatar
Darren Cunningham

Generally it's not surprising that an EC2 instance will process a single request, or a few hundred, faster than a Lambda. Lambda shines in its ability to “scale infinitely” (rate limits aside).

EC2 is like owning a car: you can get in and go wherever you want, whenever you want, without waiting, but you own the maintenance. Lambda is a taxi: not always faster (sometimes it is), but worry-free, outside of the cost; and the more you know how to work the system, the more value you get out of it.

Darren Cunningham avatar
Darren Cunningham

sticking with my car analogy: you can get 1000 taxis to show up at your doorstep a lot more easily, faster, and cheaper than if you tried to rent 1000 cars for a night

Oleh Kopyl avatar
Oleh Kopyl

@Darren Cunningham seems like I figured it out. I had to increase the memory, which implicitly increases CPU and makes responses faster. But thank you (:

Oleh Kopyl avatar
Oleh Kopyl

Generally I want to stay under 2s while not paying so much for a 2-vCPU Lambda

Oleh Kopyl avatar
Oleh Kopyl

Lambda responds in 2s with 2048 MB of memory.

EC2 responds in under 1s on 1 vCPU, but only with pure multi-threaded Flask, without a WSGI server like gunicorn (which makes responses longer)

Oleh Kopyl avatar
Oleh Kopyl

Runtime: public.ecr.aws/lambda/python:3.10-arm64

Oleh Kopyl avatar
Oleh Kopyl

gave it 512 MB of memory. On 2048 it's faster, but I don't need that much memory; I only need more CPU

Oleh Kopyl avatar
Oleh Kopyl

access pattern: Function URL

Oleh Kopyl avatar
Oleh Kopyl

remote invocation

Oleh Kopyl avatar
Oleh Kopyl

does not access resources. It runs AI inference on CPU (YOLOv8 ONNX)

z0rc3r avatar

If your Lambda is called infrequently, the increased latency is the result of slow Lambda start-up, which is kinda expected for Lambdas in general. You might want to play with provisioned concurrency and see if it helps your case https://aws.amazon.com/blogs/compute/new-for-aws-lambda-predictable-start-up-times-with-provisioned-concurrency/

New for AWS Lambda – Predictable start-up times with Provisioned Concurrency | Amazon Web Servicesattachment image

Since the launch of AWS Lambda five years ago, thousands of customers such as iRobot, Fender, and Expedia have experienced the benefits of the serverless operational model. Being able to spend less time on managing scaling and availability, builders are increasingly using serverless for more sophisticated workloads with more exacting latency requirements. As customers have […]

Oleh Kopyl avatar
Oleh Kopyl

It's still too expensive, but thanks

2023-06-18

Oleh Kopyl avatar
Oleh Kopyl

Guys, could you please recommend any good course video or article on auto-scaling Fargate for beginners?

Everything I read and watch so far gives me more questions than answers…

james192 avatar
james192

Isn’t it a case of updating the task definition to increase or decrease counts via CloudWatch alarms?

Oleh Kopyl avatar
Oleh Kopyl

@james192 don’t know. Which counts?

Oleh Kopyl avatar
Oleh Kopyl

Could you please recommend any good tutorials/courses on deploying Docker images to EKS and setting up auto-scaling?

Again, so tired of watching videos and reading articles that give me more questions than answers…

Hao Wang avatar
Hao Wang

to be frank, stopping learning AWS for now would be a good option for you; picking it up again in a few months would be a good strategy

1
Oleh Kopyl avatar
Oleh Kopyl

@Hao Wang why? I need to deploy my app to it now, not in a few months

2023-06-19

2023-06-20

Oleh Kopyl avatar
Oleh Kopyl

Guys, do you know how to edit a cloudwatch alarm for EC2 auto-scaling?

I was unable to Google anything on it (maybe I Googled it wrong…)

I need it to scale up if CPU is over 70% for 30 seconds and scale down if CPU is under 70% for 30 seconds

kunalsingthakur avatar
kunalsingthakur

To edit a CloudWatch alarm for EC2 auto-scaling, you can follow these steps:

  1. Sign in to the AWS Management Console and open the Amazon CloudWatch service.

  2. In the navigation pane, click on “Alarms” under the “CloudWatch” section.

  3. Locate the alarm that is associated with your EC2 auto-scaling group and select it.

  4. Click on the “Actions” dropdown menu and choose “Edit.”

  5. In the “Create/Edit Alarm” wizard, you can modify the alarm configuration to match your requirements.

    • Under the “Conditions” section, select the “Static” option for the “Threshold type.”
    • For the “Whenever” condition, choose “Greater” and enter “70” in the text box.
    • Set the “Period” to 30 seconds.
    • Enable the “Consecutive periods” option and set it to “1.”
    • Choose the appropriate “Statistic” (e.g., “Average” CPU utilization) and adjust the “Datapoints to alarm” if needed.
  6. Under the “Actions” section, click on the “Add notification action” button if you want to receive notifications when the alarm state changes.

  7. Optionally, you can configure auto-scaling actions when the alarm state is triggered.

    • Click on the “Add Scaling Action” button.
    • Choose the appropriate scaling policy for scaling up and scaling down.
    • Configure the desired scaling adjustments, such as the number of instances to add or remove.
    • Save the scaling actions.
  8. Review your changes and click on the “Save” button to update the alarm.

The edited CloudWatch alarm will now trigger scaling actions for your EC2 auto-scaling group based on the specified CPU utilization thresholds and duration.
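One caveat to the steps above: standard-resolution CloudWatch alarms have a minimum period of 60 seconds, and the built-in EC2 CPUUtilization metric is published at 5-minute granularity (1-minute with detailed monitoring), so a true 30-second trigger would need a high-resolution custom metric. A sketch of the equivalent put_metric_alarm parameters at a 60-second period (the alarm and ASG names are placeholders; the boto3 call is left commented out since it requires credentials):

```python
# Parameters for cloudwatch.put_metric_alarm(): scale-out alarm at > 70% CPU.
# 60s is the smallest period standard-resolution metrics support.
scale_out_alarm = {
    "AlarmName": "my-asg-cpu-high",   # placeholder name
    "Namespace": "AWS/EC2",
    "MetricName": "CPUUtilization",
    "Dimensions": [{"Name": "AutoScalingGroupName", "Value": "my-asg"}],
    "Statistic": "Average",
    "Period": 60,                     # seconds; minimum for standard metrics
    "EvaluationPeriods": 1,
    "Threshold": 70.0,
    "ComparisonOperator": "GreaterThanThreshold",
    # "AlarmActions": ["<scaling-policy-arn>"],  # attach the scale-out policy
}

# import boto3
# boto3.client("cloudwatch").put_metric_alarm(**scale_out_alarm)
```

A mirror-image alarm with `LessThanThreshold` would drive the scale-down policy.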

Oleh Kopyl avatar
Oleh Kopyl

did you read my message?

kunalsingthakur avatar
kunalsingthakur

this is from chatgpt

2
Oleh Kopyl avatar
Oleh Kopyl

How do you put CloudFlare in front of an AWS Load Balancer? Spent an hour trying to figure out how to make it work – no luck.

Without NGINX forwarding of course, since that seems like a huge, redundant overhead

Oleh Kopyl avatar
Oleh Kopyl

@kunalsingthakur ChatGPT does not give a proper solution for this

rofl1
Oleh Kopyl avatar
Oleh Kopyl

Hell. I tried refreshing instances in an auto-scaling group.

I thought the logic was the following:

  1. Create a new instance
  2. Make sure its port 80 is accessible
  3. Drop the old instance, removing it from the auto-scaling group

But the logic is like this:

  1. Remove the old instance from the auto-scaling group
  2. Create a new instance
  3. Drop the old instance

How do I make it work like it should (the 1st case)?

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

the purpose of an auto-scaling group is not to manipulate instances manually

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

if you need more instances launched, increase the min size

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

if you want to refresh all running instances (for any reason), there is a menu item in the AWS console “Refresh Instances” (if you click on it, the ASG will replace all old instances with new ones making sure the min desired size is always running)

Hao Wang avatar
Hao Wang

This reminds me of the termination policy of AWS:

Hao Wang avatar
Hao Wang
If you did not assign a specific termination policy to the group, Amazon EC2 Auto Scaling uses the default termination policy. It selects the Availability Zone with two instances, and terminates the instance that was launched from the oldest launch template or launch configuration. If the instances were launched from the same launch template or launch configuration, Amazon EC2 Auto Scaling selects the instance that is closest to the next billing hour and terminates it.
Hao Wang avatar
Hao Wang

I haven’t used the policy in a while but still remember the pain lol

Hao Wang avatar
Hao Wang

If you invest too much into ASGs, you’ll miss the beauty of k8s

Hao Wang avatar
Hao Wang

Learn k8s directly

Oleh Kopyl avatar
Oleh Kopyl

@Andriy Knysh (Cloud Posse) not manipulate manually? But how do I update instances with new code? I have a launch script in the “user data” which clones a GitHub repo. How can i update all my instances automatically?

Oleh Kopyl avatar
Oleh Kopyl


if you want to refresh all running instances (for any reason), there is a menu item in the AWS console “Refresh Instances”
I did exactly this, and accessing my load balancer was giving me HTTP errors for a while, meaning it dropped an existing instance from the group while another instance was not ready. Why?

Oleh Kopyl avatar
Oleh Kopyl

@Hao Wang what’s the ASG?)

Oleh Kopyl avatar
Oleh Kopyl

@Hao Wang does k8s launch Docker inside EC2 instances? Or does it create EC2 instances already as a Docker image, without the inner Docker overhead?

Alex Jurkiewicz avatar
Alex Jurkiewicz

It sounds like you could learn a lot from some introduction tutorials to auto scaling. AWS have lots of good video introductions to resources. If you feel masochistic you can also try to read the docs

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)


But how to update instances with new code? I have a launch script in the “user data” which clones github repo. How can i update all my instances automatically?

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

you use a Launch template, update any parameter in it, and the ASG will update all instances to use the new version of the Launch template

Andriy Knysh (Cloud Posse) avatar
Andriy Knysh (Cloud Posse)

```
resource "aws_launch_template" "default" {
  count = module.this.enabled ? 1 : 0

  name_prefix = format("%s%s", module.this.id, module.this.delimiter)

  dynamic "block_device_mappings" {
    for_each = var.block_device_mappings
    content {
      device_name  = lookup(block_device_mappings.value, "device_name", null)
      no_device    = lookup(block_device_mappings.value, "no_device", null)
      virtual_name = lookup(block_device_mappings.value, "virtual_name", null)

      dynamic "ebs" {
        for_each = lookup(block_device_mappings.value, "ebs", null) == null ? [] : ["ebs"]
        content {
          delete_on_termination = lookup(block_device_mappings.value.ebs, "delete_on_termination", null)
          encrypted             = lookup(block_device_mappings.value.ebs, "encrypted", null)
          iops                  = lookup(block_device_mappings.value.ebs, "iops", null)
          kms_key_id            = lookup(block_device_mappings.value.ebs, "kms_key_id", null)
          snapshot_id           = lookup(block_device_mappings.value.ebs, "snapshot_id", null)
          volume_size           = lookup(block_device_mappings.value.ebs, "volume_size", null)
          volume_type           = lookup(block_device_mappings.value.ebs, "volume_type", null)
        }
      }
    }
  }

  dynamic "credit_specification" {
    for_each = var.credit_specification != null ? [var.credit_specification] : []
    content { cpu_credits = lookup(credit_specification.value, "cpu_credits", null) }
  }

  disable_api_termination = var.disable_api_termination
  ebs_optimized           = var.ebs_optimized
  update_default_version  = var.update_default_version

  dynamic "elastic_gpu_specifications" {
    for_each = var.elastic_gpu_specifications != null ? [var.elastic_gpu_specifications] : []
    content { type = lookup(elastic_gpu_specifications.value, "type", null) }
  }

  image_id                             = var.image_id
  instance_initiated_shutdown_behavior = var.instance_initiated_shutdown_behavior

  dynamic "instance_market_options" {
    for_each = var.instance_market_options != null ? [var.instance_market_options] : []
    content {
      market_type = lookup(instance_market_options.value, "market_type", null)

      dynamic "spot_options" {
        for_each = (instance_market_options.value.spot_options != null ?
        [instance_market_options.value.spot_options] : [])
        content {
          block_duration_minutes         = lookup(spot_options.value, "block_duration_minutes", null)
          instance_interruption_behavior = lookup(spot_options.value, "instance_interruption_behavior", null)
          max_price                      = lookup(spot_options.value, "max_price", null)
          spot_instance_type             = lookup(spot_options.value, "spot_instance_type", null)
          valid_until                    = lookup(spot_options.value, "valid_until", null)
        }
      }
    }
  }

  instance_type = var.instance_type
  key_name      = var.key_name

  dynamic "placement" {
    for_each = var.placement != null ? [var.placement] : []
    content {
      affinity          = lookup(placement.value, "affinity", null)
      availability_zone = lookup(placement.value, "availability_zone", null)
      group_name        = lookup(placement.value, "group_name", null)
      host_id           = lookup(placement.value, "host_id", null)
      tenancy           = lookup(placement.value, "tenancy", null)
    }
  }

  user_data = var.user_data_base64

  dynamic "iam_instance_profile" {
    for_each = var.iam_instance_profile_name != "" ? [var.iam_instance_profile_name] : []
    content { name = iam_instance_profile.value }
  }

  monitoring { enabled = var.enable_monitoring }

  # https://github.com/terraform-providers/terraform-provider-aws/issues/4570
  network_interfaces {
    description                 = module.this.id
    device_index                = 0
    associate_public_ip_address = var.associate_public_ip_address
    delete_on_termination       = true
    security_groups             = var.security_group_ids
  }

  metadata_options {
    http_endpoint               = var.metadata_http_endpoint_enabled ? "enabled" : "disabled"
    http_put_response_hop_limit = var.metadata_http_put_response_hop_limit
    http_tokens                 = var.metadata_http_tokens_required ? "required" : "optional"
    http_protocol_ipv6          = var.metadata_http_protocol_ipv6_enabled ? "enabled" : "disabled"
    instance_metadata_tags      = var.metadata_instance_metadata_tags_enabled ? "enabled" : "disabled"
  }

  dynamic "tag_specifications" {
    for_each = var.tag_specifications_resource_types
    content {
      resource_type = tag_specifications.value
      tags          = module.this.tags
    }
  }

  tags = module.this.tags

  lifecycle { create_before_destroy = true }
}

locals {
  launch_template_block = {
    id      = one(aws_launch_template.default[*].id)
    version = var.launch_template_version != "" ? var.launch_template_version : one(aws_launch_template.default[*].latest_version)
  }
  launch_template = (
    var.mixed_instances_policy == null ? local.launch_template_block : null
  )
  mixed_instances_policy = (
    var.mixed_instances_policy == null ? null : {
      instances_distribution = var.mixed_instances_policy.instances_distribution
      launch_template        = local.launch_template_block
      override               = var.mixed_instances_policy.override
  })
  tags = { for key, value in module.this.tags : key => value if value != "" && value != null }
}

resource "aws_autoscaling_group" "default" {
  count = module.this.enabled ? 1 : 0

  name_prefix               = format("%s%s", module.this.id, module.this.delimiter)
  vpc_zone_identifier       = var.subnet_ids
  max_size                  = var.max_size
  min_size                  = var.min_size
  load_balancers            = var.load_balancers
  health_check_grace_period = var.health_check_grace_period
  health_check_type         = var.health_check_type
  min_elb_capacity          = var.min_elb_capacity
  wait_for_elb_capacity     = var.wait_for_elb_capacity
  target_group_arns         = var.target_group_arns
  default_cooldown          = var.default_cooldown
  force_delete              = var.force_delete
  termination_policies      = var.termination_policies
  suspended_processes       = var.suspended_processes
  placement_group           = var.placement_group
  enabled_metrics           = var.enabled_metrics
  metrics_granularity       = var.metrics_granularity
  wait_for_capacity_timeout = var.wait_for_capacity_timeout
  protect_from_scale_in     = var.protect_from_scale_in
  service_linked_role_arn   = var.service_linked_role_arn
  desired_capacity          = var.desired_capacity
  max_instance_lifetime     = var.max_instance_lifetime
  capacity_rebalance        = var.capacity_rebalance

  dynamic "instance_refresh" {
    for_each = var.instance_refresh != null ? [var.instance_refresh] : []
    content {
      strategy = instance_refresh.value.strategy
      dynamic "preferences" {
        for_each = length(instance_refresh.value.preferences) > 0 ? [instance_refresh.value.preferences] : []
        content {
          instance_warmup        = lookup(preferences.value, "instance_warmup", null)
          min_healthy_percentage = lookup(preferences.value, "min_healthy_percentage", null)
        }
      }
      triggers = instance_refresh.value.triggers
    }
  }

  dynamic "launch_template" {
    for_each = local.launch_template != null ? [local.launch_template] : []
    content {
      id      = local.launch_template_block.id
      version = local.launch_template_block.version
    }
  }

  dynamic "mixed_instances_policy" {
    for_each = local.mixed_instances_policy != null ? [local.mixed_instances_policy] : []
    content {
      dynamic "instances_distribution" {
        for_each = ( mixed_in…
```

Hao Wang avatar
Hao Wang

@Oleh Kopyl ASG = Auto Scaling Group. Yeah, you can think of it as Docker in EC2 for now, but under the hood it is much more complex

Hao Wang avatar
Hao Wang

Docker is just a middleman (runtime) which can be replaced by others, like containerd

Oleh Kopyl avatar
Oleh Kopyl

@Alex Jurkiewicz docs? no, thank you very much.

If you could share a couple of good videos which actually helped you personally, i would appreciate it

Oleh Kopyl avatar
Oleh Kopyl

@Andriy Knysh (Cloud Posse) so it does the update automatically upon each template version creation?

Oleh Kopyl avatar
Oleh Kopyl

@Andriy Knysh (Cloud Posse) i don’t know Terraform. But should I?

Oleh Kopyl avatar
Oleh Kopyl

@Hao Wang i mean that i don’t want to have Docker in Docker. So if messing around with the user data of a launch template is the only option – i’d better do that than pay for 1 CPU i don’t use (which is used under the hood to have Docker inside Docker)

Hao Wang avatar
Hao Wang

It is not DinD (Docker-in-Docker)

Oleh Kopyl avatar
Oleh Kopyl

@Hao Wang k8s?

Hao Wang avatar
Hao Wang

yeah

Hao Wang avatar
Hao Wang

lol better to dig in more; when I was learning this I didn’t have ChatGPT…

Oleh Kopyl avatar
Oleh Kopyl

@Hao Wang thanks :)

Hao Wang avatar
Hao Wang

My pleasure, keep learning

Satish Tripathi avatar
Satish Tripathi

@Oleh Kopyl If you just want to refresh the instances in the ASG , this could be helpful. https://docs.aws.amazon.com/autoscaling/ec2/userguide/asg-instance-refresh.html

Satish Tripathi avatar
Satish Tripathi

if you are using terraform, you can add the setting below to your ASG configuration: https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/autoscaling_group#instance_refresh
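
The linked `instance_refresh` block might look roughly like this inside the `aws_autoscaling_group` resource (a sketch; the numbers and the `"tag"` trigger are illustrative choices, not recommendations):

```hcl
resource "aws_autoscaling_group" "example" {
  # ... existing ASG arguments ...

  instance_refresh {
    strategy = "Rolling"
    preferences {
      min_healthy_percentage = 90  # keep at least 90% of capacity in service during the refresh
      instance_warmup        = 300 # seconds before a replacement instance counts as healthy
    }
    triggers = ["tag"] # launch template changes always trigger; this adds tag changes too
  }
}
```

With `min_healthy_percentage` set high enough, the ASG replaces instances in small batches instead of dropping capacity all at once, which is what avoids the HTTP errors described earlier in the thread.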

Oleh Kopyl avatar
Oleh Kopyl

@Satish Tripathi thank you

1

2023-06-22

Oleh Kopyl avatar
Oleh Kopyl

i launch my ec2 instance with a launch script (user data). it ends with “python3 launch-app.py”. Where are its logs in Ubuntu? Meaning, where are the stderr and stdout?

Hao Wang avatar
Hao Wang

They can be found under /var/log. The app should be managed by systemd, otherwise cloud-init will never finish running

Satish Tripathi avatar
Satish Tripathi

i am not very sure, but i remember i had an ubuntu server running with a similar setup to yours and was trying to find the stderr and stdout logs, but i ended up finding nothing until you write those logs to a file. This article says the same: https://askubuntu.com/questions/1030938/where-do-i-find-stderr-logs This could be helpful so you can capture your user data logs: https://repost.aws/knowledge-center/ec2-linux-log-user-data
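
The usual trick from that AWS knowledge-center article is to redirect the script’s output yourself at the top of the user-data script; a sketch (the `LOG_FILE` default here is just for illustration, on a real instance `/var/log/user-data.log` is a typical choice):

```shell
#!/bin/bash
# Capture everything the rest of the user-data script prints (stdout and
# stderr) into one log file you can inspect later over SSH.
LOG_FILE="${LOG_FILE:-/tmp/user-data.log}"
exec >>"$LOG_FILE" 2>&1

echo "user-data bootstrap starting"
# ... rest of the bootstrap (git clone, pip install, systemd setup) goes here ...
```

Everything after the `exec` line, including output from commands the script runs, lands in the log file.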

Satish Tripathi avatar
Satish Tripathi

Another option is to look at the cloudinit logs.

Oleh Kopyl avatar
Oleh Kopyl

@Hao Wang yeah, but there are multiple files. So i was wondering which one has the logs i need…

I could as well check all the files, since there are not so many, and avoid posting here.

Thank you very much anyway

Hao Wang avatar
Hao Wang

Is checking all the files fun? Lol, start with cloud-init*

Hao Wang avatar
Hao Wang

if python3 launch-app.py runs in the foreground, cloud-init will hang forever

Oleh Kopyl avatar
Oleh Kopyl

@Hao Wang not fun, true

Oleh Kopyl avatar
Oleh Kopyl


if python3 launch-app.py run, cloud-init will hang forever
@Hao Wang what do you mean by “hang forever”?

Hao Wang avatar
Hao Wang

yeah, forever

jonjitsu avatar
jonjitsu

cloud-init is not meant to run long-running services; it’s meant for first-run EC2 bootstrapping. It won’t even run on reboot. Your cloud-init script should create a systemd service file, enable the service, and start the service.

jonjitsu avatar
jonjitsu

if you did run the app at the end of the launch script, then all the stdout/stderr will be in the /var/log/cloud* files.

Oleh Kopyl avatar
Oleh Kopyl


cloud init is not meant to run long running services
Sure, but why can’t it?

Oleh Kopyl avatar
Oleh Kopyl

@jonjitsu

jonjitsu avatar
jonjitsu

cloud-init is an agent which runs setup that eventually ends. It is not a service manager. For example, it won’t run on reboot, so if you try to run a service at the end of a cloud-init script it won’t survive. All Linux distros come with service managers. Ubuntu’s is systemd, so in your cloud-init you can create the systemd service file, then systemctl enable and systemctl start it, and your service will be properly managed, including in the case of reboots.

jonjitsu avatar
jonjitsu

I don’t know what happens when you try to run a never ending process at the end of a cloud init script or if that is even valid according to the cloud init specs.

Oleh Kopyl avatar
Oleh Kopyl

@jonjitsu just yesterday rewrote it to systemd since i needed restarts when an app fails (OOM kill)

#!/bin/bash
git clone https://[email protected]/kopyl/crossdot-yolo.git && \
mv crossdot-yolo/onnx /home/ubuntu/onnx && \
rm -r crossdot-yolo && \
mv /home/ubuntu/onnx/* /home/ubuntu/ && \
rm -r /home/ubuntu/onnx/ && \
echo "wget https://bootstrap.pypa.io/get-pip.py && \
python3 get-pip.py && \
python3 -m pip install --ignore-installed flask && \
python3 -m pip install psycopg2-binary && \
python3 -m pip install onnxruntime==1.13.1 && \
python3 -m pip install opencv-python-headless==4.7.0.72 " >> i.sh && \
chmod 777 i.sh && \
./i.sh && \
echo "[Unit]
Description=Flask app
After=multi-user.target
[Service]
Environment=AWS_POSTGRES_DB_HOST=.
Environment=AWS_POSTGRES_DB_USER=.
Environment=AWS_POSTGRES_DB_PASSWORD=.
Environment=ONNX_MODEL_PATH=/home/ubuntu/model.onnx
Environment=PYTHONUNBUFFERED=1
Type=simple
Restart=always
ExecStart=/usr/bin/python3 /home/ubuntu/flask-server-postgres.py
[Install]
WantedBy=multi-user.target" >> /etc/systemd/system/crossdot-flask-inference.service && \
systemctl daemon-reload && \
systemctl enable crossdot-flask-inference.service && \
systemctl start crossdot-flask-inference.service && \
systemctl status crossdot-flask-inference.service
Oleh Kopyl avatar
Oleh Kopyl

@jonjitsu
agent which runs setup which eventually ends
Do you have any proof? Not trying to be rude, i just really want to read it from some official source

Oleh Kopyl avatar
Oleh Kopyl

@jonjitsu by the way, is it going to start on its own on reboot or you need some additional configs?

jonjitsu avatar
jonjitsu

I don’t really have any proof. It might be out there though, you’ll have to check the docs.

jonjitsu avatar
jonjitsu

It should start on its own after reboot because you ran systemctl enable ...

Oleh Kopyl avatar
Oleh Kopyl

@jonjitsu i was not able to find it in the docs. How can you know, then, if you have no proof?

2023-06-23

2023-06-26

Nishant Thorat avatar
Nishant Thorat

AWS S3 Buckets never cease to amaze me with their peculiar nature. https://www.cloudyali.io/blogs/aws-s3-bucket-creation-date-discrepancy-in-master-and-other-regions

AWS S3 creation date may not be consistent in all regions

AWS S3 bucket creation date may be reported differently in different regions. To get the S3 bucket creation date correctly call list api in us-east-1.

3
Alex Jurkiewicz avatar
Alex Jurkiewicz

Good post. Straightforward and brings receipts


1

2023-06-27

Oleh Kopyl avatar
Oleh Kopyl

I have an issue with Fargate. It scales up fast (from 1 to 60 instances in 1 minute) but scales down too slowly (from 60 to 1 instance in 59 minutes, i.e. 1 instance per minute).

Can i have more control over it? I need it to scale up in 1 minute and down in 1 minute too (from whatever number of instances i have to whatever number is needed at the moment, be it 1 or 30 or anything else)

jonjitsu avatar
jonjitsu

if you change the service desired count to 1 you are saying it is taking 59 mins to remove all the extra containers?

Oleh Kopyl avatar
Oleh Kopyl

I’m not changing it to 1

Oleh Kopyl avatar
Oleh Kopyl

It changes to 1 on its own

Oleh Kopyl avatar
Oleh Kopyl

It’s reducing the “desired” count by 1 per minute on its own

Oleh Kopyl avatar
Oleh Kopyl

I need it to reduce to one in 1 minute, not in 59

jonjitsu avatar
jonjitsu

why does it go from 1 to 60 back down to 1 in one minute? Is this a batch job or something? what mechanism are you using to scale it from 1 to 60?

Oleh Kopyl avatar
Oleh Kopyl
  • Is it possible to have app restarts on Fargate?

As if you launch it with systemd as a service…

I was getting 5xx errors from Fargate probably due to OOM kills. App restart would fix this crap….

jonjitsu avatar
jonjitsu

when a container dies it’s not getting restarted by fargate?

Oleh Kopyl avatar
Oleh Kopyl

@jonjitsu what do you mean by “dies” exactly?

Oleh Kopyl avatar
Oleh Kopyl

@jonjitsu i was getting 5xx HTTP errors, so restarts obviously weren’t working

Ibansal avatar
Ibansal

You can try the force new deployment option in the ECS service.

Vlad Ionescu (he/him) avatar
Vlad Ionescu (he/him)

@Oleh Kopyl how are you controlling the scaling? Scaling down should not take that long

Oleh Kopyl avatar
Oleh Kopyl

With CloudWatch metrics
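
For what it’s worth, with ECS/Fargate the scale-in speed is a property of the Application Auto Scaling policy, not of Fargate itself: target tracking removes capacity conservatively, while a step-scaling policy can cut the desired count in one move. A hedged Terraform sketch (the resource ID, names, and thresholds are illustrative, not from this thread):

```hcl
resource "aws_appautoscaling_policy" "fast_scale_in" {
  name               = "fast-scale-in"                 # illustrative name
  policy_type        = "StepScaling"
  resource_id        = "service/my-cluster/my-service" # illustrative service
  scalable_dimension = "ecs:service:DesiredCount"
  service_namespace  = "ecs"

  step_scaling_policy_configuration {
    adjustment_type = "ExactCapacity" # jump straight to the target count
    cooldown        = 60
    step_adjustment {
      metric_interval_upper_bound = 0
      scaling_adjustment          = 1 # drop to 1 task when the low-load alarm fires
    }
  }
}
```

Attached to a CloudWatch alarm on a low-load metric, this drops from 60 tasks to 1 in a single scaling action instead of one task per minute.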

Oleh Kopyl avatar
Oleh Kopyl

Does SageMaker Real-Time Inference scale up and down?

So I don’t pay for instances which are not used

Oleh Kopyl avatar
Oleh Kopyl
Deploy a Machine Learning Model for Inference - Amazon Web Services

Use this step-by-step, hands-on guide to learn how to deploy a trained machine learning model to a real-time inference endpoint.

Sami avatar

Hey all.

I’m pondering over a project I’m working on at the moment and was hoping to get some advice or thoughts from other people.

I have to design the architecture for a 3-tier nodejs application which consists of a simple web front-end, an API component, and a database. My initial thoughts are to go serverless and deploy the web and API components on Lambda to keep things light and quick. I am concerned, though, about the potential lack of flexibility with the front-end. I understand that you can have a Lambda function return HTML, but I don’t know how well it would work for the application’s progression in the future.

Alternatively, I can containerise both the web and API components and move them onto ECS, which would cost more but allow for greater flexibility and, if need be, a migration to Kubernetes down the track.

Has anybody got any thoughts on this? Have you deployed front-end on Lambda and had it work well or poorly?

Alex Jurkiewicz avatar
Alex Jurkiewicz

I think “light and quick” correlates more with using the technologies you know well, rather than using specific tools that have buzz

3
Sami avatar

I think that’s some wisdom I needed to hear

Oleh Kopyl avatar
Oleh Kopyl

Some say Fargate does restarts. But does it restart the whole image or just the app from CMD?

I was getting some 5xx errors from Fargate due to OOM kills. OOM kills are okay, but 5xx errors are not. With my EC2 instance (no Docker), systemd always restarts my Python app, and since i have a load balancer i never get 5xx errors (the worst i can get is a 0.5s delay on a request).

Meaning that Fargate seemingly restarts the whole image instead of just my python app (the thing which was in CMD, like ["python", "main.py"])

If it’s really the case, is there any way to force Fargate to just restart my app (in the same way systemd does it)? I was trying to get systemd to work in Docker but was getting this error: /bin/sh: 4: systemd: not found. Even after i did apt update && apt install --reinstall systemd -y, i was still getting errors like this:

System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to connect to bus: Host is down

I tried using service instead of systemctl, but it had errors too:

# service crossdot-flask-inference.service on
crossdot-flask-inference.service: unrecognized service
Alex Jurkiewicz avatar
Alex Jurkiewicz

when a container fails healthcheck, the entire thing is replaced

Alex Jurkiewicz avatar
Alex Jurkiewicz

containers aren’t meant to be touched beyond the initial process being started. If you want the app to automatically restart, add a supervisor.

Systemd is one approach, but it’s pretty heavyweight. You could also use a shell script which contains something like

while true; do python main.py ; done
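
A slightly more defensive version of that loop, written as a function and bounded only so it can be demonstrated (the function name, the bound, and the backoff are illustrative; a real container entrypoint would usually loop forever):

```shell
#!/bin/sh
# supervise: rerun a command until it exits cleanly, giving up after
# max_restarts failed attempts. Sketch only -- in a container entrypoint
# you would typically drop the bound and restart indefinitely.
supervise() {
  cmd=$1
  max_restarts=$2
  attempt=0
  while [ "$attempt" -lt "$max_restarts" ]; do
    if sh -c "$cmd"; then
      return 0 # app exited cleanly, stop supervising
    fi
    attempt=$((attempt + 1))
    echo "app exited non-zero; restart $attempt/$max_restarts" >&2
    sleep 1 # brief backoff before restarting
  done
  return 1 # gave up
}
```

For the OOM-kill case this behaves like systemd’s `Restart=always`: the killed process exits non-zero and is immediately relaunched, e.g. `supervise "python3 main.py" 1000000`.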
Oleh Kopyl avatar
Oleh Kopyl

What is a supervisor? This thing? https://dev.to/rabeeaali/install-supervisor-on-aws-eb-with-laravel-5g8a

Is my python app going to be relaunched in the case of an OOM kill if i launch it with the shell script?

Oleh Kopyl avatar
Oleh Kopyl


when a container fails healthcheck, the entire thing is replaced
Damn.. Not something i need…

Alex Jurkiewicz avatar
Alex Jurkiewicz

no, a supervisor is an “init system”. Generally, containers run without one. init systems are designed for long-running servers, but that’s not how containers work. As you see, they follow the philosophy “if it stops working, kill it and get a new one”

1

2023-06-28

Balazs Varga avatar
Balazs Varga

aws serverless v1. how can I restore from a backup and not from a snapshot?

Alex Jurkiewicz avatar
Alex Jurkiewicz

What do you mean by backup if not a snapshot?

Balazs Varga avatar
Balazs Varga

In RDS I see snapshots and backups. I wanted to restore from a backup. Guess that is only good for point-in-time recovery

Gabriela Campana (Cloud Posse) avatar
Gabriela Campana (Cloud Posse)

@Balazs Varga do you still need to restore from backup? Maybe @Alex Jurkiewicz can help

Balazs Varga avatar
Balazs Varga

yes, to a different location. I think I can use AWS Backup to copy the backups over to a different region and then restore from that backup

1
Gabriela Campana (Cloud Posse) avatar
Gabriela Campana (Cloud Posse)

Ok, lmk if you need help

1
Balazs Varga avatar
Balazs Varga

thanks

2023-06-29

Daniel Ade avatar
Daniel Ade

Is anyone well versed in GitHub Actions and AWS? I’m having an issue deploying my container image to ECR; i keep getting Error: Not authorized to perform sts:AssumeRoleWithWebIdentity

Warren Parad avatar
Warren Parad

We have significant experience; that sounds like there is an issue with the role’s trust policy

Daniel Ade avatar
Daniel Ade

I gave the role admin access. i know it’s not good practice, but i just wanted to make sure it was working

Warren Parad avatar
Warren Parad

that’s not the trust policy

Warren Parad avatar
Warren Parad

what’s the trust policy

Daniel Ade avatar
Daniel Ade

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::<ACCOUNT_ID>:oidc-provider/token.actions.githubusercontent.com"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringLike": {
          "token.actions.githubusercontent.com:aud": "sts.amazonaws.com",
          "token.actions.githubusercontent.com:sub": "repo:dandiggas/firstwebapp"
        }
      }
    }
  ]
}

Warren Parad avatar
Warren Parad

I think you are missing a *

Warren Parad avatar
Warren Parad

repo:dandiggas/firstwebapp:*

Daniel Ade avatar
Daniel Ade

ok ill try that now

Daniel Ade avatar
Daniel Ade

Yeah still not working. keep getting this error message from Run aws-actions/configure-aws-credentials@v2 (https://github.com/Dandiggas/FirstWebApp/actions/runs/5412372323):

(node:1742) NOTE: We are formalizing our plans to enter AWS SDK for JavaScript (v2) into maintenance mode in 2023.
Please migrate your code to use AWS SDK for JavaScript (v3).
For more information, check the migration guide at https://a.co/7PzMCcy
(Use node --trace-warnings ... to show where the warning was created)
Error: Not authorized to perform sts:AssumeRoleWithWebIdentity

Warren Parad avatar
Warren Parad

What happened when you tried removing the condition altogether?

Daniel Ade avatar
Daniel Ade

It worked, that’s what happened lol

Daniel Ade avatar
Daniel Ade

Thanks a lot! I’m quite new to this so making a lot of errors

Warren Parad avatar
Warren Parad

so you need some sort of condition, but that means there is a mismatch between your condition and the reality

Daniel Ade avatar
Daniel Ade

Yeah i just fixed it with the condition

Warren Parad avatar
Warren Parad

what’s the new condition?

Daniel Ade avatar
Daniel Ade

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::<ACCOUNT_ID>:oidc-provider/token.actions.githubusercontent.com"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringLike": {
          "token.actions.githubusercontent.com:aud": "sts.amazonaws.com",
          "token.actions.githubusercontent.com:sub": "repo:Dandiggas/FirstWebApp:*"
        }
      }
    }
  ]
}

Daniel Ade avatar
Daniel Ade

I didn’t realise that it was case sensitive

1
Daniel Ade avatar
Daniel Ade

Thanks Warren

Warren Parad avatar
Warren Parad

no problem

hamiltondjh avatar
hamiltondjh

Hey. Earlier this year we began using the terraform-aws-sso module to manage our human access to our AWS accounts. It works really well and has been a lifesaver, so upfront, thank you to everyone who added to it.

However I think I am missing something, as only recently did we have a need to make a new account assignment, and because I have a depends_on for my Okta module to make sure the Okta groups are created before the account assignment is attempted, terraform is forcing a replacement of all account assignments despite the code only adding one.

Removing the depends_on fixes it in my plan, but I worry it will fail because it isn’t aware of the dependency on my okta module.

I did some searching and I think that this PR addressed this issue already by adding a variable to handle the dependency issue.

The variable identitystore_group_depends_on description states the value should be “a list of parameters to use for data resources to depend on”.

I don’t understand what parameters it’s referring to. Is it a list of all the Okta groups I create?

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

I suggest cross posting link to this message in #terraform

hamiltondjh avatar
hamiltondjh

Will do. Thanks Erik.

hamiltondjh avatar
hamiltondjh

@Simon Weil you clearly know what you’re doing since it’s your PR and the code in your example is super clean. Any ideas on how to fix this?

1

2023-06-30

Balazs Varga avatar
Balazs Varga

I have a few questions related to organizations:

• I know I need to select a management account, but with a delegated role can I have a user in a member account manage the organization?

• can I limit this delegated role to an OU?

• if I delete the management account, will it delete all the other AWS accounts in the organization?

Gabriela Campana (Cloud Posse) avatar
Gabriela Campana (Cloud Posse)

@Ben Smith (Cloud Posse)

Ben Smith (Cloud Posse) avatar
Ben Smith (Cloud Posse)
  1. To do this you’d have to use AWS Organizations delegation to delegate organization management to a delegated role. So you could have your AWS Org delegated to an aws-team. By default your management account is the root account; it contains your state bucket and is where your SuperAdmin role deploys the accounts component, which creates the member accounts.
  2. The AWS Organizations delegated role can be limited to an OU, meaning you just need to specify it in the role’s permissions
  3. I’ve never done this, as this is where we deploy our TF state and manage organization security policies. It appears from the AWS docs that it would delete the organization, resulting in your member accounts becoming standalone accounts
Deleting an organization - AWS Organizations

Learn how to delete an AWS organization that you no longer need.

1