#aws (2024-07)

aws Discussion related to Amazon Web Services (AWS)

Archive: https://archive.sweetops.com/aws/

2024-07-04

Dhamodharan avatar
Dhamodharan

Hi All, seeking suggestions for an AWS POC.

Setting up a small AWS POC: planning to set up 1 UAT machine, 1 prod machine, and 1 Jenkins machine to build and deploy to both UAT and prod.

To ensure security, planning to go with AWS Organizations and keep 3 accounts, one for each of the 3 servers. Is this a good approach, or is there a better approach to set it up in terms of security and cost effectiveness?

Thanks in advance.

theherk avatar
theherk

An account per machine seems like more overhead than required for a POC, but in general this seems like a good separation. Using Jenkins seems like a bummer.

Dhamodharan avatar
Dhamodharan

We may be moving the same setup to live, so I am thinking this way..

Dhamodharan avatar
Dhamodharan

Also, not sure about the costs with AWS. Do AWS accounts cost extra?

theherk avatar
theherk

No. You pay per resource used.

theherk avatar
theherk

Even Organizations doesn’t cost extra; you only pay for the resources within the accounts attached to the organization.
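
For reference, creating the org and the member accounts is just a couple of API calls and carries no charge of its own; a minimal sketch (emails and account names are placeholders):

aws organizations create-organization --feature-set ALL
# Each member account needs its own unique email address.
aws organizations create-account --email aws+uat@example.com --account-name uat
aws organizations create-account --email aws+prod@example.com --account-name prod
aws organizations create-account --email aws+deploy@example.com --account-name deployment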

Dhamodharan avatar
Dhamodharan

thanks for the info @theherk, I will implement the same approach then…

managedkaos avatar
managedkaos

For a POC… multiple accounts would be overkill.

However if your POC is to demonstrate account separation for a larger project, then yes, go for it. The org and the accounts are free.

I would think your breakout would be:

  1. Production account for all production resources
  2. UAT account for all non-production resources
  3. Deployment account for automation. One thing that would be really great to achieve with this setup is only allowing deployments into Production or UAT via the services in the deployment account (see the sketch at the end of this message). That is, no manual changes unless absolutely necessary.

Using the UAT account resources as a deployment target, you would also work out everything you would need to do to allow access to the production account resources — VPCs, Security Groups, Systems Manager connections, etc. — from the deployment account.

However, if your POC is to only demonstrate deploying from Jenkins into two “environments” (not accounts) then the multi-account approach is overkill.
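
A rough sketch of that deployment-gating idea, assuming a deployer role in the Production (and UAT) account whose trust policy only allows the deployment account to assume it; account IDs, role names, and the credential plumbing below are placeholders, not a definitive implementation:

# Run from Jenkins in the deployment account: assume the deploy role in the target account.
CREDS=$(aws sts assume-role \
  --role-arn "arn:aws:iam::<PROD_ACCOUNT_ID>:role/deployer" \
  --role-session-name jenkins-deploy \
  --query 'Credentials.[AccessKeyId,SecretAccessKey,SessionToken]' \
  --output text)

# Export the temporary credentials so subsequent aws/deploy commands hit the target account.
export AWS_ACCESS_KEY_ID=$(echo "$CREDS" | cut -f1)
export AWS_SECRET_ACCESS_KEY=$(echo "$CREDS" | cut -f2)
export AWS_SESSION_TOKEN=$(echo "$CREDS" | cut -f3)

# Humans in the Production/UAT accounts keep read-only or break-glass access only.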

Dhamodharan avatar
Dhamodharan

Hi @managedkaos, thanks for the response. We would move the same setup to live if everything goes well, so I chose this approach with the long run in mind. Keeping that in mind, is this approach still good, or are you suggesting some other option?

managedkaos avatar
managedkaos

Your approach is good, indeed! Not suggesting another option.

2024-07-05

Sairam avatar

Hi everyone, I need help with a Python runtime upgrade in AWS Lambda. A while ago I deployed Datadog as an AWS Lambda application with the Python 3.7 runtime, and it has a lot of env vars in it. How do we upgrade the application to the Python 3.11 runtime? Thanks in advance.

I did try just manually upgrading the Lambda function runtime to python3.11, but it breaks.
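
(For reference, the runtime-only bump I tried is just a function configuration change, roughly like this; the function name is a placeholder:)

# Check the current runtime, then bump it in place.
aws lambda get-function-configuration --function-name <datadog-forwarder> --query Runtime
aws lambda update-function-configuration --function-name <datadog-forwarder> --runtime python3.11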

theherk avatar
theherk

When you say it breaks, what do you mean? It might be that your code needs some changes to work with 3.11.

Sairam avatar

Thanks for the reply. I get the below error. I used https://github.com/DataDog/datadog-serverless-functions/tree/master/aws/logs_monitoring for installation.

[ERROR] Runtime.ImportModuleError: Unable to import module 'lambda_function': cannot import name 'formatargspec' from 'inspect' (/var/lang/lib/python3.11/inspect.py)
Traceback (most recent call last):
theherk avatar
theherk

See What’s new in Python 3.11. With respect to formatargspec:

From that page: “Removed the formatargspec() function, deprecated since Python 3.5; use the inspect.signature() function or the inspect.Signature object directly.” And since that guide you shared says to use Python 3.10, perhaps you should; that version is from before the feature it is using was removed. Once that import passes, it will maybe (probably) succeed at importing lambda_function, which I presume is your entry point.

What’s New In Python 3.11

Editor: Pablo Galindo Salgado. This article explains the new features in Python 3.11, compared to 3.10. Python 3.11 was released on October 24, 2022. For full details, see the changelog. Summary –…

theherk avatar
theherk

I just stumbled across that tab again and noticed it lists varying runtime requirements based on the forwarder version you’re running. So while it says “Create a Python 3.10 Lambda function”, the required runtime actually depends on the forwarder version. So if you upgrade an older version to +3.107.0, it would support Python 3.11, meaning it probably won’t try to import formatargspec.
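
If the forwarder was installed with Datadog's CloudFormation template (as that guide describes), the upgrade would be a stack update to a newer template rather than just a runtime bump; a rough sketch with placeholder names (check the Datadog docs for the exact template URL and parameters):

aws cloudformation update-stack \
  --stack-name <datadog-forwarder-stack> \
  --template-url <url-of-the-newer-forwarder-template> \
  --capabilities CAPABILITY_IAM CAPABILITY_NAMED_IAM CAPABILITY_AUTO_EXPAND \
  --parameters ParameterKey=<ExistingParameter>,UsePreviousValue=true
  # repeat ParameterKey=<name>,UsePreviousValue=true for each existing parameter you want to keep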

Sairam avatar

Thanks. I also think upgrading the Datadog application itself will help, rather than upgrading only the Python runtime.

Sairam avatar

I will keep you posted

Sairam avatar

Hi, after upgrading the Datadog application, I get this error:

[ERROR]	2024-07-11T15:51:44.602Z	2d1a12b4-8337-4fec-af18-7aebda4d3a58	[dd.trace_id=597347809750705622 dd.span_id=5039695315866320206]	Failed to get log group tags from cache
Traceback (most recent call last):
  File "/opt/python/caching/cloudwatch_log_group_cache.py", line 125, in _get_log_group_tags_from_cache
    response = self.s3_client.get_object(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lang/lib/python3.11/site-packages/botocore/client.py", line 553, in _api_call
    return self._make_api_call(operation_name, kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lang/lib/python3.11/site-packages/botocore/client.py", line 1009, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.NoSuchKey: An error occurred (NoSuchKey) when calling the GetObject operation: The specified key does not exist.

Please suggest what this could be. Thanks in advance.

theherk avatar
theherk

Looks like you’re going to need to troubleshoot why that key isn’t there or why your lambda can’t see it.

Sairam avatar

This is part of the baseline code of the Datadog Forwarder.

According to that method, it handles that exception. I’m not sure how to add the key… before upgrading there was no issue:

            response = self.s3_client.get_object(
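
One way to narrow it down is to check whether that cache object exists at all in the forwarder’s cache bucket; the bucket and key below are placeholders (the real values should be visible in the Lambda’s environment variables and in that code path):

# Does the log-group-tags cache object exist?
aws s3api head-object --bucket <forwarder-cache-bucket> --key <log-group-tags-cache-key>

# Or list whatever is actually under the cache prefix:
aws s3 ls s3://<forwarder-cache-bucket>/ --recursive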

2024-07-15

Prasad avatar

I have an ALB in a source account routing to an NLB in a target account at the moment. We have a use case to set up PrivateLink from another source account. Can the endpoint link be set up with the same NLB in the target account by creating an endpoint service? I want both routes to work.

Gabriela Campana (Cloud Posse) avatar
Gabriela Campana (Cloud Posse)

@Jeremy White (Cloud Posse)

Jeremy White (Cloud Posse) avatar
Jeremy White (Cloud Posse)

I don’t immediately see why that wouldn’t work. I don’t recall ever having that same scenario, however.

Prasad avatar

Thanks! Yeah, it works.


2024-07-19

Sean Turner avatar
Sean Turner

Hey all, curious what you all think.

JupyterHub Notebooks on EKS have a worst-case-scenario cold start where a Data Scientist needs to wait for a Node to spin up and for the large Docker Image to pull.

The thinking is that we can largely eliminate (or at least reduce) the Docker Image pull time by creating AMIs with the Docker Image already on them (pulled during the Image Builder build as ec2-user). JupyterHub would then launch workloads (notebook servers) onto these AMIs as Nodes managed by Karpenter with Taints/Tolerations and Node Affinity.

However, it seems like ec2-user and the kubelet (or containerd?) have different docker storage (there’s only one EBS volume attached). This is causing EKS to pull images that should already be available to it because the image was previously pulled by ec2-user.

Running a docker images command on the node (via SSH as ec2-user) shows a couple of images, including our latest tag which was pulled while building the AMI. Launching a Notebook with a specific tag “foo” caused a docker pull to occur. When it was finished, running docker images via SSH again did not show foo in the output.

Conversely, pulling a different tag bar as ec2-user and then launching a Notebook Server with bar caused EKS to pull the Image again.

Any ideas?

Sean Turner avatar
Sean Turner

Interesting, looks like the images are in the output of ctr -n k8s.io images list. Seems like I’ll need to get Image Builder to pull my image into that namespace with ctr.

Sean Turner avatar
Sean Turner

This is the solution I came up with. Haven’t tested it yet (as in launched a notebook) but I think it works (it’s pulling my image successfully to the same namespace that EKS uses)

phases:
  - name: build
    steps:
      - name: pull-machine-prospector
        action: ExecuteBash
        inputs:
          commands:
            - password=$(aws ecr get-login-password --region us-west-2)
            # Redirecting stdout because the pull creates thousands of log lines.
            - sudo ctr --namespace k8s.io images pull --user AWS:$password acct.dkr.ecr.us-west-2.amazonaws.com/app:latest > /dev/null
            - sudo ctr --namespace k8s.io images list
  - name: test
    steps:
      - name: confirm-image-pulled
        action: ExecuteBash
        inputs:
          commands:
            - set -e
            - sudo ctr --namespace k8s.io images list | grep app
Sean Turner avatar
Sean Turner

Didn’t seem to work, image still needed to pull.

2024-07-23

Yangci Ou avatar
Yangci Ou

Hey guys, I see that Cloud Posse prefers ecspresso as its ECS CLI tooling. I’m curious to hear why that’s the case and what you look for, and whether there are benefits y’all see in using it versus other tools like ecs-deploy, or even plain AWS commands in a script such as this: https://github.com/silinternational/ecs-deploy/tree/develop? From what I’m seeing, ecspresso definitely has better task definition control.

Gabriela Campana (Cloud Posse) avatar
Gabriela Campana (Cloud Posse)

@Ben Smith (Cloud Posse) @Igor Rodionov

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)
SweetOps #aws for April, 2024

SweetOps Slack archive of #aws for April, 2024. Discussion related to Amazon Web Services (AWS)

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

TL;DR: we’re still looking for the silver bullet

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

:crossed_fingers: for AWS copilot CLI for ECS

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

We’ve had multiple iterations of our solution for ECS, and ecspresso is pretty nice. We chose it over other tools because it’s compiled as a single binary, easy to install, supports task definition templates out of the box, and works with data sources to fetch data used in templates.
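
The day-to-day usage is a handful of subcommands driven by that config; a rough sketch (cluster/service names are placeholders, and flags can vary a bit between ecspresso versions):

ecspresso init --config ecspresso.yml --region us-west-2 --cluster my-cluster --service my-service
ecspresso diff --config ecspresso.yml      # compare the local task definition/service with what is running
ecspresso deploy --config ecspresso.yml    # register a new task definition revision and update the service
ecspresso rollback --config ecspresso.yml  # roll back to the previous task definition revision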

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Most of the family of ecs-deploy commands (there are probably a half dozen or more) are scripts (shell, python, etc.).

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Ecspresso also has a nice YAML-based configuration.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

In the long run, we want to see this win: https://aws.github.io/copilot-cli/

AWS Copilot CLI

Develop, Release and Operate Container Apps on AWS.

Yangci Ou avatar
Yangci Ou

Oooh yeah, the Copilot CLI is interesting; I was looking at it earlier, and it manages everything, so having custom stuff through Terraform might be hard.

Yangci Ou avatar
Yangci Ou


because it’s compiled as a single binary
This is a nice one

2024-07-31

Dexter Cariño avatar
Dexter Cariño

Anybody here have an idea on how to get live data out of DynamoDB? Planning to stream the data from DynamoDB to BigQuery, or from DynamoDB to S3 to BigQuery.

Any insights or ideas without using a third-party tool? Thank you so much.

Darren Cunningham avatar
Darren Cunningham

I know you said without using a third party tool, but I’d suggest considering Airbyte for the job rather than rolling your own. Otherwise the “simplest” solution is probably DynamoDB Streams -> Lambda. CDC/ETL jobs have a ton of different factors though (why Snowflake is named what it is), so it’s going to be hard for those without intimate knowledge thereof to be accurate. Aka, I could be wrong.
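
A minimal sketch of the Streams -> Lambda wiring (table and function names are placeholders; the Lambda itself would push to BigQuery or land batches in S3):

# Turn on a stream that carries both old and new images of changed items.
aws dynamodb update-table \
  --table-name my-table \
  --stream-specification StreamEnabled=true,StreamViewType=NEW_AND_OLD_IMAGES

# Point the consumer Lambda at the stream.
STREAM_ARN=$(aws dynamodb describe-table --table-name my-table \
  --query 'Table.LatestStreamArn' --output text)

aws lambda create-event-source-mapping \
  --function-name ddb-to-bigquery \
  --event-source-arn "$STREAM_ARN" \
  --starting-position LATEST \
  --batch-size 100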

Gabriela Campana (Cloud Posse) avatar
Gabriela Campana (Cloud Posse)

@Jeremy White (Cloud Posse)

Jeremy White (Cloud Posse) avatar
Jeremy White (Cloud Posse)

I’d second that. Using a Lambda is the easiest way. You could also try using PITR with an S3 export, but there will be a lag time during which any deletes/updates to the data being exported could potentially get dropped. Being careful of that might work best, but again it carries the risk of data integrity loss.

DynamoDB data export to Amazon S3: how it works - Amazon DynamoDB

DynamoDB offers a fully managed solution to export your data to Amazon S3 at scale. This allows you to perform analytics and complex queries using other AWS services like Amazon Athena, AWS Glue, and Amazon EMR. Exports can be full or incremental, and are charged based on the size of the data. Your data is encrypted end-to-end, and you can export to an S3 bucket owned by another AWS account or Region.
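
For the export route, it is a single API call once PITR is enabled on the table; a sketch with placeholder ARNs and bucket name:

# Requires point-in-time recovery to be enabled on the table.
aws dynamodb export-table-to-point-in-time \
  --table-arn arn:aws:dynamodb:us-east-1:<ACCOUNT_ID>:table/my-table \
  --s3-bucket my-export-bucket \
  --export-format DYNAMODB_JSON
# Incremental exports (--export-type INCREMENTAL_EXPORT) are also available for follow-up loads.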

Dexter Cariño avatar
Dexter Cariño

thank you so much for your insights. cheers!

shayyanmalik13 avatar
shayyanmalik13

Hi hi, does anyone here use the Prometheus/Thanos/Grafana stack? I have 4 AWS envs (different accounts) and want to set up Prometheus in all these envs but only one Thanos and Grafana. Trying to see what the industry standard is for connecting it all together. VPC peering or transit gateways seem insecure.

Joe Perez avatar
Joe Perez

I’m not familiar with this setup, but PrivateLink might be what you’re looking for: the service provider being the Thanos installation inside that separate VPC/AWS account, and each of the Prometheus VPCs/accounts being service consumers.
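
A rough sketch of that shape, with Thanos as the provider behind an NLB and each Prometheus account as a consumer (all ARNs/IDs are placeholders):

# Provider side (Thanos account): publish the NLB as an endpoint service.
aws ec2 create-vpc-endpoint-service-configuration \
  --network-load-balancer-arns <thanos-nlb-arn> \
  --acceptance-required

# Still on the provider side: allow each Prometheus account to connect.
aws ec2 modify-vpc-endpoint-service-permissions \
  --service-id <vpce-svc-id> \
  --add-allowed-principals arn:aws:iam::<PROMETHEUS_ACCOUNT_ID>:root

# Consumer side (each Prometheus account): create an interface endpoint to that service.
aws ec2 create-vpc-endpoint \
  --vpc-endpoint-type Interface \
  --vpc-id <prometheus-vpc-id> \
  --service-name <service-name-returned-by-the-provider> \
  --subnet-ids <subnet-a> <subnet-b> \
  --security-group-ids <sg-limiting-who-can-reach-the-endpoint>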

shayyanmalik13 avatar
shayyanmalik13

Got it. I just looked at PrivateLink and you’re right, it does allow exposing specific resources. But wouldn’t it also share the same security concerns as VPC peering for this specific use case, because Prometheus will be running on the same EKS cluster as my app? So if I share this cluster through PrivateLink, the app still gets exposed to other environments?

Joe Perez avatar
Joe Perez

I’m not as close to how security works within EKS, but if you can associate a specific security group with individual workloads, then you can limit which traffic can reach the VPC endpoint ENIs on the Prometheus side.

shayyanmalik13 avatar
shayyanmalik13

^^ Disregard - I believe I can expose prometheus pod as a service through NLB and use endpoint here.

shayyanmalik13 avatar
shayyanmalik13

Thank you so much @Joe Perez - privatelink’s the way to go.

Joe Perez avatar
Joe Perez

No problem. I also wrote a couple of articles on what I’ve learned about PrivateLink

Joe Perez avatar
Joe Perez
AWS PrivateLink Part 1

Overview Your company is growing and now you have to find out how to allow communication between services across VPCs and AWS accounts. You don’t want to send traffic over the public Internet and maintaining VPC Peering isn’t a fun prospect. Implementing an AWS supported solution is the top priority and AWS PrivateLink can be a front-runner for enabling your infrastructure to scale. Lesson What is AWS PrivateLink? PrivateLink Components Gotchas Next Steps What is AWS PrivateLink?

Joe Perez avatar
Joe Perez
AWS PrivateLink Part 2

Overview In the previous PrivateLink post, we went through the separate resources that make up AWS PrivateLink. In this post, we will be provisioning a PrivateLink configuration which will allow resources in one VPC to connect to a web service in another VPC. You can use several different AWS services to accomplish the same goal, but PrivateLink can simplify some setups and meet security expectations with its standard one-way communication.

shayyanmalik13 avatar
shayyanmalik13

I’ll take a look! Appreciate it, Joe.
