#aws (2021-01)
Discussion related to Amazon Web Services (AWS)
Archive: https://archive.sweetops.com/aws/
2021-01-22
2021-01-21
2021-01-20

Hello everyone, I am trying to save the output of a Terraformed EKS cluster into a JSON file

do you mean to save the Terraform outputs in JSON format from a root module that provisions an EKS cluster?
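if that's the goal, terraform output -json from that root module gets you there; a rough Python wrapper (just a sketch, not from the thread; the output file name is a placeholder):

# Sketch: capture `terraform output -json` from the root module that provisions the
# EKS cluster and write just the output values to a JSON file.
import json
import subprocess

raw = subprocess.run(
    ["terraform", "output", "-json"],
    capture_output=True, check=True, text=True,
).stdout
outputs = json.loads(raw)  # dict keyed by output name; each entry has a "value" field
with open("eks-outputs.json", "w") as f:
    json.dump({k: v["value"] for k, v in outputs.items()}, f, indent=2)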

asking this here because we didn't get to it in office hours: any tips for improving global S3 upload speed (think India, Hong Kong, etc.)? what other optimizations could I possibly make after turning on S3 transfer acceleration and using multipart uploads?

is this from an app or scripting? are you parallelizing the upload processes? is that possible in your app/script? not sure if you are talking about 1 large file or many smaller ones. multipart + multiple upload workers might help in some scenarios
https://pypi.org/project/s3-parallel-put/ or https://netdevops.me/2018/uploading-multiple-files-to-aws-s3-in-parallel/
Have you ever tried to upload thousands of small/medium files to AWS S3? If you have, you might also have noticed ridiculously slow upload speeds when the upload was triggered through the AWS Management Console. Recently I tried to upload 4k HTML files and was immediately discouraged by the progress reported by the AWS Console upload manager. It was something close to 0.5% per 10s. Clearly, the choke point was the network (as usual, brothers!). Come here, Google, we need to find a better way to handle this kind of upload.
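on the "multiple upload workers" idea, a rough sketch of uploading many smaller files in parallel with boto3 (bucket name and local directory are placeholders, not from the thread):

# Sketch: fan out uploads of many small files across a thread pool.
import concurrent.futures
import pathlib

import boto3

s3 = boto3.client("s3")  # boto3 clients are safe to share across threads

def upload(path: pathlib.Path) -> None:
    s3.upload_file(str(path), "my-bucket", f"uploads/{path.name}")

files = [p for p in pathlib.Path("./to-upload").iterdir() if p.is_file()]
with concurrent.futures.ThreadPoolExecutor(max_workers=16) as pool:
    list(pool.map(upload, files))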

from a web app

not exactly large files. a 20 MB file can take minutes to upload for users in Bangalore.

from my experience, custom Python boto-based code for downloading was never even close to the speed of the CLI's aws s3 cp ..., so maybe try uploading with the AWS CLI as a test first?

multipart uploads and transfer acceleration are the easy ones, which you already have turned on. the only other things I can think of: (1) uploading to a regional S3 bucket and replicating the files back to the central/main bucket, or (2) having another intermediary service that first consumes the upload and then puts it where you want it, e.g. write your own service that runs on some regional infra that's closer to the end user (EC2/Fargate), or something like https://www.dataexpedition.com/clouddat/aws/. in my head the upload times would appear faster to the end user, but the overall processing time of getting the file where it needs to go might be longer if there are more steps in the process to get the files to the workers that will actually do something with the file.

Also, did you play around with the chunk size? https://boto3.amazonaws.com/v1/documentation/api/latest/reference/customizations/s3.html - multipart_chunksize – The partition size of each part for a multipart transfer.
if you have a 20 MB file, you'll only get 3 chunks with the default of 8 MB. if you go down to the 5 MB minimum, maybe you'll get another chunk or two uploading in parallel.
looks like you can play around with these S3 settings in the AWS CLI config, if you want to try it out before modifying the app:
[profile development]
aws_access_key_id=foo
aws_secret_access_key=bar
s3 =
  max_concurrent_requests = 20
  max_queue_size = 10000
  multipart_threshold = 64MB
  multipart_chunksize = 16MB
  max_bandwidth = 50MB/s
  use_accelerate_endpoint = true
  addressing_style = path
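the same knobs are exposed in boto3's TransferConfig if you'd rather experiment from Python before touching the app (rough sketch; bucket and file names are placeholders):

# Sketch: transfer acceleration + smaller multipart chunks + more workers for a single upload.
import boto3
from botocore.config import Config
from boto3.s3.transfer import TransferConfig

s3 = boto3.client(
    "s3",
    config=Config(s3={"use_accelerate_endpoint": True}),  # assumes acceleration is enabled on the bucket
)
transfer_config = TransferConfig(
    multipart_threshold=8 * 1024 * 1024,   # start multipart at 8 MB
    multipart_chunksize=5 * 1024 * 1024,   # 5 MB parts (the minimum) => more parallel parts for a ~20 MB file
    max_concurrency=20,                    # number of upload threads
)
s3.upload_file("local-file.bin", "my-bucket", "uploads/local-file.bin", Config=transfer_config)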

yeah thanks @Jonathan Le, I'll play around with the chunk size default first. Are you also solving this problem? How are you actually testing that it's improving speed? To test, we RDP into an Azure VM (in Asia or Europe) and try uploading from the VM. Unfortunately even testing that route hasn't been giving us "realistic" timings (much faster) compared to our actual clients in those areas. My guess is that Azure servers might still have preferential routing to AWS servers compared to someone with their ISP at home.

Most of the time, upload speed is more to do with the client networks than yours. If you have access at that end, then tcp window scaling and QOS can help, depending on the specifics of their setups.

@btai I had to deal with an analogous issue a couple of years ago (before transfer acceleration even), though it wasn't the same issue. my problem was needing to optimize a data pipeline replicating terabytes of data across regions for processing on a routine basis.
testing and getting realistic results will be hard, esp. with what @ brought up. running an end-user test on optimized cloud networks won't be 100% realistic... my guess is that you'll need to get friendly with a couple of end users who can do a small number of localized before-and-after tests and take some measurements. probably not the best, but sometimes you work with the hand you're dealt.

Yeah, I’ve reached out to some of our global customers that we have a good relationship with and so I’ll do that and try to get numbers as a pulse check. I didn’t think so, but I was curious if anyone else had other clever ways of testing.

Oh. Hop on fiverr.com or something like that. Create some disposable 1-day test accounts. Maybe there are some affordable software testers in the regions you need, on home-based internet and devices.

@Jonathan Le that's a great idea! I'm gonna run it by our engineering leadership

You can thank a TikToker, home gym deadlifts, and TGIF for that idea!
2021-01-19

Morning everyone! Any experience migrating EBS volumes from gp2 to gp3? I have a large Kafka cluster with huge (1.5 TB) EBS volumes attached to every single broker

Be sure to read up on gp3 before you migrate.

While they offer higher potential IOPS for lower capacity volumes, at a slightly lower cost, the tradeoff is higher latency.

Yes @Ives Stoddard I’ve been reading a lot about gp3

How much is the difference in latency?


Just recently, I found out that AWS has introduced a new type of Elastic Block Storage called gp3 in addition to the popular gp2 volume…

Hmmmm… thanks for the info !!!

for most workloads, this likely isn’t an issue.

not all that glitters is gold

It’s Kafka…. I need to read

but if you’re pushing around millions / billions of files, that latency can add up.

Kafka is I/O sensitive

as with all things at this layer, it’s usually best to do some load testing with your applications and workflows.

different patterns and workloads can have very different requirements.

Absolutely right… well… what I like about AWS is that I can go back to gp2, if needed

are you using self-managed kafka, or one of the managed services?

self-managed Kafka

with the newer volume types, you may be able to switch on the fly. if for some reason that ends up being a problem, you can always swap them via LVM volume replacement.
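the switch itself is just a volume modification; rough boto3 sketch (the volume ID is a placeholder):

# Sketch: change an attached volume to gp3 in place and poll until the modification settles.
import time
import boto3

ec2 = boto3.client("ec2")
volume_id = "vol-0123456789abcdef0"
ec2.modify_volume(VolumeId=volume_id, VolumeType="gp3")

while True:
    mod = ec2.describe_volumes_modifications(VolumeIds=[volume_id])["VolumesModifications"][0]
    if mod["ModificationState"] in ("optimizing", "completed"):  # volume stays usable while optimizing
        break
    time.sleep(30)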

if you have multiple nodes, you can try running on gp3 for a while and see if there is a measurable impact to performance.

Right… we have like 50 brokers … so I will try on some of them ….

depending on your write / read throughput, and application sensitivity to latency, gp3 might work just fine.


latency of 1-2 ms on the tail end of a request is negligible, whereas stacking latency for rsync of 10 million files would be a different story.

for example, an additional 2 ms delay on a 10 million file rsync could add up to about 5.5 hours of latency in a single process (10,000,000 × 2 ms = 20,000 s).

whereas a 2 ms delay on an asynchronous event fired from a front-end user request wouldn't be perceptible to a user.

(or closer to 4ms for write + read delay, in the event of semi-synchronous events, like thumbnail generation on upload, or comment publishing, etc.)

Wow… I really appreciate all your recommendations on this matter ….

I will proceed with caution

one other area to be mindful of is local EC2 root volumes. if you leverage swap at all, that latency might slow down memory. likely not an issue for Kafka, as your JVM is likely configured with a fixed heap (Xmx).

Right…. with Kafka we avoid using SWAP at all ….

We have instances with lots of RAM so, that should not be an issue

in those cases, consider ephemeral instance storage for swap.


good luck.

Thanks @Ives Stoddard !!!! Have a great day !
2021-01-18

Can I run a command on AWS Fargate when a container goes into deactivation? https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-lifecycle.html
When a task is started, either manually or as part of a service, it can pass through several states before it finishes on its own or is stopped manually. Some tasks are meant to run as batch jobs that naturally progress through from PENDING to RUNNING

Is it possible to add this to the task definition?

Hi, I have a query about rsync. I have an EC2 server (master server) with 3 different app folders, 2 GB each. I want to clone and maintain these folders across multiple EC2 instances, let's say 5. If I set up an rsync cron job for every hour, does it bottleneck my local bandwidth? And does it make my CPU utilization high?


can't, it has Python code, so the whole process becomes slow

well, Python loads the .pyc files into memory once, so it should not be slow

the data for whatever you are running can be on local disk, but the code can be in EFS

so my framework is a bit old; when I update something it checks all the files and requirements, which makes the whole process slow

use CodeDeploy in that case if you are deploying code and such

but rsync is network/IO dependent, not CPU dependent

I am using EFS for now; my site goes down for 20 minutes when I update. I want to explore other options

CodeDeploy is an AWS product that does code deploys

you could use EFS or a ramdisk to load your code if it is that IO dependent

yes, but my framework does all the checks when we update the code

then check out CodeDeploy, it is like a fancy rsync

where EFS struggles is the latency on reading lots of small files. any forking processes which have to reload those files will suffer. anything on the front-end will likely see long request delays when loading those files for a new process (recycling a process handling requests).

as an alternative, you should also investigate the use of containers. the layers in the container are stored as larger files, so deployments from a container would be less of a concern if pulling the images locally to the ec2 instance and running them.

if you don’t need to manage that process yourself, look at ECS, EKS, or Fargate for a managed container cluster framework.

you would integrate the pushing and pulling of images with either your CI (push early) or CD pipeline (either early or on-demand).

if you don’t want to go the route of containers, you might also consider distributing files via tarball instead. in the ruby world, one might use something like capistrano to orchestrate the process.

some ideas to get you thinking about it…

https://blog.nuventure.in/2020/02/19/auto-deploying-a-django-app-using-capistrano-with-gitlab-ci-cd/

Yes, you read that right. We are using Capistrano - a very popular application deployment tool that is written in Ruby to deploy a Python Django app. In

Capistrano style deployments with fabric. Contribute to dlapiduz/fabistrano development by creating an account on GitHub.

i would seriously consider containers though. while there's a little extra overhead in getting up to speed, deployments and testing across your dev >> test >> staging >> production pipeline are a lot easier when they're all guaranteed to run the same container image / code.

yes, small files in EFS are not the best (NFSv4), but how big is your deployment? how big could the code base be? at any rate, the shared volume can be used to share the code and then copy it locally to disk or a ramdisk, etc.

I used capistrano for years, CodeDeploy is basically capistrano

@PePe: EFS also has additional write latency (multiple-AZ synchronous write), so things like rsync can take a lot longer than self-managed NFS.

I understand, but these techniques have been used for 20+ years... this is nothing new; Capistrano must be 10+ years old. do not get me wrong, containers are the way to go I think, but if you really need to share code as "here is the artifact, now deploy it", there is nothing too bad about NFS

you can download the artifact from S3 too (which will be faster)
2021-01-17
2021-01-16

hey all, are there modules (or some best-practice docs) for creating CloudWatch alarms (use case: alarm on CPU/disk space for Apache Kafka (MSK))?

Hey, you can use https://github.com/terraform-aws-modules/terraform-aws-cloudwatch/blob/v1.3.0/examples/multiple-lambda-metric-alarm/main.tf, just update the namespace and dimensions to match MSK; read more here: https://docs.aws.amazon.com/msk/latest/developerguide/monitoring.html
Terraform module which creates Cloudwatch resources on AWS - terraform-aws-modules/terraform-aws-cloudwatch
Learn how to monitor your Amazon MSK cluster.
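if you want to sanity-check the MSK namespace/dimensions outside Terraform first, a quick boto3 sketch (cluster name, broker ID, threshold, and SNS topic are placeholders):

# Sketch: per-broker CPU alarm on an MSK cluster using the AWS/Kafka namespace.
import boto3

cloudwatch = boto3.client("cloudwatch")
cloudwatch.put_metric_alarm(
    AlarmName="msk-broker-1-cpu-high",
    Namespace="AWS/Kafka",
    MetricName="CpuUser",
    Dimensions=[
        {"Name": "Cluster Name", "Value": "my-msk-cluster"},
        {"Name": "Broker ID", "Value": "1"},
    ],
    Statistic="Average",
    Period=300,
    EvaluationPeriods=3,
    Threshold=80.0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:alerts"],  # placeholder SNS topic
)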

2021-01-15

FYI for the containers people: the last run of COM-401 “Scaling containers on AWS” is in 15-ish minutes at https://virtual.awsevents.com/media/0_6ekffvm8 There are some massive Fargate improvements that got no other announcements as far as I know

Seems quite interesting - wondering if this relates to what hashicorp has done with their nomad testing on top of AWS

morning everyone! Quick question: is there any performance overhead/impact when live-migrating an EBS volume from gp2 to gp3?

Hi. I have a question about CloudFront. My application is deployed on Heroku and I am using the Heroku endpoint in CloudFront. It works fine, but when I try to open a page by specifying a path in the URL, the CloudFront URL is redirected to the Heroku endpoint. For example, http://myherokuendpoint is the application link in Heroku and d829203example.cloudfront.net is my CloudFront address to access my app. When I try to access d829203example.cloudfront.net/admin it changes the address to http://myherokuendpoint/admin. I tried adding origins but it did not work.
If I attach an ALB link to the CloudFront distribution it works fine. Is there a way I can make it work with the Heroku link?

Question about egress filtering -
Ideally you don't want processes to have the ability to reach out to the internet, with the exception of specific cases like calling other AWS services, downloading yum updates, contacting other services like Trend Micro SaaS, etc. Some AWS services support VPC endpoints, but last I checked this only worked for some services and generally only within the same region. IP filtering seems solid but it would be a huge pain to set up and maintain. DNS blocking would seem to be easier to maintain but would not prevent connections that don't require DNS.
Anyway, are there best practices/recommendations for setting up egress filtering? Are there other options?
Thanks!
Tim

What’s your budget? We see well funded enterprises use Next Generation Firewalls for this (CHKP, PANW, FTNT, etc)

Another option is the new AWS Network Firewall

Error: Error creating aggregator: OrganizationAccessDeniedException: This action can only be performed if you are a registered delegated administrator for AWS Config with permissions to call ListDelegatedAdministrators API.
Anyone had this before? The account I'm executing in is actually delegated as such and can call ListDelegatedAdministrators successfully.

Hey All! I'm working on editing a Cloud Posse module for our use, but I'm having a weird issue. I think I added everything I need and got all the variables in correctly, but now I'm getting a super generic error and I'm unclear on how to troubleshoot it. It's unfortunately not telling me at all what's wrong with the construction, and I'd love to know how to troubleshoot from here:

2021/01/15 1505 [TRACE] vertex “module.cdn.aws_cloudfront_origin_access_identity.default”: visit complete 2021/01/15 1505 [TRACE] vertex “module.cdn.aws_cloudfront_origin_access_identity.default (expand)”: dynamic subgraph completed successfully 2021/01/15 1505 [TRACE] vertex “module.cdn.aws_cloudfront_origin_access_identity.default (expand)”: visit complete 2021/01/15 1505 [TRACE] Re-validating config for “module.cdn.aws_cloudfront_distribution.default[0]” 2021/01/15 1505 [TRACE] GRPCProvider: ValidateResourceTypeConfig 2021-01-15T1505.440-0800 [INFO] plugin.terraform-provider-aws_v3.24.1_x5: 2021/01/15 1505 [WARN] Truncating attribute path of 0 diagnostics for TypeSet: timestamp=2021-01-15T1505.440-0800 2021-01-15T1505.440-0800 [INFO] plugin.terraform-provider-aws_v3.24.1_x5: 2021/01/15 1505 [WARN] Truncating attribute path of 0 diagnostics for TypeSet: timestamp=2021-01-15T1505.440-0800 2021-01-15T1505.440-0800 [INFO] plugin.terraform-provider-aws_v3.24.1_x5: 2021/01/15 1505 [WARN] Truncating attribute path of 0 diagnostics for TypeSet: timestamp=2021-01-15T1505.440-0800 2021-01-15T1505.440-0800 [INFO] plugin.terraform-provider-aws_v3.24.1_x5: 2021/01/15 1505 [WARN] Truncating attribute path of 0 diagnostics for TypeSet: timestamp=2021-01-15T1505.440-0800 2021-01-15T1505.440-0800 [INFO] plugin.terraform-provider-aws_v3.24.1_x5: 2021/01/15 1505 [WARN] Truncating attribute path of 0 diagnostics for TypeSet: timestamp=2021-01-15T1505.440-0800 2021-01-15T1505.440-0800 [INFO] plugin.terraform-provider-aws_v3.24.1_x5: 2021/01/15 1505 [WARN] Truncating attribute path of 0 diagnostics for TypeSet: timestamp=2021-01-15T1505.440-0800 2021-01-15T1505.440-0800 [INFO] plugin.terraform-provider-aws_v3.24.1_x5: 2021/01/15 1505 [WARN] Truncating attribute path of 10 diagnostics for TypeSet: timestamp=2021-01-15T1505.440-0800 2021-01-15T1505.440-0800 [INFO] plugin.terraform-provider-aws_v3.24.1_x5: 2021/01/15 1505 [WARN] Truncating attribute path of 0 diagnostics for TypeSet: timestamp=2021-01-15T1505.440-0800 2021-01-15T1505.440-0800 [INFO] plugin.terraform-provider-aws_v3.24.1_x5: 2021/01/15 1505 [WARN] Truncating attribute path of 0 diagnostics for TypeSet: timestamp=2021-01-15T1505.440-0800 2021-01-15T1505.440-0800 [INFO] plugin.terraform-provider-aws_v3.24.1_x5: 2021/01/15 1505 [WARN] Truncating attribute path of 0 diagnostics for TypeSet: timestamp=2021-01-15T1505.440-0800 2021-01-15T1505.440-0800 [INFO] plugin.terraform-provider-aws_v3.24.1_x5: 2021/01/15 1505 [WARN] Truncating attribute path of 0 diagnostics for TypeSet: timestamp=2021-01-15T1505.440-0800 2021-01-15T1505.440-0800 [INFO] plugin.terraform-provider-aws_v3.24.1_x5: 2021/01/15 1505 [WARN] Truncating attribute path of 0 diagnostics for TypeSet: timestamp=2021-01-15T1505.440-0800 2021-01-15T1505.440-0800 [INFO] plugin.terraform-provider-aws_v3.24.1_x5: 2021/01/15 1505 [WARN] Truncating attribute path of 0 diagnostics for TypeSet: timestamp=2021-01-15T1505.440-0800 2021-01-15T1505.440-0800 [INFO] plugin.terraform-provider-aws_v3.24.1_x5: 2021/01/15 1505 [WARN] Truncating attribute path of 0 diagnostics for TypeSet: timestamp=2021-01-15T1505.440-0800 2021-01-15T1505.440-0800 [INFO] plugin.terraform-provider-aws_v3.24.1_x5: 2021/01/15 1505 [WARN] Truncating attribute path of 0 diagnostics for TypeSet: timestamp=2021-01-15T1505.440-0800 2021-01-15T1505.441-0800 [INFO] plugin.terraform-provider-aws_v3.24.1_x5: 2021/01/15 1505 [WARN] Truncating attribute path of 0 diagnostics for TypeSet: timestamp=2021-01-15T1505.441-0800 
2021-01-15T1505.441-0800 [INFO] plugin.terraform-provider-aws_v3.24.1_x5: 2021/01/15 1505 [WARN] Truncating attribute path of 0 diagnostics for TypeSet: timestamp=2021-01-15T1505.441-0800 2021-01-15T1505.441-0800 [INFO] plugin.terraform-provider-aws_v3.24.1_x5: 2021/01/15 1505 [WARN] Truncating attribute path of 0 diagnostics for TypeSet: timestamp=2021-01-15T1505.441-0800 2021-01-15T1505.441-0800 [INFO] plugin.terraform-provider-aws_v3.24.1_x5: 2021/01/15 1505 [WARN] Truncating attribute path of 0 diagnostics for TypeSet: timestamp=2021-01-15T1505.441-0800 2021-01-15T1505.441-0800 [INFO] plugin.terraform-provider-aws_v3.24.1_x5: 2021/01/15 1505 [WARN] Truncating attribute path of 0 diagnostics for TypeSet: timestamp=2021-01-15T1505.441-0800 2021-01-15T1505.441-0800 [INFO] plugin.terraform-provider-aws_v3.24.1_x5: 2021/01/15 1505 [WARN] Truncating attribute path of 0 diagnostics for TypeSet: timestamp=2021-01-15T1505.441-0800 2021-01-15T1505.441-0800 [INFO] plugin.terraform-provider-aws_v3.24.1_x5: 2021/01/15 1505 [WARN] Truncating attribute path of 0 diagnostics for TypeSet: timestamp=2021-01-15T1505.441-0800 2021-01-15T1505.441-0800 [INFO] plugin.terraform-provider-aws_v3.24.1_x5: 2021/01/15 1505 [WARN] Truncating attribute path of 0 diagnostics for TypeSet: timestamp=2021-01-15T1505.441-0800 2021-01-15T1505.441-0800 [INFO] plugin.terraform-provider-aws_v3.24.1_x5: 2021/01/15 1505 [WARN] Truncating attribute path of 0 diagnostics for TypeSet: timestamp=2021-01-15T1505.441-0800 2021-01-15T1505.441-0800 [INFO] plugin.terraform-provider-aws_v3.24.1_x5: 2021/01/15 1505 [WARN] Truncating attribute path of 0 diagnostics for TypeSet: timestamp=2021-01-15T1505.441-0800 2021-01-15T1505.441-0800 [INFO] plugin.terraform-provider-aws_v3.24.1_x5: 2021/01/15 1505 [WARN] Truncating attribute path of 0 diagnostics for TypeSet: timestamp=2021-01-15T1505.441-0800 2021-01-15T1505.441-0800 [INFO] plugin.terraform-provider-aws_v3.24.1_x5: 2021/01/15 1505 [WARN] Truncating attribute path of 0 diagnostics for TypeSet: timestamp=2021-01-15T1505.441-0800 2021-01-15T1505.441-0800 [INFO] plugin.terraform-provider-aws_v3.24.1_x5: 2021/01/15 1505 [WARN] Truncating attribute path of 0 diagnostics for TypeSet: timestamp=2021-01-15T1505.441-0800 2021/01/15 1505 [TRACE] vertex “module.cdn.aws_cloudfront_distribution.default[0]”: visit complete 2021/01/15 1505 [TRACE] vertex “module.cdn.aws_cloudfront_distribution.default”: dynamic subgraph encountered errors 2021/01/15 1505 [TRACE] vertex “module.cdn.aws_cloudfront_distribution.default”: visit complete 2021/01/15 1505 [TRACE] vertex “module.cdn.aws_cloudfront_distribution.default (expand)”: dynamic subgraph encountered errors 2021/01/15 1505 [TRACE] vertex “module.cdn.aws_cloudfront_distribution.default (expand)”: visit complete 2021/01/15 1505 [TRACE] dag/walk: upstream of “provider["registry.terraform.io/hashicorp/aws"] (close)” errored, so skipping 2021/01/15 1505 [TRACE] dag/walk: upstream of “module.cdn (close)” errored, so skipping 2021/01/15 1505 [TRACE] dag/walk: upstream of “meta.count-boundary (EachMode fixup)” errored, so skipping 2021/01/15 1505 [TRACE] dag/walk: upstream of “root” errored, so skipping 2021/01/15 1505 [INFO] backend/local: plan operation completed
Error: Required attribute is not set
on ../fork/terraform-aws-cloudfront-cdn/main.tf line 30, in resource “aws_cloudfront_distribution” “default”: 30: resource “aws_cloudfront_distribution” “default” {
Error: Required attribute is not set
on ../fork/terraform-aws-cloudfront-cdn/main.tf line 30, in resource “aws_cloudfront_distribution” “default”: 30: resource “aws_cloudfront_distribution” “default” {
Error: Required attribute is not set
on ../fork/terraform-aws-cloudfront-cdn/main.tf line 30, in resource “aws_cloudfront_distribution” “default”: 30: resource “aws_cloudfront_distribution” “default” {
Error: Required attribute is not set
on ../fork/terraform-aws-cloudfront-cdn/main.tf line 30, in resource “aws_cloudfront_distribution” “default”: 30: resource “aws_cloudfront_distribution” “default” {
Error: Required attribute is not set
on ../fork/terraform-aws-cloudfront-cdn/main.tf line 30, in resource “aws_cloudfront_distribution” “default”: 30: resource “aws_cloudfront_distribution” “default” {
Error: Required attribute is not set
on ../fork/terraform-aws-cloudfront-cdn/main.tf line 30, in resource “aws_cloudfront_distribution” “default”: 30: resource “aws_cloudfront_distribution” “default” {
Error: Required attribute is not set
on ../fork/terraform-aws-cloudfront-cdn/main.tf line 30, in resource “aws_cloudfront_distribution” “default”: 30: resource “aws_cloudfront_distribution” “default” {
Error: Required attribute is not set
on ../fork/terraform-aws-cloudfront-cdn/main.tf line 30, in resource “aws_cloudfront_distribution” “default”: 30: resource “aws_cloudfront_distribution” “default” {
Error: Required attribute is not set
on ../fork/terraform-aws-cloudfront-cdn/main.tf line 30, in resource “aws_cloudfront_distribution” “default”: 30: resource “aws_cloudfront_distribution” “default” {
2021/01/15 1505 [TRACE] statemgr.Filesystem: removing lock metadata file .terraform.tfstate.lock.info Error: Required attribute is not set
on ../fork/terraform-aws-cloudfront-cdn/main.tf line 30, in resource “aws_cloudfront_distribution” “default”: 2021/01/15 1505 [TRACE] statemgr.Filesystem: unlocking terraform.tfstate using fcntl flock 30: resource “aws_cloudfront_distribution” “default” {
2021-01-15T1505.461-0800 [WARN] plugin.stdio: received EOF, stopping recv loop: err=”rpc error: code = Unavailable desc = transport is closing” 2021-01-15T1505.465-0800 [DEBUG] plugin: plugin process exited: path=.terraform/providers/registry.terraform.io/hashicorp/aws/3.24.1/darwin_amd64/terraform-provider-aws_v3.24.1_x5 pid=99698 2021-01-15T1505.465-0800 [DEBUG] plugin: plugin exited

Do you have a link to the fork and the parameters of the call to it?

No, I hadn’t cleaned it up adequately yet

Was still just trying to get it to work

Is there any way to have it actually tell me what attribute isn’t set?

Not that I’m aware of

Hrm, thanks @

@ I’ve never seen error messages like that ^

are you using terraform?

in general, all Cloud Posse modules require at least one of the following inputs: namespace, environment, stage, name

we use https://github.com/cloudposse/terraform-null-label to uniquely and consistently name all the resources
Terraform Module to define a consistent naming convention by (namespace, stage, name, [attributes]) - cloudposse/terraform-null-label

so the ID of a resource will have the format like: namespace-environment-stage-name

you can skip any of the parameters, but at least one is required

Thanks @Andriy Knysh (Cloud Posse)! I figured it out…. It was a canadianism: I used the word “behaviour” instead of “behavior” in a couple of places deep in the module, and that caused it to error out like that. LOL

I now have a working prototype at least, I’ll see how it goes from here!

ok, I figured it out

is a VPC name unique across regions in the same account?
e.g. can I have a dev VPC in Ireland and another dev VPC in Singapore within the same account?

The vpc name is just a tag, it’s not unique in any way. Can be the same name in the same account, in the same region. There’s no restriction on the name
2021-01-14

If anyone’s using AWS AppMesh here - do you know if there’s a way for the virtual gateway to NOT overwrite the host? we’re using the host in order to do some analytics in the container.. might also be similar for Istio users.

am I blind or is there really no way to get container details for a running task in the new ECS portal?

Going to check it out right now. What details are missing?

all the details that you could access from the collapse menus in the current version – not the best example since it’s an x-ray container, but you’ll get the point

Hah! That’s pretty bad. I’m seeing the same issue in my UI

new or old UI?

new

I left feedback, first time I’ve had to do that because of a UI change

I wonder if it’s because this information is a part of the task definition and they’re trying to not have it in multiple places?

but that’s how my team knows how to find the CloudWatch Log Stream associated with that task..

I do not like the new one

I can appreciate that they’re trying to streamline information, but they dropped the important bits

ok, glad I'm not the only one. Maybe there will be enough reactions to get them to fix it before it becomes the default view

the new UI also removed the links to the monitoring/logs
2021-01-13

I'm having a bit of a wood-for-trees moment with ACM Private CA - it's reasonably straightforward to set up a root and subordinate CA in a single AWS account and RAM-share that out to other accounts that need certificates (although it seems necessary to share both the root and the subordinate for the subordinate to appear in the ACM PCA console in the target account). However, the best practice (https://docs.aws.amazon.com/acm-pca/latest/userguide/ca-best-practices.html) recommends having the root alone in its own account, and subordinate(s) in another account. My problem is that the process of signing the subordinate CA manually when the root CA is in another account is really not clear. The docs cover the case of both in the same account, or using an external CA. Anyone done this before?
Learn the best ways to use ACM Private CA.

Has anyone ever seen the AWS ElasticSearch Service take over an hour to create a domain (i.e. Domain is in “Loading” state and nothing is accessible)? I’ve seen long update / creation times from this service before… but this seems absurd.

Never timed it, but the long creation/update times are something I ran into in the past too.

I remember ES domain maintenance activities taking a very long time, especially if I turned on automated backups

Does anyone need to periodically restart the AWS SSM agent?

no sir

Nope.

nvm, found out it was an old ami from 2018. upgrading the ami seemed to fix the issue.

¯\_(ツ)_/¯

[thread] Troubleshooting — EC2 instance for Windows isn't working with ECS tasks, SSM Session Manager, and RDP reporting that the User Profile Service can't start

The source for this AMI is Packer-built. It's an EBS-based image, encrypted by default, 50 GB.
I notice the launch configuration for this has no detail on the volume; I believe it just uses the defaults from the image.
In the console I see this means the EBS volume is not optimized and not encrypted.
Is this a false track to go down? Pretty sure the instance has permissions for KMS + full SSM and more. I want to know if there are any other ideas before I call it a night.
2021-01-12

Anyone using AWS App Mesh? Thoughts?

I think that it’s a limited implementation of envoy but it’s serving my needs which are also limited currently

I did a POC

only works on awsvpc mode with ECS

so that was a show stopper for me

but I moved it to awsvpc and then I had issues with service discovery; the docs are far from great if you do not run on Fargate
2021-01-11

i'm seeing some funky issues with gRPC and NLBs when rolling pods in EKS. anyone got experience in this area?

We get 500s when rolling ingress-nginx, as the pod process terminates before the NLB health check times out. There's some work upstream around 1.19 but we are still on 1.15... :(
2021-01-08

Hey I want to work with a subdomain on CloudFlare but unfortunately it does not support working with subdomains. What are my options? Will Route53/Cloudfront be useful?

Guys, I've been using AWS SSM Parameter Store for storing credentials like RDS database credentials so that I can access them via API or CLI in my pipelines or infrastructure code. I am thinking of putting my AWS credentials in SSM Parameter Store because my Rails application (which is deployed in ECS via Terraform) demands AWS keys for accessing an S3 bucket. Should I put AWS credentials in SSM? I just feel that it is not the right way to deal with this problem.

https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-iam-roles.html
The key points are:
With IAM roles for Amazon ECS tasks, you can specify an IAM role that can be used by the containers in a task.
Instead of creating and distributing your AWS credentials to the containers or using the EC2 instance's role, you can associate an IAM role with an ECS task definition or the RunTask API operation.
So you need to:
- Create IAM role for your ECS task, grant this role required S3 permissions
- Update your task definition to use this IAM role
- Make sure that your Ruby app uses the AWS SDK's automatic credential configuration
With IAM roles for Amazon ECS tasks, you can specify an IAM role that can be used by the containers in a task. Applications must sign their AWS API requests with AWS credentials, and this feature provides a strategy for managing credentials for your applications to use, similar to the way that Amazon EC2 instance profiles provide credentials to EC2 instances. Instead of creating and distributing your AWS credentials to the containers or using the EC2 instance’s role, you can associate an IAM role with an ECS task definition or
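same idea in Python just for illustration (the Ruby SDK resolves credentials the same way): with a Task Role attached, nothing is hard-coded, the SDK picks the credentials up from the container credentials endpoint. The bucket name is a placeholder.

# Sketch: no access key / secret key anywhere; boto3 falls back to the task role
# exposed via the container credentials endpoint (169.254.170.2).
import boto3

s3 = boto3.client("s3")
for obj in s3.list_objects_v2(Bucket="my-app-bucket").get("Contents", []):
    print(obj["Key"])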

Nice. I did not know this option was available. Thanks.

When I am using this task execution role, will this allow me to run AWS CLI commands? For example, will I be able to run aws s3 ls inside the container running my Ruby app?

- You're mixing up the Task Execution Role (https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task_execution_IAM_role.html) and the Task Role. The Task Execution Role is used by the container agents; the Task Role is used by the containers in your task. You grant the Task Execution Role permissions to access your secrets, pull images from ECR, write logs, etc. And you grant the Task Role the permissions your applications need (e.g. s3:ListBucket etc.).
- The AWS CLI uses the AWS SDK under the hood and can also auto-discover the credentials (https://docs.aws.amazon.com/cli/latest/topic/config-vars.html#using-aws-iam-roles). No need to hard-code them.
The task execution role grants the Amazon ECS container and Fargate agents permission to make AWS API calls on your behalf. The task execution IAM role is required depending on the requirements of your task. You can have multiple task execution roles for different purposes and services associated with your account.

In my code I created one role and passed the same role ARN for both Task Role and Task Execution Role. Thanks for clarifying I did not know the difference.

I logged into the ECS instance and then into my task (docker exec -it <containerid> bash) and tried curl 169.254.170.2$AWS_CONTAINER_CREDENTIALS_RELATIVE_URI. It returned the same role ARN which is passed in the task-role parameter in my infra. So I thought aws s3 ls should work, but it returned an aws command not found error.

Well obviously your container image should have aws cli installed if you want to use it inside the container.

Yes, I thought the ECS AMI would have it installed. My issue is resolved. Thanks

2021-01-07

Hi, I am trying to find a tool/vault to manage my passwords (a tool like Roboform or LastPass). I am looking for an open-source tool like this which I can configure on my Linux EC2 instance and access via a UI. Is there any tool like this?


Yes, I explored:
- BitWarden
- Passbolt
- HyperVault
What do you think, is it a good option to self-host the service? I mean we'll probably need to provision an RDS database along with the EC2 instance. I have a feeling that the cost may increase. Password managers like Roboform cost around 3 USD per user per month. So I am a little confused about what to use: self-hosted, or buy a plan for my organization?

it depends on your use case. how many people are going to use it? I have only used BW for personal usage so I cannot really speak enterprise-wise

Around 20 to 50 users

Hashicorp Vault = opensource + UI + production grade solution

That’s also one option. I’ll explore more @ Thanks.
2021-01-06

bump, going to tackle this in the next couple of days, so I would love to know your experience

https://docs.aws.amazon.com/systems-manager/latest/userguide/distributor.html
this? never heard of it but sounds interesting
Create, manage, and deploy software packages on Systems Manager managed instances.

I have a system in place with Choco + auto builds, but I'm trying to eliminate using multiple toolchains for each. If I can wrap it all up into a single package I'll probably have better luck maintaining it

Having an internal debate and I’m curious what your guys’ thoughts are: How many people do you think there are (worldwide) who are using Terraform with AWS today? Please include your rationale for your answer.
Some stats:
• The AWS Provider github repo has 5k stars and 2k contributors.
• The AWS Provider has been downloaded 216.2M times.
• This channel has 2,221 members.

engineers who know and actively write HCL or total number of developers that inherently are “using” Terraform?

because I'd wager that those two numbers are dramatically different

Oh great question! Let’s focus on the first one - those actually writing HCL. It can then be a proxy to the second one.

I’d guesstimate that it’s somewhere around 250:1 in regards to users:contributors – so half-million? (my initial 500:1 sounded silly after I thought about it)

how many people is also different from how many infrastructures are managed. we're ~8 in my company but we manage dozens of different stacks

8 what - HCL developers, total developers, total people?

8 people who write HCL every day

for 150 developers

So I would count 8 users in this case. This is a theoretical question, but you can then derive a bunch of different things from it - like how many companies use AWS+TF, how many developers in the world indirectly use AWS+TF (like the 150 developers you mentioned), etc.
Once we can answer these questions, we can compare AWS+TF’s usage with that of other open source technologies and see what level of adoption it has.

I think I've already answered a survey with that kind of question

Where?

I’m guessing the number is at least 100,000 but no more than a 1,000,000 just for everyday folks that use Terraform w/ AWS. I think you could easily cross the 2-3 mil threshold if you include folks who do simple POCs and make minor contributions to custom modules. I don’t know how to make an accurate guesstimate further than this. Is this how many people have used it over the lifetime of the products in question, or people that have used it within the last week, month, year?

according to the Google Machine there were 21m software developers in the world in 2016 – if you estimate 24m today… I'd be shocked to learn that more than 2% of engineers write HCL actively – so I'm sticking with my 500k number

@ can’t remember sorry

At @Issif’s company, 5% of developers write HCL, but maybe they’re a more advanced company. Larger companies, with thousands of developers, probably have an even lower percentage of developers who write HCL. I wonder if 2% is too high even.
@MattyB good point, the focus is on “active” users, so let’s say they use it on a regular basis (weekly).

Another vote for ~100k Terraform developers in the whole world. So perhaps 70k Terraform AWS devs

@ what’s your rationale for that number?

I estimate 1% of software developers write Terraform, plus Darren’s number of 20m software devs globally

So I tried validating that 1% or 2%. My closest approach is to look at GitHub repos stats: https://github.com/oprogramador/github-languages#all-active
If you focus only on repos that have been active for the year before the report ran (given HCL is new and growing quickly), you’ll see that far far less than 1% of all repos have HCL.
Can this be a reliable proxy for determining the percentage of developers who work with HCL?

I don’t know. Multiple repos can be owned by a single developer. Maybe if you could extrapolate people who have contributed to all repos you could get a minimum number?

@ are you saying 100k terraform AWS developers or just terraform? If that’s just terraform then there should be far fewer terraform AWS since it’s used for a ton of providers

@ that’s clever and on reflection yes I think 1% estimate was way too high.

Continuing along the GitHub line of thinking, GitHub reported 40M users a year ago. Let’s say it’s closer to 45M now. 1% would be 450,000. As I said, I think it’s far less than 1%. At 0.25% for example, it would support the 100,000 number Alex used.
I took the table from the link above and summed the total number of repos, it’s 18.5m in total (yes, I know many repos can have multiple languages, but trying to get a broad stroke). Of those, 0.21% have HCL.
So 0.21% of 45M is 94,500, close to the 100,000 number.

I think a lot of github metrics can underweight big “old” languages like Java and C++, and overweight modern github-first tools like Terraform. So IMO that 0.21% might still be an overestimate

I also think counting 100% of GitHub users as people who are actively writing code is a large assumption

Good point @Darren Cunningham. I wonder if there's a way to answer what % of GitHub users are coders, and what % of coders have GitHub accounts.

% are abandoned accounts

GitHub claiming 40m users is a publicity stunt IMO

How do you define abandoned accounts?

My Digital Ocean account that I haven’t logged into since 2010

technically I’m “a user” of their platform

LOL

how many people have just created a new account rather than updated their email in GitHub?

how many have 2 accounts…I know I do

So do I actually. I also work with a guy that has 3 or 4.

using number of accounts seems like you’re creating a new complexity factor that’s way out of control

Yeah, I agree. So if we go back to the 20m or so developers, and 0.21%, we’re looking at 42k HCL devs

Hi All - any help greatly appreciated (wasn’t sure which channel was best for this one)
https://sweetops.slack.com/archives/CB6GHNLG0/p1609975378080100
Hi All - first post so excuse if silly :)
Wondering what's the best practice for creating Kafka topics post-cluster-creation using the Cloud Posse MSK module?
AWS doesn't appear to support anything directly on MSK and even references the Apache shell scripts (here: https://docs.aws.amazon.com/msk/latest/developerguide/msk-configuration-properties.html#msk-topic-confinguration, which points to https://kafka.apache.org/documentation/#topicconfigs).
If it's really CLI-only, is it possible to run a template file after the MSK cluster is created to run the shell scripts? e.g.
$ bin/kafka-topics.sh --bootstrap-server localhost:9092 --create --topic my-topic --partitions 1 \
--replication-factor 1 --config max.message.bytes=64000 --config flush.messages=1
Thanks for any help
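one alternative to shelling out to kafka-topics.sh (e.g. from a local-exec provisioner) is a small script using kafka-python's admin client, run from somewhere with network access to the brokers. Rough sketch; the bootstrap broker address and topic settings are placeholders, and it assumes a plaintext listener:

# Sketch: create the topic via the Kafka admin API instead of the shell scripts.
from kafka.admin import KafkaAdminClient, NewTopic  # pip install kafka-python

admin = KafkaAdminClient(bootstrap_servers="b-1.my-cluster.abc123.kafka.us-east-1.amazonaws.com:9092")
admin.create_topics([
    NewTopic(
        name="my-topic",
        num_partitions=1,
        replication_factor=1,
        topic_configs={"max.message.bytes": "64000", "flush.messages": "1"},
    )
])
admin.close()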
2021-01-05

In https://github.com/cloudposse/reference-architectures#3-delegate-dns, can someone explain: "An available domain we can use for DNS-based service discovery (e.g. ourcompany.co). This domain must not be in use elsewhere, as the master account will need to be the authoritative name server (SOA)."
[WIP] Get up and running quickly with one of our reference architecture using our fully automated cold-start process. - cloudposse/reference-architectures

does anyone have a "best practice" for where to keep the S3 bucket for an account's access logs: in the account itself, or in the security account and replicate?

I pretty much asked about this a few weeks ago…
https://sweetops.slack.com/archives/CCT1E7JJY/p1608486517167200
I’m thinking about creating an account dedicated to storage within my AWS Org –
My initial plan is to consolidate the S3 bucket sprawl I have between accounts into a single bucket within that account, giving my Org access to write and each account access to read its files using a prefix condition on the bucket policy.
Bucket sprawl for service logs: VPC Flow, S3 Access, CloudFront logs, etc. – application buckets would remain in their accounts… not looking to fix that sprawl.
Any words of wisdom or horror stories that I should be aware of before I embark on my quest?

the responses weren’t threaded though

I loved the term "best practice" before; now I hate it. There is no best practice, meaning one way to do a thing; there are patterns, which means multiple valid ways to do it. Every use case is unique, so the answer depends on your use case:
• single AWS account - use the same account.
• multiple AWS accounts - create the access-log S3 bucket in the logging account/security account/audit account and send data there.
In both cases set up retention settings and archiving.

@ I completely agree with you, I hate the term as well; I just wondered what people do and why

Anyone use AWS Distributor for packages? I prefer Choco and DSC, but I need to create a Datadog package and want to cover Linux + Windows. I'd like to know if various distros can be handled easily, configuration etc., and overall whether I'd hit any problems using it or if it's smooth sailing.
Otherwise I have to do a mix of Ansible + DSC and more, and it's unlikely others will be comfortable with that.
In addition, while I'm a fan of Ansible, I primarily use AWS SSM to manage a mix of Windows and Linux instances, and at this time AWS SSM will only run playbooks for Linux.

We use cloudsmith

For the packages themselves
2021-01-04

there's no db Slack channel, so I'm asking here since I'm using RDS (and they're deprecating support for my Postgres version). Anyone that's done the Postgres 9 -> Postgres 10/11 migration have any gotchas we should be concerned about when doing it?

Ah geez — When are they deprecating 9 exactly?

we did a migration from 10 to 11 and it was fine, but that all depends on whether you are using version-specific features; from what I have seen this happens more on the MySQL side than Postgres

they usually do not have breaking changes

not helpful as to the “how”, but as to the “why”…when deciding to upgrade PostgreSQL this is a great resource: https://why-upgrade.depesz.com/show?from=9.6.20&to=12.4&keywords=




Have you ever wondered what it would feel like to change planes while it is in flight? I for sure do not want to be on that flight, but…

this is the guts of it. we ended up enhancing this process using scripts

but it’s all predicated on “can you allow downtime.. or not?”

9.5 is being deprecated right now. There's a couple of weeks remaining before AWS marks it EOL

We’re on 9.5 — Feb 16 @Matt Gowie
Upgrade your RDS for PostgreSQL 9.5 databases before Feb 16, 2021
The RDS for PostgreSQL 9.5 end-of-life date is approaching. Your database is using a version that must be upgraded to 12 or higher as soon as possible. We plan to automatically upgrade RDS for PostgreSQL 9.5 databases to 12 starting February 16, 2021 00:00:01 AM UTC.

Ah 9.5 — Thanks man

@rms1000watt how much resource pressure happened (read iops, throughput, etc) on the source DB during the 4 days of synchronization?

it wasn't bad, only because we overprovisioned to be careful of this situation

like we increased IOPS, CPU, etc

(to be fair, the most kudos need to be given to @Ronak for the first migration, and these subsequent two migrations. He’s the real mastermind.)

but there were a few tests beforehand to measure the effects of a migration on a DB of our size

(like stand up a second DB from a snapshot with the same resources and try the migration from that, just to test how it’d behave)

do you mind giving the details of your db size?

24xlarge .. waaaaaay overprovisioned

is that a typo?

jeez

didn't even know 24xlarges exist


so our 2xlarge will migrate in a much shorter amount of time


yea, we had like.. 3TB?

We can afford downtime on the weekends as long as we put together some language for our customers. I’m tempted to spin up a second RDS instance w/ snapshot, upgrade the db to pg11 and do the cutover at the route53 level.

Did you guys run into any backwards incompatible changes at the application level?


but if you can afford downtime, don’t even mess with pglogical

for the DBs we could afford downtime we just did:
• RDS upgrade from 9.6 -> 11.8
• Create Aurora Read Replicas from the 11.8
• Promote the Aurora Read Replica
• Cut over the application configs to the Aurora endpoint

actually we went to latest 11.9, but it’s the same process
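for reference, the in-place major-version bump in the first step is just a modify-db-instance with AllowMajorVersionUpgrade (expect downtime); rough boto3 sketch, the instance identifier and target engine version are placeholders:

# Sketch: in-place major version upgrade of an RDS PostgreSQL instance.
import boto3

rds = boto3.client("rds")
rds.modify_db_instance(
    DBInstanceIdentifier="my-postgres",
    EngineVersion="11.9",
    AllowMajorVersionUpgrade=True,
    ApplyImmediately=True,  # otherwise the upgrade waits for the next maintenance window
)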

awesome. your two-way replication solution is definitely a thing of beauty. but I do think I can afford to spin up a second RDS instance over the weekend, upgraded to pg11, and do the cutover at Route53. my solution probably won't warrant a cool Medium blog post about it though.
