SweetOps #aws for June, 2020

Discussion related to Amazon Web Services (AWS)

Archive: https://archive.sweetops.com/aws/

2020-06-01

msharma24

Do we have any recommendations on how to write into a DynamoDB table using EMR? We are using the com.audienceproject1.0.1 lib and it takes ages to write into DynamoDB. Even though I set provisioned WCU to 100, the consumed write capacity stays < 8.
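For reference when reading the consumed-capacity figure: one WCU covers one standard write per second for an item of up to 1 KB, and larger items round up to the next KB. A minimal sketch of that arithmetic follows; the function names are illustrative and not part of any library.

```python
import math


def wcu_per_item(item_size_bytes):
    """WCUs consumed by one standard write: 1 per KB, rounded up, minimum 1."""
    return max(1, math.ceil(item_size_bytes / 1024))


def required_wcu(item_size_bytes, writes_per_second):
    """Provisioned WCUs needed to sustain a given standard write rate."""
    return wcu_per_item(item_size_bytes) * writes_per_second
```

Under this arithmetic, a consumed capacity below 8 with 1 KB items corresponds to fewer than 8 writes per second reaching the table, which would point at the writer side rather than the provisioned limit.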

2020-06-02

Prasad

04:46:28 PM

Hi Folks..the certificate things are pretty confusing if we are new ..are there any difference in usage of the below 2 cli commands? aws iam upload-server-certificate ….. and aws acm import-certificate….?

Matt Gowie

04:57:08 PM

@Prasad ACM cert management and IAM cert management are two different things. You’re correct that it’s confusing, but I would suggest you investigate the two solutions.

Matt Gowie

04:57:57 PM

ACM is definitely the suggested route, but I believe it’s not available in all regions.

Prasad

05:03:11 PM

if i need to upload a internal custom signed cert and assign to a Internal LB which one do we prefer ..? Internet is restricted for us and white listing is being done only when its absolutely necessary. Can we use Endpoint for acm service and perform this? i believe aws iam has only public endpoints..Have you come across any pointers that explains the difference between the 2…Thanks again!

07:05:26 PM

Regarding the “Amazon ECS ARN and resource ID settings”. Has anyone had any issues turning on the new arn format for ECS container instance, service, or task ?

07:12:23 PM

@jose.amengual what broke or what was difficult ?

jose.amengual

07:12:36 PM

 aws ecs put-account-setting-default --name containerInsights --value enabled --region us-west-2
 aws ecs put-account-setting-default --name serviceLongArnFormat --value enabled --region us-west-2
 aws ecs put-account-setting-default --name taskLongArnFormat --value enabled --region us-west-2
 aws ecs put-account-setting-default --name containerInstanceLongArnFormat --value enabled --region us-west-2
 aws ecs put-account-setting-default --name awsvpcTrunking --value enabled --region us-west-2

jose.amengual

07:12:46 PM

you need to do that

jose.amengual

07:12:51 PM

on the cli

jose.amengual

07:13:12 PM

it is an account-wide setting; I think you could use AWS Organizations to do that

07:13:28 PM

right because its region specific. i thought it was an account wide setting too

jose.amengual

07:13:47 PM

it is pretty stupid

07:13:58 PM

so wait after you enabled it, did you see any issues with the new arn format?

jose.amengual

07:14:11 PM

awsvpcTrunking

and containerInsights is not needed for the ID thing
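Since the setting has to be applied per region, the commands above could be generated for every region in use. A minimal sketch, assuming a hypothetical region list and including only the three long-ARN settings (the thread notes the other two are not needed for the ID change):

```python
import shlex

# The three account settings that control the new (long) ARN/ID format.
LONG_ARN_SETTINGS = [
    "serviceLongArnFormat",
    "taskLongArnFormat",
    "containerInstanceLongArnFormat",
]


def build_commands(regions, settings=LONG_ARN_SETTINGS, value="enabled"):
    """Return one put-account-setting-default command per region and setting."""
    commands = []
    for region in regions:
        for name in settings:
            args = [
                "aws", "ecs", "put-account-setting-default",
                "--name", name,
                "--value", value,
                "--region", region,
            ]
            # Quote each argument so the line is safe to paste into a shell.
            commands.append(" ".join(shlex.quote(a) for a in args))
    return commands
```

Calling build_commands(["us-west-2", "us-east-1"]) yields six command lines that can be reviewed before running them.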

jose.amengual

07:14:31 PM

not issues at all

jose.amengual

07:14:39 PM

I did this in a very old prod account

jose.amengual

07:14:42 PM

nothing happened

07:15:46 PM

awesome! that makes me feel a lot better

07:15:53 PM

seems easy enough to opt out anyway

Matt Gowie

09:13:52 PM

They were supposed to make these the default settings earlier this year, but then they backed out. It’s a real pain.

jose.amengual

07:09:06 PM

yes

Zach

12:57:03 AM

Is there some way to convince AWS console that yes, I will use their new interfaces, please stop reverting to the old one?

Maciek Strömich

07:17:09 AM

don’t mess with your cookies?

Zach

12:31:36 PM

To my knowledge I haven’t been

2020-06-03

mado

09:43:43 AM

My client just using AWS like RDS to save their customer data (financial service), how to improve the security on it? Any best practice?

msharma24

10:55:30 AM

I would start here https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_BestPractices.html

Best Practices for Amazon RDS - Amazon Relational Database Service

Follow these best practices for working with Amazon RDS.

03:53:32 PM

any suggestions on tagging policies ? we use

Name - for whatever
env - for production, staging, etc
role - function
team - team name which corresponds to slack channel

03:54:03 PM

we have a new one from terraform called git_repo which is dynamic based on where the terraform was applied

03:55:32 PM

The https://github.com/cloudposse/terraform-null-label also has namespace , stage, and attributes tags
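A tagging convention like the one described here can be checked mechanically. The sketch below is only an illustration: treating all four keys as mandatory and limiting env to two values are assumptions, not something stated in the thread.

```python
# Tag keys taken from the convention described in the thread; treating all of
# them as mandatory is an assumption made for this sketch.
REQUIRED_TAGS = ("Name", "env", "role", "team")
ALLOWED_ENVS = {"production", "staging"}  # the thread says "etc"; extend as needed


def missing_tags(tags):
    """Return the required tag keys that are absent or empty in `tags`."""
    return [key for key in REQUIRED_TAGS if not tags.get(key)]


def validate_tags(tags):
    """Return a list of problems; an empty list means the tags pass."""
    problems = [f"missing tag: {key}" for key in missing_tags(tags)]
    env = tags.get("env")
    if env and env not in ALLOWED_ENVS:
        problems.append(f"unexpected env value: {env}")
    return problems
```

A check like this could run in CI against planned resources before they are applied.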

Santiago Campuzano

03:54:25 PM

My recommendation on tagging policy…. not too much, not a couple

Santiago Campuzano

03:54:51 PM

We have a Billing tag

03:55:49 PM

what is the value of Billing tag ?

Santiago Campuzano

03:56:27 PM

It depends… normally it’s gonna be the team/system that’s gonna pay for that cloud service

Santiago Campuzano

03:56:45 PM

So we can charge back and split the AWS bill based on that

Maciek Strömich

04:03:58 PM

we have a similar tag. our accounting provided cost allocation numbers for teams and we use those to tag resources in shared accounts. if an application is using a separate account (e.g. all production environments are separated) it’s automatically attached to a certain number and we don’t have to do it (but we still do to simplify our cf templates)

07:01:08 PM

thats pretty cool

07:01:23 PM

i also found this nice doc on best practices

https://d1.awsstatic.com/whitepapers/aws-tagging-best-practices.pdf

07:02:34 PM

does anyone use tags for access control ?

Santiago Campuzano

07:02:58 PM

We do…

Santiago Campuzano

07:03:32 PM

We use AWS Session Manager

07:06:05 PM

ah interesting. found this example for that.

https://docs.aws.amazon.com/systems-manager/latest/userguide/getting-started-restrict-access-examples.html#restrict-access-example-instance-tags

Santiago Campuzano

03:55:02 PM

For costs/billing purposes

David

04:17:15 PM

AWS seems to recommend putting IAM Policies onto Groups, then adding IAM Users to Groups, as opposed to directly adding Policies to the Users.

But this seems in conflict with having a non-increasable service limit where IAM Users can only be in 10 Groups.

How do you all manage situations where you feel a User should be in >10 Groups?

Matt Gowie

04:32:39 PM

You can put 10 policies in one group, so I’ve honestly never needed to associate a user with more than 100 policies.

jose.amengual

04:50:08 PM

we hit that limit already

jose.amengual

04:50:26 PM

we have users in more than 10 groups

jose.amengual

04:50:37 PM

and more than 10 policies per group

Matt Gowie

04:50:55 PM

AWS limit that you can request to be bumped then?

jose.amengual

04:51:02 PM

We are looking at ways to use Service policies for the ORG

jose.amengual

04:51:22 PM

no, it can’t be bumped, that is what they told us

jose.amengual

04:51:27 PM

it is a hard limit

Matt Gowie

04:52:04 PM

So you have users who you want to put in more than 10 groups and add more than 10 policies to those groups is what you’re saying?

jose.amengual

04:54:10 PM

mmmm no I think I’m mistaken, we have more than 10 team/groups and some users need to belong to more than 10 groups so we attach policies directly to the users

jose.amengual

04:54:21 PM

it is a mess

jose.amengual

04:59:02 PM

you can’t add users to more than 10 teams

Matt Gowie

06:37:12 PM

Ah that is a mess. So @David’s concern is valid and you’ve already run up against it. That sucks.

Jonathan Parker

03:40:36 AM

Can’t you create new pseudo groups for those that need to be in more than 10 groups? You might require some automation to manage the pseudo groups.

Jonathan Parker

03:42:29 AM

For example the automation would take two groups and make a new Group1_Group2 group with the policies from both. Would need to listen for updates to Group1 or Group2 to sync the policies over.
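The merge step of that idea might look roughly like the sketch below. It only computes the combined name and policy set from plain data; the function names are hypothetical, and the limit of 10 policies per group is taken from the discussion above rather than verified here.

```python
def combined_group_name(group_names):
    """Build a deterministic name such as 'Group1_Group2' from the source groups."""
    return "_".join(sorted(group_names))


def merge_group_policies(group_policies, source_groups, max_policies=10):
    """Return (name, policy ARNs) for a pseudo group combining `source_groups`.

    `group_policies` maps a group name to the policy ARNs attached to it.
    Raises ValueError if the merged set exceeds `max_policies`.
    """
    merged = set()
    for group in source_groups:
        merged.update(group_policies[group])
    if len(merged) > max_policies:
        raise ValueError(f"{len(merged)} policies exceed the limit of {max_policies}")
    return combined_group_name(source_groups), sorted(merged)
```

Note that merging can itself run into the per-group policy limit, so the pseudo-group approach trades one limit for another.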

Tim Birkett

01:13:00 PM

That sounds like hell @jose.amengual - Perhaps you need to simplify your users / groups configuration or move to something like Okta / SSO and externalise user / group management instead allowing users to assume roles based on groups in the Identity Provider.

Tim Birkett

01:13:47 PM

It’s been a long time since I’ve had any AWS accounts that contain actual AWS users in them.

Zach

02:39:15 PM

We’re using assumed roles only, same deal. I have a few ‘legacy IAM Users’ that I’ve been giving a countdown to for their access to be removed

jose.amengual

04:34:22 PM

it is a mess, it has been done for a long time by hand and now we have a team of people starting to automate, finally

Tim Birkett

05:59:55 PM

Ah… the joys… How are your groups organised? Around business function or capability? I’m guessing all the users use a single account?

For the users with 10 groups you might be better off either consolidating their permissions into more powerful groups or creating roles that have the policies attached and allow those users to assume those roles.

Tim Birkett

06:03:02 PM

I think that anything you do will be coding around the problem and your account(s) is probably due a bit of an IAM refactor. I’ve done similar things with various places migrating from IAM users / groups to IAM users / roles to SSO / roles with permissions boundaries.

These things are always scary but can be done gradually allowing experimentation and learning. You can enable SSO and run it alongside plain old IAM users. There are a few good articles out there on the subject (https://segment.com/blog/secure-access-to-100-aws-accounts/) and lots of tooling to help your users (aws-vault, aws-azure-login, aws-okta). Maybe it’s worth a chat with your friendly local AWS TAM to get some ideas?

That said, I don’t know how big the real problem is for you or if it warrants the work needed to refactor your accounts / IAM. It would be satisfying for you once it’s “done”

Secure access to 100 AWS accounts

The Segment team’s latest thinking on all things data, product, marketing, and growth.

jose.amengual

06:19:30 PM

we have groups that we attach roles to it so it is not soooo bad but there is a mix of user with policies attached, users attached only to groups or direct attached policies

jose.amengual

06:19:51 PM

so we need to cleanup and take one approach

2020-06-04

2020-06-05

sahil kamboj

08:14:54 AM

Quick question Do we need security group (ports open) to talk internally in vpc services

ikar

08:34:12 AM

you can use “default” security group for that. all machines with this group can see each other in internal communication

sahil kamboj

08:54:12 AM

what about services like rds

ikar

10:02:22 AM

same with rds, it has security groups too

sahil kamboj

11:42:58 AM

so you are saying default one have special permission to talk with in vpc

ikar

11:54:15 AM

well, not a permission, but an inbound rule that says: allow all traffic from the same group (must be in same region)
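The rule being described is a self-referencing ingress rule: the source is the security group itself rather than a CIDR range. As a rough illustration, here is that rule expressed in the IpPermissions shape the EC2 API uses; the helper name and the group ID in the test are made up.

```python
def self_referencing_rule(group_id):
    """Ingress permission allowing all traffic from members of the same group."""
    return {
        "IpProtocol": "-1",  # -1 means all protocols and ports
        "UserIdGroupPairs": [{"GroupId": group_id}],
    }
```

Because the source is a group and not an address range, only resources that carry this security group can reach each other through it.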

jose.amengual

04:48:44 PM

It is bad practice to use the default vpc, usually default vpc is fully open, inbound and outbound

jose.amengual

04:49:43 PM

in our case I think we delete it or at least change the rules to not allow anything and then we create SG for each resource

jose.amengual

04:50:28 PM

and you always need an SG to open ports to talk to other services inside AWS

01:44:09 PM

anyone get logging working between aws and datadog ? looking at datadog’s ecs_fargate#log_collection and considering fluentbit over the lambda log forwarder

Aleksandr Fofanov

02:02:35 PM

Yep, the fluentbit approach is working like a charm. The only downside is that you should have sidecar (which needs resources) for every workload that need to ship logs to datadog

02:16:10 PM

ah fantastic. i was planning to do the following, add a datadog sidecar container, use the fluent configuration from that document, and is there any additional configuration ?

02:16:25 PM

do i have to manually open the tcp port on the agent or can i just add a port mapping for that ?

Aleksandr Fofanov

02:21:56 PM

Nope, sidecar can talk to the main container in the task and vice versa without any additional rules or port openings. And obviously the task should be able to communicate with datadog’s intake for logs (i.e. have egress rule in sg)

02:23:43 PM

perfect!

02:23:58 PM

so the firelens and fluent portion, firelens is built into aws, but for fluent, do i need to install fluent ?

Aleksandr Fofanov

02:28:55 PM

Firelens is a log driver that allows to route logs and fluentbit (in a sidecar container) receives them and forward to the destination (datadog in your case). You don’t need to install anything, you should only change your task definition as suggested in the docs you’ve mentioned
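To make the task definition change concrete, here is a sketch of the two container definitions involved, built as plain Python data. The option names and the intake host follow my recollection of the Datadog ECS Fargate log collection doc mentioned above and should be checked against it; the image tag, container names, and the literal API key argument are assumptions for illustration.

```python
def build_container_definitions(app_name, app_image, datadog_api_key):
    """Return ECS container definitions: a Fluent Bit log router plus the app."""
    log_router = {
        "name": "log_router",
        "image": "amazon/aws-for-fluent-bit:latest",  # assumed image tag
        "essential": True,
        # The FireLens configuration marks this container as the log router.
        "firelensConfiguration": {
            "type": "fluentbit",
            "options": {"enable-ecs-log-metadata": "true"},
        },
    }
    app = {
        "name": app_name,
        "image": app_image,
        "essential": True,
        # The awsfirelens driver hands the app's stdout/stderr to the router.
        "logConfiguration": {
            "logDriver": "awsfirelens",
            "options": {
                "Name": "datadog",
                "apikey": datadog_api_key,
                "Host": "http-intake.logs.datadoghq.com",
                "dd_service": app_name,
                "dd_source": app_name,
                "TLS": "on",
                "provider": "ecs",
            },
        },
    }
    return [log_router, app]
```

In practice the API key would more likely be injected from a secret than passed as a literal; that part is left out to keep the sketch short.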

02:29:44 PM

ah so it sounds like i need fluentbit sidecar with my app as a sidecar for sending logs

02:29:57 PM

i figured if i wanted apm, i also needed the ddagent as a sidecar too

02:30:08 PM

so then ddagent, fluentbit, and my app

Aleksandr Fofanov

02:33:48 PM

Not necessarily, AFAIK you can also have ddagent deployed as a service and configure your app to send metrics and traces to this service. So sidecar for ddagent in your app’s definition is not a must.

Steven

02:34:57 PM

AWS has a fluent image designed to work with firelens. Makes it easy to setup. You’ll need to add datadog plugin to it or route all traffic through an aggregator that has the plugin

03:22:59 PM

interesting so perhaps to make our fargate more scalable, it would be best to deploy the ddagent as a service in my cluster, and then have all my fargate services (app + fluent with firelens config) configured

03:23:23 PM

so that way ddagent will get apm for the fargate services and each fluent sidecar with firelens config will send the logs to datadog

Aleksandr Fofanov

03:40:03 PM

yes

03:44:23 PM

we have multiple regions, multiple clusters. would every cluster in a region need its own ddagent service ?

Aleksandr Fofanov

04:01:07 PM

Yes, each instance of ddagent would monitor the cluster it’s deployed to.

06:31:27 PM

i got fluent to work appropriately using a fluent sidecar

06:31:40 PM

running into this now. hoping someone from this thread might be able to weigh in

06:31:40 PM

https://sweetops.slack.com/archives/CCT1E7JJY/p1592504977306200

Regarding fluent-bit pushing logs from ECS to Datadog, we noticed there is a log key being used for every log entry. Is there a way to rename this to datadog’s expected msg key ?

Joseph Ashwin Kottapurath

02:33:22 AM

hey everyone, I am having an issue with an AWS deployment. I am very new to both AWS and to terraform. I get this error when I apply the terraform configuration. I am using the default VPC with a security group with the following configuration:

ingress {
    from_port       = 443
    to_port         = 443
    protocol        = "tcp"
    cidr_blocks     = ["0.0.0.0/0"]
  }

  ingress {
    from_port       = 80
    to_port         = 80
    protocol        = "tcp"
    cidr_blocks     = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

Joseph Ashwin Kottapurath

02:34:46 AM

https://stackoverflow.com/a/30140563/5356465 according to this, the issue could be with the VPC, but this is the default VPC, isn’t it supposed to have internet access by default? I haven’t associated this with any VPC at all

LaunchWaitCondition failed. The expected number of EC2 instances were not initialized within the given time

The error message is: Stack named ‘awseb-e-r3uhxvhyz7-stack’ aborted operation. Current state: ‘CREATE_FAILED’ Reason: The following resource(s) failed to create: [AWSEBInstanceLaunchWaitCondit…

Joseph Ashwin Kottapurath

02:41:34 AM

is this because of the VPC or is it related to something else? do you guys need any other information? if it’s related to VPC, is it because I am using the default VPC and aren’t associating the Elastic Beanstalk environment with any VPC?

maarten

06:17:45 AM

I’m looking for experiences with the Amazon Partner Network ? Are there freebies involved with certification ? Other benefits ?

Igor

02:21:37 PM

You get credits that exceed the membership cost

2020-06-06

David Medinets

07:05:44 PM

Hello. Just found you. Would you have any idea about using the AWS CloudWatch Agent on Fedora CoreOS? I have the SSM Agent working but I want to copy log files to CloudWatch and SSM is deprecated. Wish I knew that last night.

jose.amengual

07:17:52 PM

SSM is deprecated?

jose.amengual

07:19:22 PM

the log agent part, yes

David Medinets

11:51:08 PM

I figured out how to get both ssm and cloudwatch agents running on FCOS.

2020-06-07

07:59:59 PM

Any objections with storing secrets using kms encryption in s3?

jose.amengual

09:19:18 PM

I think it is better to use Parameter Store or Secrets Manager

10:53:52 PM

Doesn’t that cost a lot more compared to s3? I do like the idea of secret rotation with secrets manager

loren

11:42:26 PM

parameter store is free, unless you need the “advanced” parameter feature

loren

11:42:43 PM

secrets manager does seem rather pricey in comparison

loren

11:46:30 PM

personally i see nothing wrong with kms-encrypted s3 objects. i think it is a fantastic option. to me, really the only difference between the three are their APIs, and any integration with other aws services (for automatically retrieving/decrypting values, e.g. cloudformation + parameter store)

jose.amengual

03:57:50 AM

Parameter store and Secret manager have version for secret, use KMS for the encryption etc

jose.amengual

03:58:11 AM

SM is 3x more expensive than Parameter store

jose.amengual

03:58:45 AM

s3 buckets can be set public very easily so having secrets could be a risk

mfridh

06:56:52 AM

Also, Parameter Store have size restrictions so you’re often forced to use S3 anyway.
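The size restriction can be turned into a simple routing rule. This sketch uses the Parameter Store value limits of 4 KB (standard tier) and 8 KB (advanced tier); the function name and the returned labels are made up for illustration, and cost or rotation needs are deliberately ignored.

```python
STANDARD_PARAMETER_MAX_BYTES = 4 * 1024  # standard-tier parameter value limit
ADVANCED_PARAMETER_MAX_BYTES = 8 * 1024  # advanced-tier parameter value limit


def choose_secret_store(value):
    """Suggest where a secret could live based only on its encoded size."""
    size = len(value.encode("utf-8"))
    if size <= STANDARD_PARAMETER_MAX_BYTES:
        return "parameter-store-standard"
    if size <= ADVANCED_PARAMETER_MAX_BYTES:
        return "parameter-store-advanced"
    # Anything larger falls back to a KMS-encrypted S3 object.
    return "s3-kms"
```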

Mike Schueler

11:19:12 PM

hello. migrating from my self hosted k8s cluster to EKS. considering switching my ingress controller. I guess the popular approach is to use a hybrid ALB + nginx setup, best of both worlds.

got 2 questions

it seems there are two different nginx ingress.

https://github.com/kubernetes/ingress-nginx/

https://github.com/nginxinc/kubernetes-ingress

the one released by nginx team looks interesting, wondering if anyone has experience with it? not sure if it’s even worth considering if i’m not getting the paid version

while the hybrid approach seems like the way to go, i haven’t been able to find steps online to setup. i guess the AWS way is to just use the ALB ingress controller. anyone point me in the right direction? or convince me i should just use the ALB ingress?

kubernetes/ingress-nginx

NGINX Ingress Controller for Kubernetes. Contribute to kubernetes/ingress-nginx development by creating an account on GitHub.

nginxinc/kubernetes-ingress

NGINX and NGINX Plus Ingress Controllers for Kubernetes - nginxinc/kubernetes-ingress

2020-06-08

Matt

11:51:33 AM

I have an existing Beanstalk application which is running as single container applications. I just enabled CloudWatch logging and now all log data is streaming to CloudWatch (nginx, docker, etc) except for the log messages written to stderr/stdout. In other words the log messages I care most about.

I can see exactly what's wrong, the config file for the Cloudwatch agent (/opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml) is configured to track a stderr/stdout log file at this path /var/log/eb-docker/containers/eb-current-app/stdouterr.log. However, my applications are logging to this path /var/log/eb-docker/containers/eb-current-app/eb-f54b7e030fd5-stdouterr.log

The CloudWatch log group uses the path with 'eb-current-app/stdouterr.log' in it (no surprise there).
The f54b7e030fd5 is the (short) ID of the running Docker container.

Does anyone know how I can configure this? I'd like to do it in a way that will of course survive beyond containers and instances as they spin up and down. I can define additional log files in .ebextensions which allows Beanstalk to become aware of non-standard log files but that would require me knowing the container ID ahead of time i.e. when bundling the Beanstalk application version. 
It seems like the best fix would be to force Beanstalk to log stderr/stdout from the container to the path which CloudWatch expects.

Joe Niland

09:57:47 PM

The file attribute in the config file supports wildcards I’m pretty sure. Have you tried that?

Matt

10:10:16 PM

yes, didn’t seem to work for me

Joe Niland

10:17:07 PM

Do you get any errors in eb-activity.log?

Matt

10:23:00 PM

Matt

10:23:04 PM

the service runs fine

Matt

10:23:16 PM

The logs just aren’t being streamed to CloudWatch

Joe Niland

10:24:21 PM

Yeah, I do remember when this happens awslogs agent should notify if the file match pattern doesn’t match. Perhaps it was in its own log file.

Joe Niland

10:24:38 PM

It’s been a while since I looked at this

Matt

12:52:43 PM

Turns out it’s a bug, AWS is working on it.

Joe Niland

10:44:42 PM

Wow, ok. Thanks for updating. Is it a bug with the agent or in the EB processes?

Matt

10:53:33 PM

I think EB

Matt

10:53:53 PM

the way it configures the Docker env

Joe Niland

10:54:54 PM

Ok I see.

Matt

11:51:56 AM

Fun with Beanstalk. . .

2020-06-09

12:39:12 PM

did anyone use CW>>lambda>>firehose>>splunk integration? We have an issue where the logs have a delay around 3m. Is there a better solution for app logs from AWS to splunk without any delay. I’ve used kinesis streams in the past but seems like the driver isnt supported and if the log data is huge it seems to be dropping logs

Mike Schueler

04:51:57 PM

we are using, but w/o firehose. if you’re just shipping logs to HEC, wondering the need for it

02:54:05 AM

How are you sending the logs to directly to hec from cloud watch?

Mike Schueler

02:54:55 AM

lambda

Mike Schueler

02:55:49 AM

just don’t see what you’re using firehose for

01:39:05 PM

so if the logs arent being processed then they can send it to s3 in firehose

01:39:13 PM

can we do that with lambda?

Matt Gowie

10:15:10 PM

Any Storage Gateway File Gateway users? Wondering if I can pick somebody’s brain… My file gateway keeps getting created with a private IP as if I’m trying to associate it to be an internal / VPC gateway which is not what I want.

Matt Gowie

11:05:32 PM

Figured this out. AWS is a PITA… They show / provide the private IP of the Storage Gateway instance in the console and corresponding file share mounting commands. That led me to believe AWS was only exposing the File Gateway privately inside my VPC, but that’s not the case. Using the Elastic IP that is associated with the storage gateway instance did the trick.

01:32:42 AM

In the coming few months, AWS Fargate will update the LATEST flag to Platform Version (PV) 1.4.0. This means all new Amazon Elastic Container Service (ECS) Tasks or ECS Services that use the Fargate launch type and have the platformVersion field in their Task Definition set to LATEST will automatically resolve to PV 1.4.0. For customers who use Amazon VPC Endpoints along with their ECS tasks running on Fargate, the new platform version has changes that may require customer action. For more information see the FAQs below. If you do not use VPC endpoints for Amazon ECR, AWS Secrets Manager or AWS Systems Manager no action is necessary.

2020-06-10

jose.amengual

05:56:59 PM

Hi, I have an interesting problem. I’m enabling IAM auth for RDS and I have users with U2F keys, which the aws cli does not support, so they can’t get a token for the RDS host. BUT they do have console access, so I was wondering if there is a way to run the command to get the token through the console so they can then use it to connect to the host. aws rds generate-db-auth-token is the command. I was thinking that maybe an SSM doc, or a container they can fire up to get the token, or something like that. I do not want to use a Bastion host for this.

Jason Huling

11:18:57 PM

Not an exact answer to your question… but I’m in a similar situation using a YubiKey for programmatic/cli access, and I do use it for IAM auth with RDS.

Specifically I’m using aws-vault and the YubiKey with OATH-TOTP support. More details on setting it up here: https://github.com/99designs/aws-vault/blob/master/USAGE.md#using-a-yubikey

You can do the same setup without using aws-vault as well.

For console access, I use Alfred on Mac to run the ykman cli command to generate the token and paste it in the MFA prompt.

99designs/aws-vault

A vault for securely storing and accessing AWS credentials in development environments - 99designs/aws-vault

Alfred - Productivity App for macOS

Alfred is a productivity application for macOS, which boosts your efficiency with hotkeys, keywords and text expansion. Search your Mac and the web, and control your Mac using custom actions with the Powerpack.

Jason Huling

11:20:37 PM

So essentially, the easy way is to use the YubiKey with OATH-TOTP and not U2F.

jose.amengual

11:26:27 PM

that is interesting

jose.amengual

11:26:38 PM

we use aws-vault all the time

jose.amengual

11:27:05 PM

this specific user is on windows so I hope this works for him

Jason Huling

12:30:06 AM

It looks like the ykman cli is installed along with the GUI of YubiKey Manager for Windows (details), that should be all that is needed. Good luck!

jose.amengual

02:24:19 AM

I will report back

2020-06-11

07:00:56 PM

does anyone know if we can heapdump or thread dumps from fargate?

Steven

07:29:42 PM

As long as you can remotely trigger it and output to STDOUT, it would be possible. But haven’t done myself. We use APM and haven’t needed to dig deeper yet

Robert Horrox

12:53:16 AM

what are peoples thoughts on using https://eksctl.io/ vs terraform modules to setup an EKS cluster? Pros/cons

eksctl

The official CLI for Amazon EKS

Chris Fowles

12:54:02 AM

if everything else in your environment is terraform - use terraform.

Chris Fowles

12:54:22 AM

if not then the tool seems solid

Robert Horrox

12:55:07 AM

It’s a new setup, using variant2 for the orchestration

Chris Fowles

12:58:16 AM

that seems brave. be interested to see how you go with that

Robert Horrox

12:59:31 AM

Brave or something else, a fine line. I will report my experiences

Chris Fowles

01:00:34 AM

hahaha good luck

Robert Horrox

01:04:43 AM

Raymond Liu

03:31:32 AM

It’s great, don’t use cloudformation, it sucks..

Robert Horrox

12:49:37 PM

eksctl is great?

Raymond Liu

01:56:58 PM

Yes, if you are a newcomer to AWS, it’s very easy to use eksctl to create AWS networking components such as VPC and subnets.

vFondevilla

11:42:24 AM

Maybe I’m a bit late to the party, but eksctl is great. We’re using it for our EKS and I love it.

2020-06-12

Maciek Strömich

10:08:03 AM

theoretically, would you replace web -> sqs -> worker -> firehose with a simple web -> firehose approach? I’m load testing the app currently with double the traffic I have normally and altho everything seems stable I’m having second thoughts about it because of increased latency (not much just 100-150ms on avg).

Chris Fowles

10:14:37 AM

firehose is designed for handling large amounts of data ingest - so if you’re putting things in front of it that’s probably defeating the purpose somewhat

Chris Fowles

10:14:42 AM

where’s it going after firehose?

Maciek Strömich

10:19:22 AM

Maciek Strömich

10:20:02 AM

previously the sqs -> worker -> firehose/elasticsearch made sense because of how the elasticsearch indexes were created

Maciek Strömich

10:21:03 AM

and firehose wasn’t able to deliver messages the way we needed into the index but since we have dropped elasticsearch I started to question the architecture

Chris Fowles

10:21:35 AM

i’d say you’re probably running more than you need to, and bottle-necking firehose. this is without knowing your specific use case though

Maciek Strömich

10:22:37 AM

also spending several k more per month for maintaining worker instances

Chris Fowles

10:28:57 AM

yup

Bircan Bilici

11:13:35 AM

AWS, Create IAM Role API Collapsed https://stackoverflow.com/questions/62341775/aws-create-role-rate-exceeded?answertab=votes#tab-top

AWS Create Role Rate exceeded

While creating a new IAM role I am getting Rate exceeded I have around 215 roles for my AWS account. Is that a limit, if it is how can i increase it? if not a limit how I can resolve it?
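"Rate exceeded" usually indicates API throttling rather than a limit on the number of roles, and the usual mitigation is to retry with exponential backoff. A minimal sketch, assuming a hypothetical ThrottledError that stands in for whatever exception the client raises:

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a 'Rate exceeded' API error."""


def call_with_backoff(func, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call `func`, retrying on ThrottledError with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return func()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error to the caller
            # Full jitter: wait a random time up to base_delay * 2**attempt.
            sleep(random.uniform(0, base_delay * (2 ** attempt)))
```

The sleep parameter exists only so the delay can be replaced in tests; if the failures come from a service incident rather than throttling, retries will not help.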

randomy

01:46:03 PM

How do people use and switch between different AWS accounts (hundreds) in the AWS console? We currently have a CLI command that opens the browser for the specified AWS profile but it’s not great.

Tyrone Meijn

02:17:14 PM

We use AWS extend switch roles Chrome extension

niek

02:17:28 PM

we too

niek

02:17:34 PM

and a config file for the cli

Tyrone Meijn

02:18:02 PM

I do not however manage 100 accounts, so don’t know what the experience is at that level.

Santiago Campuzano

02:27:54 PM

We use assume-role at the CLI

Santiago Campuzano

02:28:15 PM

And same Chrome extension as @Tyrone Meijn

Joe Niland

10:48:57 PM

Using aws-vault exec or login depending on the need

Mike Schueler

11:46:11 PM

SSO w/ all the accounts under an org master

Jonathan Parker

04:11:06 AM

At a previous role I used different Chrome profiles for the different IAM authentication accounts (2) and then different AWS Extend Role Switcher Config for the two profiles. You could also generate the AWS Extend Role Switcher config using a NimbleText template or some other templating tool so that you can switch the config between different groups of accounts.

niek

02:17:12 PM

Does anyone have advice on what to use for encrypting environment variables for AWS Lambda? Any opinion about using SSM with KMS or AWS Secrets Manager?

msharma24

07:48:57 AM

SSM secured string uses the aws managed KMS.

aaratn

08:26:29 AM

There is an option to encrypt environment variable on lambda itself. You can use KMS key to encrypt those lambda variables

2020-06-13

2020-06-15

Maciek Strömich

12:41:07 PM

anyone using aws ecr get-login in a Makefile here? I’ve this annoying issue that whenever I execute any aws cli command that assumes a role and the source_profile is MFA protected I get following error:

aws ecr get-login --no-include-email
Enter MFA code for arn:aws:iam::AWS_ACCOUNT_ID:mfa/USER:  docker login -u AWS -p [...]
make: Enter: No such file or directory
make: *** [ecr_login] Error 1

here’s my makefile target

ecr_login: AWS_REGION ?= us-east-1
ecr_login:
	$(shell docker-compose run --rm -e AWS_PROFILE=$(AWS_PROFILE) $(DEV_IMAGE) aws ecr get-login --region us-east-1 --no-include-email)

as far as I can see it fails because I’m trying execute the output of the command which is returned to STDOUT but the Enter MFA code is also provided via STDOUT and the execution going nuts. 2nd execution succeeds because boto3 stores a session and doesn’t ask for MFA for another hour. What’s also puzzling is that the Enter MFA information is not shown and you need to assume it’s there because terminal ‘hangs’ while expecting an input.

loren

12:51:59 PM

why are you using shell in this particular context? try removing that part

Maciek Strömich

01:33:06 PM

yeah it’s just to execute the docker login command. But your approach was right. I just had to replace the

$(shell command)

with

`command | tr -cd "[:print:]"`

Maciek Strömich

01:34:42 PM

the piped tr is there to cut all non printable chars from the output before execution
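For readers unfamiliar with tr -cd "[:print:]": it deletes every character that is not printable, which also strips carriage returns, newlines, and escape bytes. A rough Python equivalent, assuming the C locale where the printable class is ASCII space through tilde:

```python
def keep_printable(text):
    """Drop every character outside printable ASCII (space through '~')."""
    return "".join(ch for ch in text if " " <= ch <= "~")
```

Only the control bytes are removed; any visible remainder of an escape sequence (such as "[0m") stays in the output.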

loren

01:35:33 PM

was a shot in the dark, but glad it helped!

Maciek Strömich

01:36:34 PM

sometimes the simplest solutions are hiding from us because of “experience”

Eric Berg

11:28:39 PM

I’m working on tightening up our NACLs and was trying to find out whether NACLs apply to load balancer ingress. I have the same question regarding the new Transfer Servers VPC; you can configure a security group, but it’s not actually in a subnet, so NACLs shouldn’t apply. Am I off-track here? Thanks, as always.

maarten

05:38:50 AM

What @Chris Fowles said, NACL’s provide a good way of blocking IP addresses VPC wide. Security Groups are the instrument to do the actual firewalling with.

Eric Berg

12:40:09 PM

Thanks for the input, guys, but my question was whether NACLS actually even apply to ingress on load balancers.

I have heard and like the idea of keeping NACLs to a minimum.

Chris Fowles

11:33:44 PM

Most of the time NACLs are not a great pattern to follow.

Chris Fowles

11:35:37 PM

Trying to tighten nacls beyond “this subnet should only receive traffic from this other subnet” is generally going to push down more on the reduced functionality side of the functionality vs security balance.

Chris Fowles

11:37:18 PM

(this is as a warning from someone with burnt fingers - not a criticism)

2020-06-16

sahil kamboj

01:34:07 PM

hey guys, need help regarding an Application Load Balancer. I have set up AWS ACM and attached the certificate to the ALB listener on port 443, which is redirected to a target group on port 443, plus a simple listener for port 80 redirected to a target group on port 80 (the instances in both target groups are the same, just on different ports, 80 and 443). The instances are serving the webapp on port 80. I am getting a 502 with https, while http works well (but shows no SSL).

what i am doing wrong?

Zach

01:39:00 PM

“and a simple listener for port 80 redirected to target group on port 80 target group(instances are same but on different port 80 and 443) instances are serving webapp on port 80”

This seems mixed up, not sure if you just typed it incorrectly?

Zach

01:40:08 PM

What you probably meant/want is:
1 ALB rule that redirects port 80 to 443 on the ALB
1 ALB rule that forwards 443 to your target group (either on 80 or 443 at that point, whatever the group is configured to listen on)

sahil kamboj

02:38:59 PM

should i forward alb 443 listener to 80

sahil kamboj

02:42:35 PM

now i have a simple config: listener 443 to target ec2 port 443, listener 80 to target ec2 port 80

Zach

02:53:22 PM

You want both 443 and 80 to reach the instance?

Zach

02:53:29 PM

if so, yes thats the config

Zach

02:53:53 PM

most setups though you redirect port 80 on the ALB to 443 so that it establishes the TLS
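
A minimal sketch of the two-listener layout described above, written as the action payloads that boto3's elbv2 create_listener accepts in DefaultActions. The function names and the target group ARN are placeholders made up for illustration; nothing here calls AWS.

```python
# Sketch only: builds listener actions as plain dicts. The ARN passed in
# is a placeholder, and listener_plan is a hypothetical helper name.

def redirect_to_https_action():
    """Action for the port 80 listener: send clients to HTTPS on 443."""
    return {
        "Type": "redirect",
        "RedirectConfig": {
            "Protocol": "HTTPS",
            "Port": "443",
            "StatusCode": "HTTP_301",
        },
    }


def forward_action(target_group_arn):
    """Action for the port 443 listener: forward to the target group."""
    return {"Type": "forward", "TargetGroupArn": target_group_arn}


def listener_plan(target_group_arn):
    """Return (port, protocol, default actions) for both listeners."""
    return [
        (80, "HTTP", [redirect_to_https_action()]),
        (443, "HTTPS", [forward_action(target_group_arn)]),
    ]
```

If the instances only serve plain HTTP on port 80, as in the question above, the 443 listener would typically forward to the port 80 target group; forwarding to a port the instance does not serve is a common cause of 502 responses.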

sahil kamboj

03:03:04 PM

@Zach Thnx man its working now

sahil kamboj

02:40:51 PM

should i forward alb 443 listener to 80 listener?

Zach

12:04:44 AM

omg https://aws.amazon.com/about-aws/whats-new/2020/06/amazon-ec2-auto-scaling-now-supports-instance-refresh-within-auto-scaling-groups/

Chris Fowles

12:05:57 AM

omg!

Chris Fowles

12:06:13 AM

i was writing my own ASG roller

Chris Fowles

12:06:19 AM

now i can do something else

Zach

12:09:26 AM

Question now is how long will it take the AWS provider to support it in terraform

loren

12:26:22 AM

I’m kinda curious what it would even look like in terraform… You still have to change the launch config… So, would it be a non-replacing change to that property of the asg? Maybe send the StartInstanceRefresh command instead of force replacing the asg?

Zach

12:54:53 AM

Yah might be odd. Maybe this is just deployment command rather than something in the state.

Zach

03:15:21 PM

https://github.com/terraform-providers/terraform-provider-aws/issues/13785

loren

12:26:53 AM

Here’s a better article describing how the feature works, strangely not linked from the what’s new post… https://aws.amazon.com/blogs/compute/introducing-instance-refresh-for-ec2-auto-scaling/

Introducing Instance Refresh for EC2 Auto Scaling | Amazon Web Services

This post is contributed to by: Ran Sheinberg – Principal EC2 Spot SA, and Isaac Vallhonrat – Sr. EC2 Spot Specialist SA Today, we are launching Instance Refresh. This is a new feature in EC2 Auto Scaling that enables automatic deployments of instances in Auto Scaling Groups (ASGs), in order to release new application versions or make […]

Matt Gowie

12:46:50 AM

SWEET. Stoked this is around. “Instance Refresh” seems like a pretty poor name for this though…

Jonathan Le

04:41:54 PM

I kinda like the name. I can’t think of a better one off the top of my head.

Matt Gowie

05:52:27 PM

I think “Instance Rollout” would be my name. Not saying that is much better, but “instance refresh” makes me think of updating static instances. Not that it is swapping out old instances with new ones.

Jonathan Le

06:43:58 PM

Ahh, I guess I see your point. I bet Product Marketing said Refresh sounds better, so they went with that.

Matt Gowie

06:53:38 PM

Haha yeah likely.

Igor

02:14:22 AM

Couldn’t you just scale up with the new launch template and then scale down? That’s how we refreshed instances.

Adrian

06:50:53 AM

Yes, if you have one cluster to manage

Zach

12:35:33 PM

Yes that’s what people tended to do, or just replace the ASG entirely. The downside of those methods is that they take a while to execute, and you double (or more) your number of running instances during the swap - which could run into account limits or temporary AWS capacity shortages.
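
For comparison with the scale-up/scale-down approach, here is a hedged sketch of what triggering an instance refresh could look like from Python. It only builds the keyword arguments for boto3's autoscaling start_instance_refresh call; the helper name and the default values (90% healthy, 300 second warmup) are assumptions chosen for illustration, not recommendations from this thread.

```python
# Sketch only: instance_refresh_request is a hypothetical helper that
# builds arguments for autoscaling.start_instance_refresh.

def instance_refresh_request(asg_name, min_healthy_percentage=90, instance_warmup=300):
    """Build a rolling instance refresh request for one Auto Scaling group."""
    if not 0 <= min_healthy_percentage <= 100:
        raise ValueError("min_healthy_percentage must be between 0 and 100")
    return {
        "AutoScalingGroupName": asg_name,
        "Strategy": "Rolling",
        "Preferences": {
            # Share of the group that must stay healthy while instances
            # are replaced, so capacity does not have to double.
            "MinHealthyPercentage": min_healthy_percentage,
            # Seconds to wait before a new instance counts as ready.
            "InstanceWarmup": instance_warmup,
        },
    }
```

With credentials configured, the result could be passed as boto3.client("autoscaling").start_instance_refresh(**request) after the group's launch configuration or template has been updated.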

2020-06-17

11:33:28 AM

has anyone tried this with ECS EC2 instances to see if those will be adequately drained ?

Zach

02:46:24 PM

Is there a good pattern for managing DataPipelines for EMR between dev/staging/prod in some sort of sane manner? Its simultaneously ‘infrastructure’ and software all rolled up in one. We tried doing it all in terraform but its a mess because the Data team has no idea how to maintain that, and its got like 100 more options than they need … and then we’re stuck trying to use terraform as a config mgmt tool

Jonathan Le

04:48:40 PM

I’ve run into this problem a few times. It’s always turned into “Data Engineering Team owns EMR”. We might try to bootstrap something for Security and watch IAM and SGs and the normal stuff, but I haven’t really found a good way to bring EMR clusters into a Central DevOps/SRE team.

It ended up easier for that team to just hire their own DevOps/Infra person and manage stuff in that space, more loosely coupled from the central team.

Zach

04:50:09 PM

Yah that is what I’m trending toward. They were comfortable writing some Ansible/boto stuff to create the EMR, but we took it off their hands because we didn’t want to give them all these resource creation permissions for the SGs and IAM. That then turned into a constant “hey can you guys update the json in terraform with X”. But maybe I just let them do that and turn it into an automation job they can run

Jonathan Le

04:58:01 PM

Hmmm…maybe do a Jenkins job with a bunch of build parameters to launch the cluster and be okay as it morphs away from code

Jonathan Le

04:58:51 PM

I dunno. In my older years, I really believe in decentralized DevOps squads now, esp. in my current role. It’s super expensive to have teams with their own SRE though. Only big successful orgs can do it.

Zach

05:00:49 PM

yah we’re a small company, I don’t have time to micro manage this stuff

Jonathan Le

05:01:51 PM

the constant json updates reminds me of this: https://info.acloud.guru/resources/why-central-cloud-teams-fail-and-how-to-save-yours

Why “central cloud teams” fail (and how to save yours)

After years of watching central cloud teams struggle, I found a surprising metric that predicts whether your cloud transformation will succeed.

Zach

05:02:21 PM

We initially tried it as this HUGE parameterized terraform module, with a Rundeck job for it, where they could specify overrides and stuff, but it was just too much for them to figure out how certain settings mapped into what they saw in the execution. For whatever reason they understand the boto representation of the Pipeline better than how it actually ends up rendered in the AWS side

Zach

05:51:47 PM

Yah I dunno. I’m happy to remove it from my terraform repo and just tell them to make some python to create their pipeline and we’ll run it from a job

Jonathan Le

07:04:33 PM

that might be a fair idea. create a repo they can PR to, devops approves the merge and the jenkins job fires off the apply

Jonathan Le

07:05:37 PM

if they get python templating and don’t get TF, it’ll make them more productive, you still get security review with the PR review and jenkins maintains IAM applies/SG permissions

Zach

07:06:06 PM

Right, mostly just need to see that they didn’t stick in “super_secret_admin_role”

Igor

03:12:04 PM

Woah. I just noticed https://aws.amazon.com/about-aws/whats-new/2020/06/introducing-aws-codeartifact-a-fully-managed-software-artifact-repository-service/ Per usage pricing

07:08:23 PM

So we have to maintain a list of iam roles that access our s3 bucket within kms or those roles will have s3 access but not decryption access

aws iam list-entities-for-policy --policy-arn arn:aws:iam::snip:policy/s3-access --query 'PolicyRoles[].RoleName' | egrep -v '^\]|^\[' | cut -d'"' -f2 | sort | uniq

we have to manually update the kms policy based on this list. is there a way of automating this to keep our kms policy in line with any iam role containing the policy ?

07:08:56 PM

there doesn’t seem to be a terraform list-entities-for-policy data source to tie into this
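
One way this could be automated is a small script that reads the role names (for example from boto3's iam.list_entities_for_policy with EntityFilter="Role") and rewrites the principal list of one statement in the KMS key policy before writing it back with kms.put_key_policy. The sketch below covers only the pure policy-editing step. The statement Sid, the account ID, and the function names are made up for illustration, and it assumes the roles have no IAM path, since list-entities-for-policy returns names rather than full ARNs.

```python
import copy

# Sketch only: role_arns and sync_kms_statement are hypothetical helpers.
# Assumes roles live at the default path "/" so the ARN can be rebuilt
# from the account ID and role name alone.

def role_arns(account_id, role_names):
    """Turn role names into a sorted, de-duplicated list of role ARNs."""
    return sorted({f"arn:aws:iam::{account_id}:role/{name}" for name in role_names})


def sync_kms_statement(policy, sid, account_id, role_names):
    """Return a copy of the key policy whose statement `sid` lists exactly these roles."""
    updated = copy.deepcopy(policy)
    for statement in updated["Statement"]:
        if statement.get("Sid") == sid:
            statement["Principal"] = {"AWS": role_arns(account_id, role_names)}
            return updated
    raise KeyError(f"no statement with Sid {sid!r} in key policy")
```

Run on a schedule or from CI, this keeps the key policy in step with whichever roles currently carry the s3-access policy; comparing the old and new documents before writing avoids needless updates.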

Harry

11:42:03 PM

Does anyone here know how to add instances to Systems Manager? I’ve got a few showing up somehow but the majority of my instances aren’t in the list and I’m not sure if it’s something I did in the web interface that added them or something I changed on the box somehow.

Matt Gowie

01:44:20 AM

It’s the SSM-Agent running on those instances. It comes with Amazon Linux 2 AMIs.

Matt Gowie

01:46:05 AM

You can install it yourself on other linux distros if that is what you’re looking to do:

sudo yum install -y https://s3.amazonaws.com/ec2-downloads-windows/SSMAgent/latest/linux_amd64/amazon-ssm-agent.rpm
sudo systemctl enable amazon-ssm-agent
sudo systemctl start amazon-ssm-agent

Harry

01:53:03 AM

Ok, so just having the instance on the box is enough?

Matt Gowie

01:55:46 AM

You mean the agent? Yeah. You may also need an EC2 Instance Profile with correct permissions… I forget actually

Here are the docs: https://docs.aws.amazon.com/systems-manager/latest/userguide/ssm-agent.html#sysman-install-ssm-agent

Working with SSM Agent - AWS Systems Manager

Install SSM Agent on EC2 instances, and on-premises servers and virtual machines (VMs), to enable AWS Systems Manager to update, manage, and configure these resources.

Matt Gowie

01:58:42 AM

Pretty sure I’m wrong about the Instance Profile — You don’t need that. I have a module written around ssm-agents, so I should remember these things haha.

Harry

01:59:18 AM

Ah, I think I’ve cracked it - some of my instances didn’t have the correct IAM policy.

Matt Gowie

02:16:46 AM

Ah what was the IAM Policy they needed? I was looking back at my module and thinking they didn’t need anything…

Harry

02:55:04 AM

I’m using it with cloudwatch so I needed both of these

"arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore",   "arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy",

Matt Gowie

03:06:11 AM

Aha yes — AmazonSSMManagedInstanceCore

randomy

12:46:47 PM

Beware the AmazonSSMManagedInstanceCore policy, it gives access to all SSM parameters.

Harry

01:05:47 PM

Is that not what I need to use SSM with those instances?

Matt Gowie

02:29:16 PM

@Harry It is. @randomy is just saying it’s pretty permissive which can be a bad thing. If that sounds scary to you (probably if you’re in a large org) then you can look at the underlying policy and pull it apart. I’d only do that if you’re worried about security / POLP / etc within your org / team.

Harry

02:32:27 PM

Ah ok, good to be aware of but I’m definitely at the scale where I’m not going to worry too much about that. I’m the only one doing any ops work here.

randomy

02:33:39 PM

yep, that’s what i meant, thanks. a colleague of mine found that this policy worked: https://gitlab.com/claranet-pcp/terraform/aws/tf-aws-iam-instance-profile/-/blob/master/ssm.tf#L42-68

plus this if you want to use session manager: https://gitlab.com/claranet-pcp/terraform/aws/tf-aws-iam-instance-profile/-/blob/master/ssm.tf#L109-124

ssm.tf · master · Claranet PCP / Terraform Modules / AWS / tf-aws-iam-instance-profile

GitLab.com

loren

06:34:36 PM

holy smokes, why on earth does ManagedInstanceCore have GetParameter[s] ? argh… sometimes i hate the aws managed policies so so much

2020-06-18

06:29:37 PM

Regarding fluent-bit pushing logs from ECS to Datadog, we noticed there is a log key being used for every log entry. Is there a way to rename this to datadog’s expected msg key ?

06:29:56 PM

I’m hoping I won’t have to create my own fluent container

06:30:41 PM

Id like to use an environment variable if possible

06:30:49 PM

Worst case scenario, I have to use a custom fluentbit configuration which I should be able to store in S3 and retrieve easily upon starting a new task

06:38:51 PM

Ref: https://github.com/aws/aws-for-fluent-bit/issues/45

How do I replace the "log" key with something else? · Issue #45 · aws/aws-for-fluent-bit

I'm using a fluent-bit sidecar in ECS to ship logs from my app to datadog. I want to use the msg key that datadog expected and I've been seeing the log key coming from fluent. Is there a wa…

07:06:42 PM

Are you importing your logs as JSON?

07:06:54 PM

yessir

07:08:21 PM

I believe that is done by using this in the firelens config

"config-file-type": "file", 
"config-file-value": "/fluent-bit/configs/parse-json.conf"

07:09:14 PM

If you have it coming in as JSON, I think you need to remap the values in the log config in

07:10:21 PM

In my case I had to run a simple grok pattern that parsed the log as JSON. When you do that, DD will identify all the properties of the log as metrics you can add as columns in the log explorer

07:10:48 PM

“I believe that is done by using this in the firelens config”

Yep, I was thinking about using an s3 conf file if I cannot use an env var to remap the key

07:11:16 PM

Once they are separate metrics you can remap them in DD

07:11:17 PM

“If you have it coming in as JSON, I think you need to remap the values in the log config in”

Have not considered this. Would this affect all services ? I wonder if it would be better to do it in fluentbit or in datadog itself.

07:11:49 PM

“In my case I had to run a simple grok pattern that parsed the log as JSON. When you do that, DD will identify all the properties of the log as metrics you can add as columns in the log explorer”
“Once they are separate metrics you can remap them in DD”

that’s really good to know. I will investigate further

07:13:59 PM

This is the message mapper ui. Go to Logs > Config then create a pipeline (filtering for your apps logs). All logs the filter catches will run through the pipeline and be remapped

2020-06-19

2020-06-21

sweetops

10:30:49 PM

Anyone here use Client VPN with Transit Gateway? It doesn’t look like Client VPN is supported as a transit gw attachment. So I’m trying to think of creative ways to make it work.

maarten

07:51:03 AM

No experience with Client VPN, and this will most likely kill the way you setup security; I think you can create a NAT instance in a routable transit gw subnet, in which you MASQ the VPN traffic through.

2020-06-22

Pierre Humberdroz

06:59:38 PM

Did someone here already run the a1 instances with eks ? How does it feel to run k8s on arm?

Santiago Campuzano

12:04:03 AM

Hello there ! Does anyone know how to enable S3 access logs to a bunch of S3 buckets (100+) I don’t want to go one by one through the AWS UI ?

Santiago Campuzano

12:04:20 AM

Seems like the aws cli does not have an option for doing that

bradym

12:30:27 AM

I think you’re looking for aws s3api put-bucket-logging

Santiago Campuzano

12:32:36 AM

Thanks @bradym !!!!

bradym

12:37:53 AM

Happy to help.

Santiago Campuzano

12:40:41 AM

Interesting / odd that there is no way for reversing that

Santiago Campuzano

12:41:16 AM

’Cause I want to enable logging for a couple of weeks and then disable all the logging

bradym

12:44:20 AM

You use the same command. From the manpage aws s3api put-bucket-logging help:

To enable logging, you use LoggingEnabled and its children request elements. To disable logging, you use an empty BucketLoggingStatus request
element:
    <BucketLoggingStatus xmlns="http://doc.s3.amazonaws.com/2006-03-01" />

Santiago Campuzano

12:45:17 AM

Hmmm interesting

Santiago Campuzano

12:45:21 AM

Thanks again !
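
For the 100+ bucket case, the same call can be looped from a script. A minimal sketch, assuming boto3's s3.put_bucket_logging is used to apply each request: the helper below only builds the arguments, and its name, the log bucket name, and the prefix layout are illustrative assumptions. Passing no target bucket produces the empty BucketLoggingStatus that the manpage describes for disabling logging.

```python
# Sketch only: bucket_logging_requests is a hypothetical helper that
# builds keyword arguments for s3.put_bucket_logging.

def bucket_logging_requests(buckets, target_bucket=None, prefix_root="access-logs/"):
    """Build one request per bucket; target_bucket=None disables logging."""
    requests = []
    for bucket in buckets:
        if target_bucket is None:
            # An empty status turns access logging off for the bucket.
            status = {}
        else:
            status = {
                "LoggingEnabled": {
                    "TargetBucket": target_bucket,
                    # One prefix per source bucket keeps the logs separable.
                    "TargetPrefix": f"{prefix_root}{bucket}/",
                }
            }
        requests.append({"Bucket": bucket, "BucketLoggingStatus": status})
    return requests
```

Each dict could then be applied with s3.put_bucket_logging(**kwargs), once to enable and again a couple of weeks later with target_bucket=None to disable. The target bucket still needs to allow S3 log delivery to write to it.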

Haroon Rasheed

05:31:34 AM

Hi folks - I have general question on AWS networking. Is there any limits on how many packets per seconds allowed for a AWS ENI?

maarten

07:39:52 AM

No implied limits other than physical limits of the networking infrastructure.

Haroon Rasheed

08:22:38 AM

Actually I tried to pump more than 150K packets per second(pps) with 128Bytes frame size to the instance type c5a.4xlarge(supported up to 10Gbps speed) . I got only around 100K pps inside the aws instance.

2020-06-23

Karoline Pauls

12:42:26 PM

Does anyone have an idea why SSL connections to Influx Cloud would get stuck establishing much more if done from EKS pods running on nodes in private subnets, with aws-vpc-cni, as opposed to EC2 instances with public IP addresses?

Public EC2 instances have default sysctl net.ipv4.tcp_keepalive_* settings.

05:59:03 PM

anyone have thoughts on cost cutting of fargate services ?

05:59:16 PM

we’re using a ddagent and fluentbit sidecar for each of our apps

05:59:42 PM

we’re constrained by the task definition mapping of fargate services, such as 1024 cpu has to use minimum of 2048 mem
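
To make the constraint concrete, here is a small sketch that picks the smallest Fargate task size covering a given need. The table reflects the CPU/memory pairs documented for Fargate around the time of this discussion and should be checked against current documentation; the function name is made up for illustration.

```python
# Sketch only: valid Fargate task sizes (CPU units -> memory in MiB) as
# documented around mid-2020. Verify against current AWS docs before use.
FARGATE_MEMORY_MIB = {
    256: [512, 1024, 2048],
    512: list(range(1024, 4096 + 1, 1024)),
    1024: list(range(2048, 8192 + 1, 1024)),
    2048: list(range(4096, 16384 + 1, 1024)),
    4096: list(range(8192, 30720 + 1, 1024)),
}


def smallest_fargate_size(cpu_needed, memory_needed):
    """Return the smallest (cpu, memory) pair that covers the requested amounts."""
    for cpu in sorted(FARGATE_MEMORY_MIB):
        if cpu < cpu_needed:
            continue
        for memory in FARGATE_MEMORY_MIB[cpu]:
            if memory >= memory_needed:
                return cpu, memory
    raise ValueError("no Fargate task size covers the requested cpu/memory")
```

Summing the sidecar requirements (ddagent, fluentbit) with the app's and feeding the totals in shows quickly when a task is being pushed into the next size tier by memory alone.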

jose.amengual

06:00:16 PM

I will say the stupid answer you are not hoping for, but use ECS+EC2

06:00:35 PM

ah crap. ya thats not what im looking for as we’re migrating off of that haha

06:00:55 PM

we’re simultaneously “rightsizing” these tasks to reduce costs

jose.amengual

06:00:55 PM

we calculated that fargate when it gets close to the limit of cpu unit and memory it is 3 times more expensive than ecs+ec2

06:01:09 PM

oh wow

jose.amengual

06:01:37 PM

we run docker with 300 GB of memory

06:01:39 PM

we’re in the middle of cost comparing now and willing to pay extra to forego managing our own cluster instances

Zach

06:01:43 PM

Even with the recent-ish price changes?

jose.amengual

06:02:00 PM

maybe now is about 2x

Zach

06:02:45 PM

I keep looking at it because I’d rather not have to manage ECS but my VP would slam the door if I told him it was 2x the cost of our EC2 instances

jose.amengual

06:03:41 PM

do your math, I could be wrong too, I did this last year

06:03:41 PM

as a test with real data, you could spin up your most expensive ECS service in both fargate and ec2 and have 50% of production traffic hit 1 and hit 2

06:03:54 PM

then use cost explorer to see which costs more (provided each have different tagging)

Zach

06:06:18 PM

Back in Jan 2019 when they did the last price cut AWS blogged this, which is why I keep looking at it now
If your application is currently running on large EC2 instances that peak at 10-20% CPU utilization, consider migrating to containers in AWS Fargate.

jose.amengual

06:11:05 PM

sure, if it uses less than 16GB of ram and whatever is the limit of cpu units

jose.amengual

06:11:25 PM

if you are using ebs or efs need to watch that out

jose.amengual

06:11:32 PM

they just rolled out EFS support

jose.amengual

06:11:53 PM

but I do not know about EBS, I do not think is possible

Zach

06:13:01 PM

Oh yah, in my case I’m using a lot of t3 nano/micro/small instances for our EC2 apps

Zach

06:14:21 PM

its my goal to get the company into containers but its slow and the price of the managed services makes my bosses unhappy

Steven

06:14:35 PM

Prices have gone down a couple times in last year. But new options have been added as well. Savings plans discounts apply to fargate and spot for fargate is available. We just cut $50K by moving services from fargate to fargate spot

jose.amengual

06:15:41 PM

interesting

06:20:36 PM

how does one enable fargate spot with terraform ?

06:22:00 PM

ah i see it https://github.com/terraform-providers/terraform-provider-aws/issues/11134

Zach

06:22:45 PM

yah kind of a secret squirrel definition … their docs don’t mention what the potential names for the capacity provider are

06:22:49 PM

aws_ecs_cluster has a default_capacity_provider_strategy and aws_ecs_service has a capacity_provider_strategy that can set the capacity_provider to FARGATE_SPOT, cutting costs further.

06:23:22 PM

“yah kind of a secret squirrel definition … their docs don’t mention what the potential names for the capacity provider are”

yep. looks a bit confusing from the docs but that issue makes it look pretty easy

Steven

06:31:12 PM

We had to create new ECS clusters to get the fargate spot capacity provider defined. Don’t know if they fixed that. But easy to setup services to use it

07:09:52 PM

have you all noticed any issues betw switching from fargate back to ec2 ?

07:10:07 PM

we were thinking perhaps it would be as easy as changing the launch type from fargate back to ec2

07:10:46 PM

this was thought of as a failsafe if the few services that we’re POCing have a significant cost over the EC2 launch type.

Steven

07:23:17 PM

There are a few changes needed. How many depend on your config on the EC2 side. But it is all straight forward and consistent. Assuming EC2 side is consistent

07:24:07 PM

one thing that comes up is that before going to fargate, our ec2 containers would log to cloudwatch whereas our fargate containers use the ddagent and fluentbit containers to send logs to datadog so we’d have to remove those too

Steven

07:28:58 PM

You can have fargate log to cloudwatch or use fluentbit sidecar. Again easy to change in task def. If you’re doing ddagent as a sidecar, that will also be easy. If you have it in your app image, this just got outside of terraform’s complete control

Chris O.

09:36:33 PM

ooh, does anyone have a pointer to a KISS implementation of a elasticsearch/kibana logging sidecar?

Joe Niland

12:44:35 AM

@RB if you want a code sample for enabling Fargate Spot I can provide

01:22:01 AM

Yes please! @Joe Niland

Joe Niland

03:28:59 AM

@RB

Cluster example:

resource "aws_ecs_cluster" "fargate_cluster" {
  count = var.enabled ? 1 : 0
  name  = var.cluster_name

  capacity_providers = ["FARGATE_SPOT", "FARGATE"]

  default_capacity_provider_strategy {
    base              = 0
    capacity_provider = var.default_capacity_provider
    weight            = 1
  }

  setting {
    name  = "containerInsights"
    value = var.container_insights_enabled ? "enabled" : "disabled"
  }

  lifecycle {
    create_before_destroy = true
  }
}

cluster .tfvars:

default_capacity_provider   = "FARGATE_SPOT"
container_insights_enabled  = false

ECS Service example (from https://github.com/cloudposse/terraform-aws-ecs-alb-service-task/blob/master/main.tf#L253)

# this is within resource "aws_ecs_service" "a_service" {

dynamic "capacity_provider_strategy" {
    for_each = var.capacity_provider_strategies
    content {
      capacity_provider = capacity_provider_strategy.value.capacity_provider
      weight            = capacity_provider_strategy.value.weight
      base              = lookup(capacity_provider_strategy.value, "base", null)
    }
  }

Service .tfvars:

capacity_provider_strategies = [
    {
      capacity_provider = "FARGATE_SPOT",
      weight            = 1,
      base              = 1
    }
  ]

cloudposse/terraform-aws-ecs-alb-service-task

Terraform module which implements an ECS service which exposes a web service via ALB. - cloudposse/terraform-aws-ecs-alb-service-task

Joe Niland

03:29:52 AM

base is minimum tasks per provider.
weight is a proportion, so you could put 1 for FARGATE and 1 for FARGATE_SPOT and 50% of your tasks will run on each.

Joe Niland

03:29:53 AM

https://docs.aws.amazon.com/AmazonECS/latest/developerguide/cluster-capacity-providers.html

Amazon ECS capacity providers - Amazon Elastic Container Service

Amazon ECS capacity providers enable you to manage the infrastructure the tasks in your clusters use. Each cluster can have one or more capacity providers and an optional default capacity provider strategy. The capacity provider strategy determines how the tasks are spread across the cluster’s capacity providers. When you run a task or create a service, you may either use the cluster’s default capacity provider strategy or specify a capacity provider strategy that overrides the cluster’s default strategy.
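
The base/weight arithmetic described above can be sketched as follows. This is an illustration of the idea only: base tasks are placed first, then the remainder is split in proportion to weight. The rounding used here (largest remainder) is an assumption, and ECS's own placement may round differently. The function name is hypothetical.

```python
# Sketch only: split_tasks illustrates base + weight arithmetic; it does
# not claim to reproduce ECS's exact placement or rounding.

def split_tasks(desired_count, strategy):
    """Estimate how many tasks land on each capacity provider."""
    counts = {item["capacity_provider"]: 0 for item in strategy}
    remaining = desired_count

    # Base tasks are satisfied first.
    for item in strategy:
        take = min(item.get("base", 0), remaining)
        counts[item["capacity_provider"]] += take
        remaining -= take

    # Whatever is left is divided in proportion to weight.
    total_weight = sum(item["weight"] for item in strategy)
    if remaining > 0 and total_weight > 0:
        shares = [
            (item["capacity_provider"], remaining * item["weight"] / total_weight)
            for item in strategy
        ]
        for name, share in shares:
            counts[name] += int(share)
        leftover = remaining - sum(int(share) for _, share in shares)
        by_fraction = sorted(shares, key=lambda pair: pair[1] - int(pair[1]), reverse=True)
        for name, _ in by_fraction[:leftover]:
            counts[name] += 1
    return counts
```

For example, ten tasks with weight 1 on both FARGATE and FARGATE_SPOT come out as five on each, matching the 50% description above.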

rms1000watt

12:16:55 AM

Are you able to use API Gateway to point directly to Target Groups (specifically, I’m curious about getting to ECS services WITHOUT using a load balancer in front of ECS)?

maarten

10:12:13 AM

Why don’t you want to use a load balancer ?

rms1000watt

04:43:01 PM

To support a crazy architecture

rms1000watt

04:43:21 PM

But actually, we came to a different conclusion, so this is not needed anymore

rms1000watt

04:43:35 PM

(where we are putting load balancers in front, haha)

maarten

06:55:12 PM

sanity prevails

sahil kamboj

05:53:52 AM

Hey guys, the AWS RDS (mariadb10.4) root user by default can’t give GRANT ALL permission. How can we make it do that? I need the grant all permission.

Jonathan Marcus

03:51:25 PM

Are you using a managed instance? If so then it makes sense that you wouldn’t be full admin on that DB instance.

2020-06-24

maarten

11:41:05 AM

Does anyone have SAP-C01 dumps to share ?

Matt Gowie

04:49:07 PM

For anybody using ECS — I released a side-project over the weekend that may help you avoid writing an ugly bash script in the future: ecsrun. It’s a small golang CLI tool that provides a wrapper around the ECS RunTask API and it aims to be much more easily configurable than invoking the AWS CLI yourself. It enables invoking admin tasks like database migrations, one-off background jobs, and anything similar that you can cram into a Docker CMD. I’m eager to get some feedback if anybody ends up using it!

https://github.com/masterpointio/ecsrun

masterpointio/ecsrun

Easily run one-off tasks against a ECS Task Definition - masterpointio/ecsrun

Matt Gowie

03:43:29 AM

@Joe Niland If you get a chance, check out this project. Would appreciate your thoughts if you end up getting to use it.

Not suited for your node project because why not just invoke RunTask through the native AWS JS SDK but it might help you out elsewhere!

masterpointio/ecsrun

Easily run one-off tasks against a ECS Task Definition - masterpointio/ecsrun

Joe Niland

03:44:08 AM

Thanks I am having a look!

Joe Niland

03:45:53 AM

That’s a great project. I’m thinking of various ways to use it.

Matt Gowie

03:46:26 AM

Awesome! Let me know how it goes and if you have any feedback!

04:51:34 PM

One tool I’ve been looking for is one to update a task definition’s single container definition’s container image. Currently we’re using ugly fabfiles that do this that are copied and pasted everywhere and they typically recreate the task definition instead of reusing the one in terraform.

Matt Gowie

04:53:14 PM

As in update the image tag on the container def? So you don’t need to point to latest?

04:54:18 PM

Right. If we point to the latest in the task definition and the latest image is updated, would it auto update the running task?

04:54:57 PM

One thing we’d lose is the ability to know what version of our container is currently running. Its a bit hard to decipher with latest tag, no?

Matt Gowie

04:55:01 PM

Nope — You need to invoke update-service (I think that is the API name).

And the latest tag is an anti-pattern anyway. It’s a bad practice.

Matt Gowie

04:55:19 PM

Yeah, latest tag is crap. Don’t use that

04:55:22 PM

Ya we dont use the latest tag

04:55:56 PM

Not sure if we’re speaking about the same issue

Matt Gowie

04:56:25 PM

So you need something to invoke after you push the image that updates the task def to use the newest image tag and then invoking update-service to update your currently running tasks.

04:57:51 PM

Ah yes. I suppose we could continue using the makefile to build and push the container then replace the fabfile with the update service command to update our running task

Matt Gowie

04:59:23 PM

Yeah. You still do need to update the task def to point at the new image tag. Invoking update-service is just the piece that will actually deploy that new revision of your task def.

05:30:40 PM

lol then back to square one

05:31:38 PM

current process

1. Makefile to create new build of docker and upload it
2. copied and pasted fab file that registers a new task definition by recreating it and then runs update service to trigger an update of the task

05:32:30 PM

new process that would be cool

1. Makefile to create new build of docker and upload it
2. community maintained script that will take an existing task definition, update a specific container’s container image, recreate that task definition, and run update service to update the running task

05:32:55 PM

the #2 from the new process would replace our wonky, drift ridden, copy pasta fabfile

05:33:43 PM

it would also allow us to maintain our task definition in terraform while the separate script can reuse its params while only replacing the specific container image

05:33:55 PM

@Matt Gowie I hope this makes sense

Matt Gowie

06:09:23 PM

Yeah makes sense for sure. That’d be an interesting project — Basically a tool to do a ECS deploy given a task def + service. I’ve written a bash script around that too, but just never wanted to abstract it away from reuse.
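
The core of such a deploy script is small. A hedged sketch of the middle step: take the task definition dict as returned by boto3's ecs.describe_task_definition (the "taskDefinition" value), swap one container's image, and drop the read-only fields so the result can be passed to ecs.register_task_definition. The function name and the exact list of read-only fields are assumptions for illustration.

```python
import copy

# Sketch only: fields that describe_task_definition returns but
# register_task_definition does not accept. Treat the list as an
# assumption and adjust it to what your boto3 version returns.
READ_ONLY_FIELDS = (
    "taskDefinitionArn", "revision", "status", "requiresAttributes",
    "compatibilities", "registeredAt", "registeredBy", "deregisteredAt",
)


def with_new_image(task_definition, container_name, image):
    """Return register-ready arguments with one container's image replaced."""
    new_definition = copy.deepcopy(
        {key: value for key, value in task_definition.items() if key not in READ_ONLY_FIELDS}
    )
    for container in new_definition["containerDefinitions"]:
        if container["name"] == container_name:
            container["image"] = image
            return new_definition
    raise KeyError(f"no container named {container_name!r} in task definition")
```

The remaining steps would be ecs.register_task_definition(**new_definition) followed by ecs.update_service(cluster=..., service=..., taskDefinition=<new ARN>), so every other parameter keeps coming from the definition terraform created.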

10:53:28 PM

so ive been thinking more about this….

why shouldn’t we use a latest tag for the docker image in our task definition? if we did, then to update the service would be as easy as running the update-service command with --force-new-deployment

source: https://stackoverflow.com/a/48572274/2965993

10:53:56 PM

we could update the docker labels in the Dockerfile itself for a git ref versioning so that could be a way to figure out what version hash is actually running

10:54:09 PM

besides hitting the /status/version endpoint across our APIs

Matt Gowie

10:54:24 PM

Others can explain better than I can: https://vsupalov.com/docker-latest-tag/ https://blog.benhall.me.uk/2015/01/dockerfile-latest-tag-anti-pattern/

What's Wrong With The Docker :latest Tag?

Frequent issues and misconceptions around Docker’s most used image tag.

10:57:14 PM

ah ok, so i skimmed through this. so instead of calling the tag latest, we could call the tag by it’s env name like production, no ?

Matt Gowie

10:57:28 PM

I just learned it the hard way on a client project where I was using latest. It was a stupid mistake on my end, but had my CI / CD pipeline for production building my image, pushing it to the registry, asking for approval to deploy to prod, and then doing deploy to prod if approved.

We had a release candidate waiting for approval and during that time, our containers had died and restarted. That inadvertently triggered the release as the task def was pointed at latest and therefore the build that was a release candidate was deployed which the client wasn’t ready for and caused problems.

10:57:30 PM

would that still be regarded as an antipattern if this tag was used in the task definition ?

10:57:59 PM

ah ok i see how that can be problematic

Matt Gowie

10:58:15 PM

I think that’d still be an anti-pattern as my above problem still would’ve exposed itself.

Matt Gowie

10:58:37 PM

Yeah and then rollbacks require re-tagging the old image as latest or doing a full build of the old code and pushing that as latest

Matt Gowie

10:58:40 PM

Kinda weak.

10:59:03 PM

i wonder if using a different tag than latest and having strict controls in place to prevent pushing those tags can allow my proposed setup to work without causing headaches of accidental deploys

Matt Gowie

11:00:00 PM

Possible. One of the problems with latest is that it’s implicit. I believe images get tagged that way in most repos regardless if you explicitly tag it that way or not.

11:00:26 PM

right so we can easily solve that, at the very least, by renaming it to the env like production

Matt Gowie

11:00:34 PM

Or docker tags newly built images implicitly with latest. That is the issue.

11:01:04 PM

then if we can put in a policy for all engineer roles to be unable to overwrite this tag, then create another iam role used by cicd that does overwrite this image tag, then it should be safer, no ?

Matt Gowie

11:01:22 PM

Yeah that solves some problems. But still gives you problems with rollbacks. Means you either need to tag your rollback image with production and push it or you need to rebuild with your old code possibly.

11:02:02 PM

we could tag our images twice. once with the first 7 chars of the git ref or a version number and once with the env production and push both up

11:02:12 PM

that would allow us to rollback fairly easily
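
A rough Python sketch of the double-tagging idea above, for anyone who wants to see it spelled out. The repository name and both helper names are made up for illustration; the only thing taken from the thread is "first 7 chars of the git ref plus an env tag".

```python
def image_tags(repository, git_sha, environment):
    """Return the two image references to push for a single build.

    The short-SHA tag is unique per commit and never reused; the
    environment tag (e.g. "production") is a moving pointer.
    """
    short_sha = git_sha[:7]
    return [f"{repository}:{short_sha}", f"{repository}:{environment}"]


def rollback_target(repository, previous_git_sha):
    """Image reference to redeploy when rolling back to an earlier commit."""
    return f"{repository}:{previous_git_sha[:7]}"
```

Because the short-SHA tag is never overwritten, a rollback only needs the older commit's reference; nothing has to be rebuilt or re-tagged.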

Matt Gowie

11:02:15 PM

Yeah, and you should definitely do that regardless.

11:02:21 PM

we do

11:03:14 PM

i wonder if there is a project out there that creates skeleton repositories for these kinds of sane defaults so you dont have to always reinvent the wheel. someone has certainly solved this before.

Matt Gowie

11:03:14 PM

I would bring this up as a #office-hours question. I think it’s a good one. I’m sure Erik and crew have opinions / war stories with this

11:03:30 PM

cool. ya i havent joined the office hours yet.

Matt Gowie

11:03:54 PM

The first I heard of latest being an anti-pattern was from Zach L sometime a bit back during office hours. Then it bit me.

Matt Gowie

11:04:40 PM

Link this convo over there and we’ll chat through it next week. Will probably spark some interesting conversation as we obviously just chatted for a bit about it.

Matt Gowie

11:04:57 PM

If you can’t join the office hours, you can always listen / watch after the fact.

07:41:40 PM

@Erik Osterman (Cloud Posse) and team suggested this tool

https://github.com/fabfuel/ecs-deploy#deploy-a-new-image

and the following approach

• set up a tag input for the terraform module to pass to the container definition
• when building a new docker image, continue to only tag it with something unique, like the first X chars of the git log ref
• copy that new tag and pass it to the module using terraform apply -auto-approve

that should be good.

fabfuel/ecs-deploy

Powerful CLI tool to simplify Amazon ECS deployments, rollbacks & scaling - fabfuel/ecs-deploy

07:42:25 PM

we could use that tool in lieu of terraform with this command as it will only update a specific container’s image in our task definition

ecs deploy my-cluster my-service --image webserver nginx:1.11.8
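
To make "only update a specific container's image in our task definition" concrete, here is a small Python sketch of that one step. It is not ecs-deploy's actual code; the function name is hypothetical and the dict only mirrors the `containerDefinitions` shape of an ECS task definition.

```python
import copy


def set_container_image(task_definition, container_name, image):
    """Return a copy of the task definition with one container's image replaced.

    Every other container (and the original dict) is left untouched.
    """
    updated = copy.deepcopy(task_definition)
    for container in updated["containerDefinitions"]:
        if container["name"] == container_name:
            container["image"] = image
            return updated
    raise ValueError(f"no container named {container_name!r} in task definition")
```

The returned dict would then be registered as a new task definition revision and the service pointed at it.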

Joe Niland

01:41:20 AM

We have a node.js worker that runs jobs that can take hours and don’t tolerate interruption. It has a graceful shutdown function which is called on SIGTERM (SIGKILL can’t be caught).

We’re currently running the worker on ECS Fargate.

We want to make sure the containers aren’t killed mid job. With the standard ECS deployment methods this isn’t possible.

Is anyone aware of a good way to achieve this? I’m looking into the external ECS deployment controllers. Perhaps Fargate is not a great choice for this however the client doesn’t have an Ops team, so if there’s any way to use it that’d be ideal.
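
For reference, the graceful-shutdown pattern described here looks roughly like this. The worker in question is Node.js; this is a Python sketch of the same idea with hypothetical names. Only SIGTERM is handled, since SIGKILL cannot be caught by any process.

```python
import signal


class GracefulWorker:
    """Runs jobs one at a time and stops between jobs once SIGTERM arrives."""

    def __init__(self):
        self.stop_requested = False

    def request_stop(self, signum, frame):
        # Signal handler: only record the request, never abort the running job.
        self.stop_requested = True

    def install(self):
        # Must be called from the main thread of the process.
        signal.signal(signal.SIGTERM, self.request_stop)

    def run(self, jobs):
        results = []
        for job in jobs:
            if self.stop_requested:
                break  # do not start new work after a stop request
            results.append(job())  # the current job always runs to completion
        return results
```

This only helps if the platform waits long enough after SIGTERM; a job that outlives the container stop timeout is still killed.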

jose.amengual

02:04:07 AM

Why is fargate a problem?

jose.amengual

02:04:23 AM

Is the task being killed at some point?

Joe Niland

02:05:09 AM

Hey PePe - sorry meant to say - the task will be replaced during a deployment based on the standard ECS deployment controller behaviour

Joe Niland

02:08:55 AM

also we’re using Fargate Spot in staging so that can have an impact too I think

Joe Niland

02:25:10 AM

Fargate also has a 120 second max stop timeout after which it kills the container AFAIK

Matt Gowie

03:28:04 AM

So the issue is that you’re doing a service deployment and it’s killing your running jobs, which you don’t want? Or is it that you’re using Fargate Spot and the tasks are being killed because they’re spot instances and AWS is reclaiming them? Both?

Joe Niland

03:29:22 AM

Yes, both, but mainly the deployment. BTW, I am now realising that we won’t use Fargate Spot for this particular ECS service!

Matt Gowie

03:29:26 AM

If you invoke via the RunTask API then it should live outside of the Service lifecycle and a deployment shouldn’t affect it.

And I wouldn’t think Fargate Spot would be the best choice for jobs that you’re hoping to keep.
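
A minimal sketch of launching a standalone task through RunTask, assuming boto3 and a Fargate task definition. The function name and all argument values are placeholders; the client is passed in, so in real use it would be `boto3.client("ecs")`.

```python
def run_standalone_task(ecs_client, cluster, task_definition, subnets, security_groups):
    """Start one Fargate task outside of any ECS service.

    Tasks started this way are not owned by a service, so a service
    deployment does not replace them.
    """
    response = ecs_client.run_task(
        cluster=cluster,
        taskDefinition=task_definition,
        count=1,
        launchType="FARGATE",
        networkConfiguration={
            "awsvpcConfiguration": {
                "subnets": subnets,
                "securityGroups": security_groups,
                "assignPublicIp": "DISABLED",
            }
        },
    )
    return [task["taskArn"] for task in response["tasks"]]
```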

Matt Gowie

03:29:52 AM

How are the jobs invoked? Through a single service or are they scheduled?

Joe Niland

03:30:49 AM

It’s a node.js service managed by PM2 inside the container. The jobs are managed by Agenda and stored in Mongo. This architecture preexists AWS hosting.

agenda/agenda

Lightweight job scheduling for Node.js. Contribute to agenda/agenda development by creating an account on GitHub.

Joe Niland

03:32:02 AM

Thinking about how we could orchestrate RunTask in this case.

Matt Gowie

03:32:03 AM

Huh got it. When a new job gets posted is agenda just scaling a service or is it invoking RunTask?

Matt Gowie

03:35:04 AM

Or is agenda a long running service and it’s invoking jobs internally — Something similar to celery or sidekiq?

Joe Niland

03:35:13 AM

Yes that

Joe Niland

03:35:24 AM

it’s just running in a loop inside the node process

Joe Niland

03:35:57 AM

I don’t think I can move away from using it within this project scope

Matt Gowie

03:37:50 AM

Got it got it. Okay — This might be hard to implement but I wonder if you could spin up a new service each time, let the old service finish its currently running jobs, launch a new service which is then the only one allowed to read from Mongo / launch new jobs, and then when the old service completes its jobs it shuts itself down.

Matt Gowie

03:38:44 AM

Not sure if there is an ECS native way to deal with it.

Joe Niland

03:39:17 AM

Interesting. I had a similar thought using cloudwatch events to schedule the runs. It would run and pick up whatever jobs are waiting.

Joe Niland

03:40:07 AM

From what I can see your idea (or close) could be done with external deployment controllers

Joe Niland

03:40:38 AM

AWS support sent an example of doing blue/green with Jenkins - https://github.com/aws-samples/ecs-bg-external-controller/blob/master/Jenkinsfile

aws-samples/ecs-bg-external-controller

This repo contains the sample code for the blog post: https://aws.amazon.com/blogs/containers/blue-green-deployments-with-the-ecs-external-deployment-controller/ - aws-samples/ecs-bg-external-contr…

Matt Gowie

03:40:51 AM

Huh. I mean yeah, depending on how complex things are you could similarly update all your jobs to invoke RunTask and then they’ll spin up new tasks for each job and the jobs on the agenda server will complete quickly so you won’t have to worry about interrupting them when you do a service deploy.

Joe Niland

03:41:36 AM

I like that idea. I will look into whether we can integrate that somehow. Appreciate the input!

Joe Niland

03:43:14 AM

I think your idea is good because we don’t have to mess around with deployment controllers, which doesn’t look simple at all

Matt Gowie

03:44:22 AM

Yeah that groovy code is non trivial. Ugh Jenkins, I’m glad I have no work with Jenkins right now

Joe Niland

03:44:44 AM

Hahah agreed

Joe Niland

03:45:18 AM

It’s been a while for me but I am talking to a client who may need some Jenkins pipeline updates

jose.amengual

04:09:50 AM

we have super long running jobs inside ECS as sidecars, not fargate, but we used to set the Minimum healthy percent to 100 and Maximum percent to 200, and that way you can have two versions of a task running

jose.amengual

04:10:31 AM

we have sidecars as cron jobs basically
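
The arithmetic behind those two percentages, as a small Python sketch (helper names are hypothetical). Per the ECS docs, the minimum healthy bound is rounded up and the maximum bound is rounded down, both relative to the service's desired count.

```python
def deployment_bounds(desired_count, minimum_healthy_percent=100, maximum_percent=200):
    """Return (min_running, max_running) task counts allowed during a deployment."""
    # Ceiling division for the lower bound, floor division for the upper bound.
    min_running = -(-desired_count * minimum_healthy_percent // 100)
    max_running = desired_count * maximum_percent // 100
    return min_running, max_running


def deployment_configuration(minimum_healthy_percent=100, maximum_percent=200):
    """Dict in the shape ECS expects for a service's deploymentConfiguration."""
    return {
        "minimumHealthyPercent": minimum_healthy_percent,
        "maximumPercent": maximum_percent,
    }
```

With 100/200 and a desired count of 3, ECS keeps at least 3 tasks running and may run up to 6, so old and new versions overlap during the rollout. The dict can be passed as `deploymentConfiguration` to boto3's `update_service`.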

Joe Niland

02:27:57 PM

@jose.amengual how are you deploying the job containers?

jose.amengual

04:35:43 PM

yes

jose.amengual

04:36:13 PM

we deploy 3 containers at the same time

jose.amengual

04:36:43 PM

1 is the initial sync task that syncs s3 and then dies (they run in order)

jose.amengual

04:36:55 PM

then we have the cron, that stays running forever

jose.amengual

04:37:10 PM

and then the app that runs until a new deployment happens

Joe Niland

11:31:11 PM

I see and are you using the runtask API or codedeploy or something else?

jose.amengual

11:35:42 PM

basically a Go app we build that calls the standard api

jose.amengual

11:36:00 PM

it does not do any magic, it basically fires a deploy

jose.amengual

11:36:10 PM

the trick was on the deployment %

Joe Niland

11:37:50 PM

Ah I see

Matt Gowie

11:38:19 PM

@jose.amengual Does your team primarily use Go?

jose.amengual

11:38:35 PM

no, we are a Java shop

Joe Niland

11:38:40 PM

I am thinking I need to do something similar because the standard ECS deploy switches the primary task as soon as the new one becomes healthy

jose.amengual

11:39:36 PM

you could have tasks in the same service and deploy them independently ?

jose.amengual

11:39:45 PM

one service, two tasks, same app

Joe Niland

11:42:07 PM

Yes we could. Sorry, I said task but I think I mean a deployment: https://docs.aws.amazon.com/AmazonECS/latest/APIReference/API_Deployment.html

Deployment - Amazon Elastic Container Service

The details of an Amazon ECS service deployment. This is used only when a service uses the ECS deployment controller type.

Joe Niland

11:42:21 PM

I am going to play with the percentages

Joe Niland

11:42:31 PM

Thanks PePe

jose.amengual

11:50:43 PM

thank Us if it works

Joe Niland

11:57:15 PM

Actually where do you run your go app from? CI process?

jose.amengual

12:02:01 AM

jenkins yes

Joe Niland

11:57:32 AM

I ended up not using ECS deployments for this service. Instead, within CodeBuild, I downloaded the latest task definition, and used jq to modify the image name/tag. Then registered this new task definition and used run-task to create new tasks. To kill off the old tasks, my client modified the worker code to use stop-task to shut the oldest tasks down when they see that there are more tasks than MAX_TASKS (which we stored in param store.) Hack? probably, but it is working
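
The "stop the oldest tasks once there are more than MAX_TASKS" rule could look something like this in Python. This is a sketch, not the client's worker code: the function name is hypothetical, and it assumes each task dict carries `taskArn` and `startedAt` the way boto3's `describe_tasks` returns them for running tasks.

```python
def tasks_to_stop(tasks, max_tasks):
    """Return the ARNs of the oldest tasks that exceed max_tasks.

    Assumes every task has already started, so "startedAt" is present.
    """
    excess = len(tasks) - max_tasks
    if excess <= 0:
        return []
    oldest_first = sorted(tasks, key=lambda task: task["startedAt"])
    return [task["taskArn"] for task in oldest_first[:excess]]
```

Each returned ARN would then go to `stop_task`, ideally only after that worker has finished its current job.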

jose.amengual

05:22:56 PM

that is not a hack

JMC

02:51:52 AM

Hey is this what I think it is ?

https://aws.amazon.com/about-aws/whats-new/2020/06/amazon-ec2-auto-scaling-now-supports-instance-refresh-within-auto-scaling-groups/?sc_channel=em&sc_campaign=GLOBAL_CT_NL_global-snapshot-newsletter_20200624_&sc_medium=em_264309&sc_content=PA_nl_la&sc_geo=mult&sc_country=global&sc_outcome=pa&trk=em_264309_()_Velocity_WhatsNewForYou_Compute_1&mkt_tok=eyJpIjoiTTJVek5EQTFOV0ZsWWpJMSIsInQiOiJiV2hHR1wvNENYUTd6NVkrN25OQ0lKcURMdEtzQjhwVGxDbVFxYlhKTzV2UU5BSm5PZUxnWWNDZTZOaWR0N2lTYiswb1daS2w4YmJtOHdRTm1WVTMrWVRyODI5d2lvYVwvWXNOS1VXMTZHRDRSc2FyU2VZQkg3N0dzVnVYUzArRStlTmFzSndnNEErbkpvTndQZ2ttSVg4UT09In0%3D

Update AutoScalingGroup & launch template in place without the hassle of swapping them, or tweaking around with terraform prefix / create before destroy ?
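
For anyone wanting to try it outside Terraform, starting an instance refresh via boto3 looks roughly like this. The function name and the default values are illustrative assumptions; the client is injected, so in real use it would be `boto3.client("autoscaling")`.

```python
def start_rolling_refresh(autoscaling_client, group_name,
                          min_healthy_percentage=90, instance_warmup=300):
    """Kick off a rolling replacement of the instances in an Auto Scaling group.

    Replacement instances are launched from the group's current launch
    template or launch configuration.
    """
    response = autoscaling_client.start_instance_refresh(
        AutoScalingGroupName=group_name,
        Strategy="Rolling",
        Preferences={
            "MinHealthyPercentage": min_healthy_percentage,
            "InstanceWarmup": instance_warmup,  # seconds
        },
    )
    return response["InstanceRefreshId"]
```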

JMC

02:52:17 AM

Does anyone know if Terraform will support this anytime soon in v0.13 ?

loren

03:19:21 AM

fwiw, support for this is totally separate from tf 0.13, just a new resource/feature in the aws provider. Here’s the issue to track for the release… https://github.com/terraform-providers/terraform-provider-aws/issues/13785

Optional Instance Refresh on ASG update · Issue #13785 · terraform-providers/terraform-provider-aws

Community Note Please vote on this issue by adding a reaction to the original issue to help the community and maintainers prioritize this request Please do not leave "+1" or other comme…

JMC

02:46:29 PM

you’re completely right about the provider, good point

JMC

02:46:33 PM

thanks for linking the issue

2020-06-25

David Medinets

05:01:39 PM

Does anyone know the password for the ‘centos’ account in the CentOS AMI? I want to remove NOPASSWD from the sudo function.

Reinholds Zviedris

05:23:50 PM

just set the password before removing NOPASSWD

David Medinets

05:24:24 PM

Don’t I need to know the existing password before I can do that?

David Medinets

05:24:49 PM

Oh.. I can sudo passwd. I had forgotten.

keen

05:38:38 PM

07:19:48 PM

idk if this has been mentioned but this really makes a difference when all i get is an instance id from aws in an email

https://github.com/bash-my-aws/bash-my-aws

$ AWS_REGION=us-east-1 instance-tags i-snip
i-snip  env=snip snip=snip snip=snip snip=true snip=0.56 team=snip Version=snip service=snip CreatorName=snip aws:ec2spot:fleet-request-id=snip CreatorId=snip snip=snip

bash-my-aws/bash-my-aws

Bash-my-AWS provides simple but powerful CLI commands for managing AWS resources - bash-my-aws/bash-my-aws

2020-06-29

contact871

01:20:28 PM

Hi, does anyone know how “real time” the AWS Cost Manager is? For example when I launch an ECS task, should I see the cost already after 1h? Or does it have some delay - for example I need to wait 24h until I see the cost appear in AWS Cost Manager?

Maciek Strömich

01:30:56 PM

https://aws.amazon.com/aws-cost-management/faqs/

Q: How frequently is the AWS Cost & Usage Report updated?

The AWS Cost & Usage Report is updated at least once per day. An updated version of the report is delivered to your S3 bucket each time an update is completed.

AWS Cost Management FAQs - Amazon Web Services

AWS Cost Management Frequently Asked Questions (FAQs) - Tools to help you to access, organize, understand, control, and optimize your AWS costs and usage.

contact871

01:38:29 PM

thx @Maciek Strömich . This relates to AWS Cost & Usage Report. Do you know if the 24h also applies to the AWS Cost Explorer API?

Maciek Strömich

01:47:47 PM

AWS Cost and Usage Report Service sits alongside the Cost Explorer API under AWS Cost Management (they are separate APIs).

Maciek Strömich

01:48:45 PM

my guess is that both follow the same data freshness principles

contact871

01:50:02 PM

I thought:

• AWS Cost & Usage Report - reports stored on S3 - more static

• AWS Cost Explorer API - more real time HTTP API

contact871

01:51:02 PM

AWS wants $0.01 per AWS Cost Explorer API request, so I thought one could expect more
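
A sketch of what one Cost Explorer API request looks like, in case it helps to test the freshness question directly. The helper name is hypothetical; the returned dict matches the keyword arguments of boto3's `get_cost_and_usage` on the `ce` client. Note that the `End` date is exclusive.

```python
from datetime import date, timedelta


def daily_cost_request(today, days=7):
    """Build get_cost_and_usage arguments for the last `days` full days."""
    start = today - timedelta(days=days)
    return {
        # End is exclusive, so "today" itself is not included in the result.
        "TimePeriod": {"Start": start.isoformat(), "End": today.isoformat()},
        "Granularity": "DAILY",
        "Metrics": ["UnblendedCost"],
    }
```

Usage would be along the lines of `boto3.client("ce").get_cost_and_usage(**daily_cost_request(date.today()))`, keeping in mind each call is billed.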

jedineeper

03:03:34 PM

Kind of worried about task definitions in ECS, is there a maximum number they can go up to? Do I need to worry about this?

03:03:54 PM

nah, dw about it

03:04:38 PM

according to https://docs.aws.amazon.com/AmazonECS/latest/developerguide/service-quotas.html

Revisions per task definition family is 1,000,000

jedineeper

03:05:14 PM

Excellent. Thanks.

03:05:21 PM

since it’s a default quota you could probably request an increase. this is per TD tho so you’d have to be really abusing task definitions to have to increase it lol
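
Back-of-the-envelope version of why this isn't worth worrying about, as a tiny Python sketch. The 1,000,000 figure is the quota quoted above; the helper name and the deploy rate in the note are made up.

```python
REVISION_QUOTA = 1_000_000  # revisions per task definition family, per the quota page


def days_until_quota(revisions_per_day, quota=REVISION_QUOTA):
    """Whole days before one task definition family would hit the quota."""
    return quota // revisions_per_day
```

At 100 new revisions a day for a single family, that is 10,000 days, i.e. more than 27 years.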

jedineeper

03:06:05 PM

Haha, yeah. :)

2020-06-30

nileshsharma.0311

03:08:19 PM

@channel what would be your choice for building a devops pipeline , use gitlab and go with a single vendor approach or use a combination of tools to keep it flexible? Thanks in advance

03:34:31 PM

our choice is what we’re currently using.

we use buildkite. basically if a pr passes tests/linting and is approved by github CODEOWNERS, once it’s merged, the code is built into a container, and deployed on to ecs with zero downtime.

David Medinets

05:44:07 PM

@nileshsharma.0311 Make a pilot project. Evaluate in a month. You need real world experience to decide. Other people’s experience can provide a guide but it is your skills that matter.

nileshsharma.0311

03:29:55 PM

@David Medinets thanks for this one

03:09:16 PM

are there enough people in this community to warrant a #cloudcustodian channel ?

https://cloudcustodian.io/

nileshsharma.0311

03:13:24 PM

@RB how’s your experience with it ?

03:21:52 PM

im proficient with it. been using it for a couple years

03:22:07 PM

but i do fall into snags where i need to hit up their gitter

03:22:15 PM

or stackoverflow/reddit/etc

03:22:35 PM

it works but it can be a pain in the ass to configure and setup

nileshsharma.0311

03:25:35 PM

So the deployment part is kinda cumbersome

03:25:50 PM

is that a question or statement

nileshsharma.0311

03:26:17 PM

No just thinking out loud , obviously it depends on the skill level

nileshsharma.0311

03:27:30 PM

Can you provide some input to my question , it’s in the channel, just above yours , it would be helpful

03:34:48 PM

not sure how that related to cloud custodian but kk

03:34:50 PM

¯_(ツ)_/¯

03:35:02 PM

custodian isn’t hard to deploy. that’s the easy part. it’s hard to configure.

nileshsharma.0311

03:35:26 PM

Gotcha

nileshsharma.0311

03:35:43 PM

Yeah it wasn’t related but just needed some feedback

sheldonh

06:02:45 PM

I ran it in a few minutes from a docker command without issue. Is there something more complex to it? Couldn’t you just set it to run periodically in fargate container and be done?

06:22:42 PM

the deployment to a lambda or running it from the command line is dead simple

David J. M. Karlsen

10:46:48 AM

how would you rate cloudcustodian vs the services from aws like config, control-tower etc?

10:55:34 AM

custodian doesnt do control tower

10:55:42 AM

custodian does leverage aws config tho

David J. M. Karlsen

10:55:57 AM

yes, I know - but the tools kind of do the same

David J. M. Karlsen

10:56:16 AM

so you could roll your own with CC or go with AWS services ($$)

10:56:39 AM

kind of. custodian will configure aws config for you using the yaml file. it doesn’t touch upon control tower as it’s not in its purview.

10:57:00 AM

here are the different modes where you can configure custodian https://cloudcustodian.io/docs/aws/aws-modes.html

10:57:42 AM

the only way you can use cloud custodian without aws specific services like lambda is by running it in a cron job like jenkins from the cli but you give up a lot of functionality

11:03:28 AM

so you can configure your own lambdas to emulate custodian or create your own rules in aws config manually, but it’s a lot more convenient with cloud custodian because of its simplified yaml language

11:04:59 AM

@David J. M. Karlsen thoughts ?

David J. M. Karlsen

12:12:05 PM

thanks for the info. I might take a closer look at CC now.