#aws (2020-11)
Discussion related to Amazon Web Services (AWS)
Archive: https://archive.sweetops.com/aws/
2020-11-03
Has anyone successfully updated EKS from 1.14 using Terraform/Terragrunt? Using terraform-root-modules
How can we restrict AWS IAM users from generating their own access keys or secret keys themselves?
Deny iam:CreateAccessKey
Though, is that really the right way to go? If you do that, you have to manage them, including rotating them.
Create, modify, view, or rotate access keys (credentials) for programmatic calls to AWS.
You can allow them to create keys for themselves
right
If you create my access keys for me and give them to me, that means you know them. As the user I don’t like that. I want to be the only person on the planet that knows those keys
Do you allow users to set their own passwords (within the confines of a secure password policy)?
@kalyan M what are you trying to achieve? Or what is the risk you’re trying to mitigate?
Previously, one of our developers' machine keys was compromised by malware, and c9 large clusters were spun up in the Ireland region and other regions we weren't aware of. The EC2 spin-up was done from the China region. We got a bill of $10K that month. Looking for a way to remove their keys after work is done.
Encouraging use of secure development practices like pre-commit hooks that check for AWS keys before a commit, or use of tools like aws-vault which always creates a temporary key using AWS STS (and better training on protecting keys and other security practices) would be a better strategy. What you mentioned is a serious problem and one that really needs to be solved, but it should be solved in a way that increases trust in the dev culture, not locking things away
A vault for securely storing and accessing AWS credentials in development environments - 99designs/aws-vault
[tl;dr] Organizations going through a DevOps transformation must review their culture, to evaluate whether it is meeting the requirements of a true DevOps organization An absence of trust is the root cause
Okay here’s an actual strategy you can use instead of what you are proposing: You can configure IAM policies such that MFA must be used, even when running console commands with an access key. aws-vault supports it beautifully.
You can also lock out regions that you don’t use via IAM
Or use something like CloudCustodian to monitor AWS activity
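A minimal Terraform sketch of the MFA-requirement and region-lockout ideas above (policy name, region list, and the exempted actions are placeholders; a real policy usually also has to exempt whatever calls a user needs to enroll an MFA device):
provider "aws" {}

resource "aws_iam_policy" "guardrails" {
  name = "example-mfa-and-region-guardrails" # hypothetical
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid       = "DenyWhenNoMFA"
        Effect    = "Deny"
        NotAction = ["iam:ListMFADevices", "iam:CreateVirtualMFADevice", "iam:EnableMFADevice", "sts:GetSessionToken"]
        Resource  = "*"
        Condition = { BoolIfExists = { "aws:MultiFactorAuthPresent" = "false" } }
      },
      {
        Sid       = "DenyUnusedRegions"
        Effect    = "Deny"
        Action    = "*"
        Resource  = "*"
        Condition = { StringNotEquals = { "aws:RequestedRegion" = ["us-east-1", "eu-west-1"] } }
      }
    ]
  })
}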
Agree with Andrew 100% here. Making things harder for devs won't solve your problem here. Also, devs are very resourceful and find ways around the blocks you set up. You need to make sure you build something that's easy for them to adopt.
We use aws-vault and AWS SSO (with G Suite IDP). The keys are never saved to the HD. Malware can try and make aws-vault calls, so it’s not fool-proof, but you’re adding more hurdles for the malware to try to overcome.
2020-11-04
2020-11-05
Hi! Is there any way to increase Security groups per network interface other than through service quotas. Maximum number is 16, is there any way to have it set on, let’s say 30 or 50? Should I contact AWS support? Thanks!
Be aware that if you increase the number of sg-per-eni then the number of rules-per-sg will decrease.
Thanks! I use only a few rules-per-sg. The thing is, I’ve created separate sg for every ECS cluster, and I have a lot of ECS clusters…
I’ve created everything through Terraform. So the other option is to redesign everything
You can redesign things by creating a common security group, and having each ECS cluster add rules to the common group.
If you have this many clusters you might want to consider some other form of security. You could have a single rule based on CIDR and use dedicated subnets
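A minimal sketch of the common-security-group idea above, assuming a hypothetical aws_security_group.cluster_a that each cluster module already creates:
resource "aws_security_group" "ecs_common" {
  name   = "ecs-common" # hypothetical
  vpc_id = var.vpc_id
}

# each ECS cluster module adds its own rule to the shared group instead of
# attaching yet another group to the ENI
resource "aws_security_group_rule" "from_cluster_a" {
  type                     = "ingress"
  security_group_id        = aws_security_group.ecs_common.id
  from_port                = 443
  to_port                  = 443
  protocol                 = "tcp"
  source_security_group_id = aws_security_group.cluster_a.id
}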
Thanks @Alex Jurkiewicz - at the end there was no problem at all to begin with since I’ve misunderstood how VPC endpoints work :)
2020-11-06
2020-11-07
I have a CloudWatch alarm for read IOPS where I want to set the alarm threshold at a certain number (e.g. 5000), but every night at 2am we run some sync jobs that are read-intensive and spike higher than that number for a short period of time (e.g. 7000). Is there a way to configure the alarm threshold to be higher for that short period of time?
I don’t necessarily want to set it at the higher threshold all the time, as prolonged IOPS will lower our burst balance, but the 2am spike is when we are at our lowest traffic and the sync jobs don’t last too long
And what about increasing the duration of elevated iops?
Also, this is something you should be able to control with your escalation platform
Yeah, if you feed these alerts into pager duty, you can have scheduled maintenance to silence alarms for a certain time period. But that brush might be a little broad.
There’s no way to do what you want natively in AWS, except the hammer of a lambda scheduled action.
I wonder how useful this alert is though. Why do you want to be notified about high write iops? Can you instead be notified about degradations in application performance directly? Or if you must alert on the database, maybe there are other metrics which are more useful. Like number of connections or CPU.
Hmm, I don’t necessarily want to increase the duration as I think it could lead to slower reaction time to a system that might be impaired soon. Also I’m not sure if we can get that granular in our escalation platform (victorops) to be in maintenance mode for just a specific alert type? I’m reading that people have set up lambdas to disable/enable alarms. So I could have an alarm with a higher threshold enabled between 2am PST and 3am PST while disabling the general alarm during that period.
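A minimal Terraform sketch of that schedule-driven approach, assuming a hypothetical Lambda (alarm_toggler) that calls DisableAlarmActions/EnableAlarmActions on the general alarm; EventBridge cron runs in UTC, so 2am/3am PST become 10:00/11:00 UTC:
resource "aws_cloudwatch_event_rule" "mute_general_alarm" {
  name                = "mute-read-iops-alarm" # hypothetical
  schedule_expression = "cron(0 10 * * ? *)"   # 02:00 PST
}

resource "aws_cloudwatch_event_rule" "unmute_general_alarm" {
  name                = "unmute-read-iops-alarm"
  schedule_expression = "cron(0 11 * * ? *)"   # 03:00 PST
}

resource "aws_cloudwatch_event_target" "mute" {
  rule  = aws_cloudwatch_event_rule.mute_general_alarm.name
  arn   = aws_lambda_function.alarm_toggler.arn
  input = jsonencode({ action = "disable", alarm = "read-iops-general" })
}

resource "aws_cloudwatch_event_target" "unmute" {
  rule  = aws_cloudwatch_event_rule.unmute_general_alarm.name
  arn   = aws_lambda_function.alarm_toggler.arn
  input = jsonencode({ action = "enable", alarm = "read-iops-general" })
}
# (an aws_lambda_permission for events.amazonaws.com is also needed, omitted here)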
@Alex Jurkiewicz we alert on many things including CPU, connections, and IOPS. For us, we are very read-heavy, and increased read IOPS past our specific threshold for a longer period of time can drop our burst balance (which, once depleted, will cause read latency and service degradation). I like to alert on read IOPS because it generally gives us the most time to verify whether we’re likely to be in trouble soon. Alerts on low burst balance, queue depth, or read latency generally come after, as they are lagging symptoms of the problem. Alerting on degradation in application performance is, to me, reactionary; you definitely want proactive alerting like prolonged read IOPS spikes, as it gives you time to debug and hopefully prevent an issue that hasn’t happened yet.
Thanks for the explanation. That makes a lot of sense, alerts are always application/system-specific. I agree with your point that if something is going wrong, it’s better to know before than after. But I think this situation is a great example of how difficult it is to build an alert that predicts problems in a reliable manner.
I’m reminded of a tweet from one of the original YouTube SREs, who said something like “in the early days, YouTube monitoring alerted on two metrics only: stream starts and number of uploads”. If you are a large company with a big complement of on-call staff, you can get tricky and predictive and build automated remediation based on this alert! But if not, you will end up with overly sensitive alarms and a culture of alert fatigue.
agree on alert fatigue!
I still see this as not an alert problem but an escalation problem. That alert escalation should be associated with a schedule. Very easy to do in something like opsgenie, but not sure about victorops
opsgenie
got it @Erik Osterman (Cloud Posse)
2020-11-08
2020-11-09
Hey all is there a way to create multiple databases in an RDS instance as part of terraform provisioning ??
use the postgresql provisioner
and I’m sure there’s a mysql one
You mean using postgresql provider in the terraform specification?
yah
Interesting didn’t think of that thanks!
A provider for PostgreSQL Server.
That one right?
That’s it. I do this exact thing. The only rough part about this is that you typically want a separate root module to do this because :
- You typically need a SSH tunnel into your VPC so you can access your private RDS instances for the postgresql provider to do its work.
- It’s not done often, so keeping it isolated and not having to share that tunnel across multiple root modules is the right way to go.
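A minimal sketch of that approach, assuming the SSH tunnel exposes the RDS endpoint on localhost and the database names are placeholders (this uses the community postgresql provider):
provider "postgresql" {
  host     = "127.0.0.1" # RDS endpoint reached through the SSH tunnel
  port     = 5432
  username = "postgres"
  password = var.master_password
  sslmode  = "require"
}

resource "postgresql_database" "this" {
  for_each = toset(["app_one", "app_two"]) # hypothetical database names
  name     = each.value
}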
I’ve also found that the postgres provider doesn’t seem to integrate very well with the terraform dependency graph. It tries to ‘jump the gun’ and execute before things are ready
Anyone know any tooling to keep AWS Config files (+ Kube config files as well) up-to-date across an org? I’m considering writing a script around a gomplate template for this but before I do I figured I should check that this isn’t already a thing.
You mean like users AWS configs, configs for the “AWS Extend Switch Roles” plugin etc…? I created a simple Jenkins job at a previous employer so that users could self-serve getting configs for their user (email). Currently, I have some janky Ruby / rake tasks that do some of those things. I haven’t found anything out in the wild that does it.
Yeah, that’s what I was referring to.
What is AWS Extend Switch Roles? I should check this out…
Oh a browser extension. Interesting.
It’s great if you work across many accounts without SSO or some sort of app portal like you get with Azure AD.
2020-11-10
Requesting upvotes https://github.com/aws/containers-roadmap/issues/256 (for ecs automatic asg draining)
Request for comment: please share your thoughts on this proposed improvement to ECS! With this improvement, ECS will automate instance and task draining. Customers can opt-in to automated instance …
Last year, we launched Virtual Private Cloud (VPC) Ingress Routing to allow routing of all incoming and outgoing traffic to/from an Internet Gateway (IGW) or Virtual Private Gateway (VGW) to the Elastic Network Interface of a specific Amazon Elastic Compute Cloud (EC2) instance. With VPC Ingress Routing, you can now configure your VPC to send all […]
Can anyone tell me if it’s possible to decrease an FSX once it has been increased?
The FAQ https://aws.amazon.com/fsx/windows/faqs/
says:
Q: Can I change my file system’s storage capacity and throughput capacity?
A: Yes, you can increase the storage capacity, and increase or decrease the throughput capacity of your file system – while continuing to use it – at any time by clicking “Update storage” or “Update throughput” in the Amazon FSx Console, or by calling “update-file-system” in the AWS CLI/API and specifying the desired level.
Thx @Alex Jurkiewicz I also saw that as well. I’m guessing the only option is increasing but you cannot decrease as aws doc doesn’t mention this.
I’m sure they would mention if it’s possible. AWS prefer to say nothing than to say “you can’t do X”
Can’t decrease the storage space of a FSX. Only thing that can be decreased after creation is the throughput.
Thx
2020-11-11
is there any software that can restrict users to just viewing the code, rather than modifying or downloading it, or even copy/pasting it?
There are solutions for this but they tend to be very intrusive and enterprise-y. Not really an #aws issue. Some examples:
• Citrix makes a remote-access virtualization product. You open up your app through it, and then users log in via a special Citrix agent that prohibits copy/paste.
• On iOS and Android there are enterprise management tools (i.e., rootkits but from your employer) that can create special sandboxes that prohibit copy/paste, screenshots, etc.
• A simpler one is Windows Remote Desktop. IIRC you can set it so that copy+paste works within the RDP session but is disallowed between remote and local.
if you can view the code, couldn’t you also copy and paste the code ?
screenshot and paste the image
The most suitable software is a contract making them pay you lots of money if they copy the code, I think.
Does anyone have any experience around Aurora Failures in Prod?
(We’re going through planning on migrating to Aurora, but just curious of pitfalls to be aware of.. like.. “too many writes will knock Aurora over” or something based on past experience, instead of hypothetical: ‘it should be great!’)
We migrated to aurora (MySQL) and very happy.
The replication is disk block based, so during failovers you lose almost no data, and they are very fast.
The biggest caveat is that Aurora is slower for writes than a traditional MySQL master. Because it waits for writes to be acked across multiple AZs. Not that much slower, but AWS don’t really mention this anywhere as it’s the only meaningful regression
Can’t update from 5.6 to 5.7 in place. The Cloudposse module wants to recreate instances on a minor engine version change. You can avoid that by forking the module and adding lifecycle ignores for the version…
Otherwise been great for us too. Fail over quick. Odd MySQL error resulted in crash and fail over but fixed on later versions.
very heavy concurrent writes will force a failover of the cluster instances due to the things mentioned about the storage, so as long as you can slow down your writes you will be ok
I’m talking about millions of rows here
we have tables with more than 1B rows
I haven’t seen failover caused by high write load. We’ve hit >30k WIOPS sustained without issues
we can literally crash it at will on Aurora 5.6, but again that is usually not a limit you will get to, and there are still ways to avoid it
we use aurora for many other things and no issues
One of the things I like the most is how fast the clones are; since a clone is basically “sharing” the same storage, it is pretty fast
@jose.amengual We have like ~3TB of data we’re going to migrate and I do think one or more of our tables have > 1B rows
We’re also going to migrate to Aurora Postgres rather than MySQL. @jose.amengual curious if you could clue me into how many writes would cause you to fall over?
mmmm I do not know exactly how many but I can tell you how I did it
yeah, that’d be good to know, if possible
I work with @rms1000watt and we just did a sync of data using pglogical from rds pg -> aurora pg (11.8) and it went up to 100k WIOPS and stayed there without any issue. This is not live traffic just syncing data over. Really pleased with the performance as of now.
I had 15x10GB files with two columns (Name | MD5 hash) and 15 tables, each table named after the file, like file_1, file_2, etc., and I was importing those files into each table. It was able to load 3 in parallel (although writes are sequential); when I added two more it would just fail over
but you need to keep in mind that the size of the instances matter
so it is a combination of the two, I was using db.r5.4xl
yes we are using r5.24xlarge
lol
I do want to scale it down but not during this transition
and keep in mind too, no tuning nothing, straight aurora
we use mostly the reader for the app so we can basically write all the time
you can do multimaster too that can double the WIOPS
and it gives you some sort of sharding capability
I am really interested in multimaster!! Do you have any doc or article around this?
Use multi-master clusters for continuous availability. All DB instances in such clusters have read/write capability.
I will check it out.
This seems to not support PG
Currently, multi-master clusters require Aurora MySQL version 1, which is compatible with MySQL 5.6. When specifying the DB engine version in the AWS Management Console, AWS CLI, or RDS API, choose 5.6.10a.
oohhhh yes PG is always behind
You can tune things. For example we disable per commit sync for MySQL Aurora through parameter group tuning. I’m sure there are similar knobs for postgres
yes they are, but you can’t disable fsync in postgres aurora, you need to play with workmem, temp tables and such
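For reference, a hedged sketch of the MySQL-side tuning mentioned above; the usual knob for relaxing per-commit flushing is innodb_flush_log_at_trx_commit, though whether it lives in the cluster or instance parameter group (and whether it is modifiable at all) depends on the Aurora engine version, so treat this as an assumption to verify:
resource "aws_rds_cluster_parameter_group" "aurora_mysql" {
  name   = "aurora-mysql-relaxed-durability" # hypothetical
  family = "aurora-mysql5.7"

  parameter {
    # trade a small durability window for write throughput
    name  = "innodb_flush_log_at_trx_commit"
    value = "2"
  }
}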
2020-11-12
hi, for the cloudposse/elasticache-redis/aws
module, there is no mention of tags. Anyone able to tell me how?
So what kind of problems are you folks trying to or have successfully solved recently?
I am trying to get IaC right in order to deploy the same EKS cluster with service roles and helmfiles for ALB, ingress, cert-manager, external-dns, etc., for several environments (dev, prod, stg, …), while keeping the differences between them low. In the end, GitLab CI will deploy each branch to its own namespace, accessible via a unique URL. VPN access too. And you?
that sounds like a beautiful endeavor! I tried to do something similar. EKS has been lowered in priority for us so we won’t get to conquer it until Q1 next year.
I’m currently trying to solve giving devs IAM permissions using a permission boundary. I’m hoping I can get this out the door this week and unblock a lot of devs. Meanwhile, I’ve been exploring ARM a lot in our ECS clusters, so it always seems to be a balancing act.
@RB here’s how we manage the requirement for the permissions boundary…
Thanks Loren. I’m using basically this same policy from the aws blog on perm boundary
Today, AWS released a new IAM feature that makes it easier for you to delegate permissions management to trusted employees. As your organization grows, you might want to allow trusted employees to configure and manage IAM permissions to help your organization scale permission management and move workloads to AWS faster. For example, you might want […]
yeah, that one flips the logic, using Deny with StringNotEquals instead of Allow with StringEquals
may i ask what you use for the permission boundary policy ? we’re still trying to figure out what’s the best maximum permissions to give without giving too much. we were thinking of removing all deletion, for instance.
we actually use the same policy we assign to the role. it’s primarily to prevent privilege-escalation, for our use case
that doesn’t work of course if you apply multiple policies to the role
but say you have a DEV role, with only the DEV policy attached, you can require the DEV policy as the permissions boundary for any role/user they create to prevent privilege escalation
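A minimal sketch of that requirement in the Deny-with-StringNotEquals form mentioned above (account ID and policy name are placeholders):
data "aws_iam_policy_document" "require_dev_boundary" {
  statement {
    sid       = "DenyCreateWithoutApprovedBoundary"
    effect    = "Deny"
    actions   = ["iam:CreateRole", "iam:CreateUser"]
    resources = ["*"]

    condition {
      test     = "StringNotEquals"
      variable = "iam:PermissionsBoundary"
      values   = ["arn:aws:iam::111111111111:policy/DEV"] # hypothetical DEV policy ARN
    }
  }
}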
pain points have been places where the console will auto-create roles when the user first attempts to use the service. such as service-linked roles, and the rds-monitoring role, etc… so we pre-create those in the account as we identify them
ah, so whatever access the DEV role has, it can also create roles with at most the same permissions as the DEV role, since the single DEV policy is used as the boundary
our issue is that we have multiple iam policies associated to our DEV role
so i was going to create a separate maximum iam policy to use as a perm boundary policy
yeah, so that doesn’t work so well then, and you have to come up with an actual policy for the perm boundary separately
pain points have been places where the console will auto-create roles when the user first attempts to use the service. such as service-linked roles, and the rds-monitoring role, etc… so we pre-create those in the account as we identify them
yes, i can see this being an issue. we have most of these created in our accounts so hopefully this wont be an issue. we’re worried about existing iam roles exceeding the perm boundary so retroactively attaching may be an issue
yeah, so that doesn’t work so well then, and you have to come up with an actual policy for the perm boundary separately
exactly. we’ll have to be crafty here. not much online about this except for the aws blog side.
yes, we also rolled through all accounts and slapped the perm boundary on pre-existing roles/users when we shipped the perm boundary requirement
awesome! how did you verify that each role did not exceed the permission boundary tho ?
it’s a fairly permissive policy for this environment, so we didn’t expect issues and encountered none
ah I see. We have a more strict environment. Perhaps that’s something we can work on.. making things looser.
yeah, the stricter the perm boundary policy, the more likely you’ll have problems
we stick with basic stuff around security and auditing that we manage in the account. if we create it, you can’t touch it. that kind of thing
Thanks everyone, this looks like it will help us out with IAM permissions for devs in their sandpit account. Currently iam:* is basically off, but we need to enable it so they can do Lambda etc. Will do some reading.
We’re currently investigating rabbitmq service. Want to kick our k8s one out of cluster and get from 115 to 118 etc.
Has anyone played with https://github.com/aws-samples/aws-secure-environment-accelerator/ before ? It’s massive.
The AWS Secure Environment Accelerator is a tool designed to help deploy and operate secure multi-account AWS environments on an ongoing basis. The power of the solution is the configuration file w…
i have not. i dont think i would touch it because it is massive. i’d prefer if it was broken up into modules.
we dont use code commit either.
Hi, one noob question - what about ALB outbound traffic? When a client makes a request on port 443, for example, to some application, the request will go through the ALB to that EC2 instance and then the response will go back through the ALB (on high-numbered unprivileged ports). So I was wondering, if I restrict egress traffic on my EC2 instances’ SG, should I also do the same for the ALB? Is there a reason to restrict egress traffic on the ALB? Thanks!
Is there a reason that this can’t be handled directly on the ALB itself? I’m a beginner too.
ALB has it’s own security group.
How did you teach yourself about SGs?
not enough I need to just read some simple code and keep moving
Does anyone here use a framework for testing CIS Benchmark foundations in AWS? I’ve seen things like: https://github.com/mberger/aws-cis-security-benchmark#description, but I’m wondering if they’re worth the risk.
Tool based on AWS-CLI commands for AWS account hardening, following guidelines of the CIS Amazon Web Services Foundations Benchmark (https://d0.awsstatic.com/whitepapers/compliance/AWS_CIS_Foundati…
and by testing, do you mean ensuring your safeguards for CIS compliance are working, or putting the safeguards in place?
Because if you mean the latter, that’s what you get with SecurityHub standards.
We’ll have support for security hub in the next day or so. @matt is working out the last details fixing our tests for https://github.com/cloudposse/terraform-aws-security-hub/pull/3
what Create the basic initial AWS Security Hub module and test
I really don’t know what that code is doing. I learned what a resource and datasource was last week. Time to give it the old college try.
are you using terraform at your company?
Yes
I keep on getting
Error updating CloudTrail: InsufficientEncryptionPolicyException: Insufficient permissions to access S3 bucket nf-cis-benchmark or KMS key arn:aws:kms:us-east-1:721086286010:key/af37e59b-e5fa-446c-b43c-180b425c6222.
when I try to apply my terraform with a kms_key
resource "aws_kms_key" "cisbenchmark" {
  description         = "cis-benchmark"
  enable_key_rotation = true
}
I’m trying to setup my cloudtrail with a key like so
resource "aws_cloudtrail" "cisbenchmark" {
name = "cis-benchmark"
s3_bucket_name = aws_s3_bucket.cisbenchmark.id
enable_logging = var.enable_logging
enable_log_file_validation = var.enable_log_file_validation
is_multi_region_trail = var.is_multi_region_trail
include_global_service_events = var.include_global_service_events
is_organization_trail = var.is_organization_trail
kms_key_id = aws_kms_key.cisbenchmark.arn
# CIS Benchmark 3.1 Ensure CloudTrail is enabled in all regions
# for a multi-regions trail, ensuring that management events configured for all type of
# Read/Writes ensures recording of management operations that are performed on
# all resources in an AWS account
event_selector {
# Specify if you want your event selector to include management events for your trail.
include_management_events = true
# Specify if you want your trail to log read-only events, write-only events, or all. By default,
# the value is All. Needed for logging management events.
}
}
I stole some of this from your guys code.
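That InsufficientEncryptionPolicyException usually points at the KMS key policy (or the S3 bucket policy) rather than the Terraform itself: CloudTrail has to be allowed to use the key. A minimal sketch of a key policy that typically resolves it, assuming the trail and key live in the same account (statement names are placeholders):
data "aws_caller_identity" "current" {}

resource "aws_kms_key" "cisbenchmark" {
  description         = "cis-benchmark"
  enable_key_rotation = true

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid       = "AccountAdmin"
        Effect    = "Allow"
        Principal = { AWS = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:root" }
        Action    = "kms:*"
        Resource  = "*"
      },
      {
        Sid       = "AllowCloudTrailToEncryptLogs"
        Effect    = "Allow"
        Principal = { Service = "cloudtrail.amazonaws.com" }
        Action    = ["kms:GenerateDataKey*", "kms:DescribeKey"]
        Resource  = "*"
        Condition = {
          StringLike = {
            "kms:EncryptionContext:aws:cloudtrail:arn" = "arn:aws:cloudtrail:*:${data.aws_caller_identity.current.account_id}:trail/*"
          }
        }
      }
    ]
  })
}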
@matt has joined the channel
You can now send logs from AWS Lambda functions directly to a destination of your choice using AWS Lambda Extensions. Lambda Extensions are a new way for monitoring, observability, security, and governance tools to easily integrate with AWS Lambda. For more information, see “Introducing AWS Lambda Extensions – In preview”. To help you troubleshoot failures […]
2020-11-13
Hi Everyone. I just started experimenting with IRSA on EKS and it is working great.
I just have one question about the projected volume with the JWT token from AWS
volumes:
  - name: aws-iam-token
    projected:
      defaultMode: 420
      sources:
        - serviceAccountToken:
            audience: sts.amazonaws.com
            expirationSeconds: 86400
            path: token
What happens when the expirationSeconds expires?
I know from the k8s documentation (for the case where the audience is not external) that it will renew at 80% of the lifetime… is it the same in this case?
The AWS Heroes program recognizes individuals from around the world who have extensive AWS knowledge and go above and beyond to share their expertise with others. The program continues to grow, to better recognize the most influential community leaders across a variety of technical disciplines. Introducing AWS DevTools Heroes Today we are introducing AWS DevTools […]
I’m trying a cluster migration in AWS; both k8s clusters are in the same region. Cluster 1: Deployed 2 applications with PV reclaim policy, one as Delete and another as Retain, and annotated them so they take a Restic backup. Cluster 2: Restored those 2 applications, worked fine.
again
Cluster 1: Deployed same 2 application with Reclaim policy as Delete and Retain but not annotated so it took snapshot when i backup.
Cluster 2: Restore did not work as PV volume is failed to attach with the following Warning FailedAttachVolume pod/<pod-name> AttachVolume.Attach failed for volume "pvc-<id>" : Error attaching EBS volume "vol-<id>" to instance "i-<instance-id>": "UnauthorizedOperation: You are not authorized to perform this operation.
So, should the snapshot restore feature work within the same AWS region, or why am I getting this error?
2020-11-14
planning out a future website structure and wondering if I can have a single cloudfront distro with two S3 bucket origins
cloudfront distro: sample.cloudfront.net CNAME/Alias that points foo.com to sample.cloudfront.net
S3 bucketA contains: foo.com, foo.com/foo/, foo.com/bar/
S3 bucketB contains: foo.com/bat/, foo.com/biz/
seems like I cannot do this at the cloudfront origin level even with using a custom behavior and a pattern match
can I: put redirects in bucket A to redirect foo.com/bat/ and foo.com/biz to bucket B ? if I do that ^ should I put another cloudfront distro in front of bucket B (bucketB.cloudfront.net) and point the redirects at that ?
so that:
foo.com/ -> alias for sample.cloudfront.net -> bucketA
foo.com/foo/ -> alias for sample.cloudfront.net -> bucketA
foo.com/bar/ -> alias for sample.cloudfront.net -> bucketA
foo.com/bat/ -> alias for sample.cloudfront.net -> bucketA -> bucketA redirect to bucketB.cloudfront.net -> bucketB
foo.com/biz/ -> alias for sample.cloudfront.net -> bucketA -> bucketA redirect to bucketB.cloudfront.net -> bucketB
any other options ?
some top level router thing that points paths to cloudfront distros ?
You can do that. It would be represented as two origins, four url behaviours, and one default url behaviour
There’s a rather low limit on url behaviours by default though (I think 20) so you can’t scale it up too far if that’s a concern
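A minimal Terraform sketch of that layout, two S3 origins plus path-pattern behaviours; bucket resources, alias, and certificate handling are placeholders:
resource "aws_cloudfront_distribution" "site" {
  enabled             = true
  default_root_object = "index.html"
  aliases             = ["foo.com"]

  origin {
    origin_id   = "bucketA"
    domain_name = aws_s3_bucket.bucket_a.bucket_regional_domain_name
    s3_origin_config { origin_access_identity = "" } # or wire up an OAI
  }

  origin {
    origin_id   = "bucketB"
    domain_name = aws_s3_bucket.bucket_b.bucket_regional_domain_name
    s3_origin_config { origin_access_identity = "" }
  }

  # default behaviour: everything not matched below goes to bucketA
  default_cache_behavior {
    target_origin_id       = "bucketA"
    allowed_methods        = ["GET", "HEAD"]
    cached_methods         = ["GET", "HEAD"]
    viewer_protocol_policy = "redirect-to-https"
    forwarded_values {
      query_string = false
      cookies { forward = "none" }
    }
  }

  # one ordered behaviour per path prefix served by bucketB (repeat for /biz/*)
  ordered_cache_behavior {
    path_pattern           = "/bat/*"
    target_origin_id       = "bucketB"
    allowed_methods        = ["GET", "HEAD"]
    cached_methods         = ["GET", "HEAD"]
    viewer_protocol_policy = "redirect-to-https"
    forwarded_values {
      query_string = false
      cookies { forward = "none" }
    }
  }

  restrictions {
    geo_restriction { restriction_type = "none" }
  }

  viewer_certificate {
    cloudfront_default_certificate = true # swap for an ACM cert when using the foo.com alias
  }
}
One caution: CloudFront forwards the full viewer path to the origin, so /bat/index.html has to exist under bat/ in bucketB (as noted further down the thread).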
It can scale using lambda@edge : https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/lambda-examples.html#lambda-examples-content-based-S3-origin-request-trigger
Examples of functions for Lambda@Edge.
@Alex Jurkiewicz thanks! testing it out today.
I don’t know if cloudfront supports multiple origins but I have done this with cloudfront and a nginx proxy and many buckets with different routes
2020-11-17
so I have a distribution working with 2x origins, but having trouble getting the /bat/*, /biz/** routes to flow to bucketB
in example above ^
404 Not Found
Code: NoSuchKey
Message: The specified key does not exist.
Key: bat/index.html
so all content needs to be in the /bat/ directory on the target origin ^
Instead of using a bucket as origin, you can also use a bucket as “Custom origin”, this way it should be possible to change the default path.
yep
2020-11-18
hey all, was going to cross post here as it sounds like this is a better channel for these types of questions https://sweetops.slack.com/archives/CB6GHNLG0/p1605655766298600
:wave: Is there a way to define the session expiration time for the role an ECS task assumes in terraform? The AWS docs state that the default is 6 hours. max_session_duration
for aws_iam_role
only sets the allowed max session but it looks like when changing that to 12 hours, the ECS task’s role still uses the default 6 hour session duration
I’m annoyed by the AWS Support Feedback experience. I selected Good
rather than Excellent
and I got an email from the support engineer’s reporting manager asking why I gave the engineer a “low” grade. I can’t stop thinking about how awful that is, and now every time I go to fill out the AWS Support feedback form a little part of me dies. I felt that it was necessary to try and do something about it. http://chng.it/gmtG8xqF
and yes I realize that “that’s how all feedback systems work” but I feel like this is something that we collectively can ask for better options
I’m totally open to feedback too
good old “5 stars is normal service” rating as popularised by the service economy
“help keep our employees precariously employed by giving us an excuse to fire them if they get close to a benefit date”
send this to Corey Quinn
forgive me I’m socially dysfunctional …who’s that?
I can Google, but I just feel weird sending somebody I don’t know a DM…again bad at the social thing. I don’t really take part in the social medias. Slack is the only thing I stick with because it feels like less of a blackhole of dialog..and people are generally more rational.
he runs the Last Week In AWS blog
and does AWS consulting
very funny guy
if others agree with my position and want to share it out I’m all for it. I want to try to do something to change the status quo because (1) it hurts our fellow engineers (2) it doesn’t help us at all
buuuut, my amount of trying probably equates to about 1hr of effort
you could just email that to jeff bezos and ask him “wtf”
I could, or I could just drop a dumbbell on my foot
? I mean that Bezos pretty famously deals with customer feedback like this in a very serious way
oh sorry, I thought you were being facetious
heh, no - his email at amazon is public and people will on occasion send him product complaints about amazon or aws
I can give that a shot too, but I just figured the only way real change is going to happen is if Amazon sees that it’s something we collectively care about as its customers… you know how typical large enterprises operate
thanks @Zach I sent it off – who knows if it will go anywhere but makes me feel like I’m doing my part to try to help our fellow engineers working the support desk at AWS
2020-11-20
Has anyone restricted outgoing traffic from ec2 instances via security groups with success? I’ve enabled a lot of VPC endpoints, but I still see a lot of outbound traffic towards AWS subnets for which I cannot identify the service that they belong to.
do you have non-VPC Lambdas? those are usually the culprit for me
wait, you mean just from the ec2? Sorry I thought you meant account wide
Hello
My company has a domain purchased say mydomain.com with GoDaddy and it is today mapped to servers running on prem. We are migrating to AWS and have our Route 53 DNS and public hosted zone setup as ourcoolcloud.com (say). We want to setup the DNS routing such that when our clients hit mydomain.com it actually gets proxied to ourcoolcloud.com. We do not want the ourcoolcloud.com to appear in the client browser and we may drop mydomain.com at some point and purchase a cooler name. We do not want to keep changing our AWS HZs. Is there some DNS voodoo we can do to make this routing happening from mydomain.com - > ourcoolcloud.com without the client seeing this in the browser ?
Can you change the A record for mydomain.com to be the same value as the one for ourcoolcloud.com? If you are using an alias to an AWS ALB/ELB then probably not, but if you are just pointing at an IP address then you should be able to just point both A records to the same IP address
I am thinking not since ourcoolcloud.com would typically point to a CloudFront distribution and would have other record sets say ALBs etc.
Would a CNAME record suffice ?
I’m not an expert at DNS. I know just enough to be dangerous. If I was in your shoes I’d be finding someone who is an expert, and if there aren’t any that I can get, hiring a consultant.
DNS can be quite complicated, and in most cases it is a single point of failure. If you screw something up your whole site will go down. Most of the downtime for the big players like CloudFlare, AWS, etc, have been because someone screwed up a DNS setting
thank you so much for taking the time to understand the problem. I’ll try and see if I can do a spike or something to test how this would work using a test DNS maybe. I’ll share the results here just for future reference. Thank you again
https://github.com/awsdocs/aws-cloudformation-user-guide/pull/438
A >1 year old trivial doc change finally getting attention, but they require YOU to rebase it. Contributing to AWS docs in a nutshell.
Issue #, if available: Description of changes: By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
2020-11-21
Hello All, how are you today? I’m looking for a CloudFormation stack to deploy the Traefik reverse proxy on ECS
Can you use Terraform instead?
https://registry.terraform.io/modules/colinwilson/traefik-v2/docker/latest
nope :(
2020-11-23
Hi, I have a domain, for example test.com and I want to delegate delegate.test.com to another AWS account. I was able to do that but I can only create A records for a.delegate.test.com and b.delegate.test.com for example but I need to create A record for delegate.test.com. Can I do that from the account to which I delegated the zone to? Thanks!
you have two hosted zones now: test.com and delegate.test.com
test.com has an entry delegate.test.com IN NS …
This means that all records for the domain delegate.test.com (including the A record at the zone apex) will be handled by the NS servers configured in that NS record, i.e. by the delegate.test.com hosted zone (in whatever AWS account that hosted zone is deployed)
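A minimal Terraform sketch of that delegation, assuming hypothetical provider aliases aws.parent (account owning test.com) and aws.child (account owning the delegated zone), and a placeholder IP:
data "aws_route53_zone" "parent" {
  provider = aws.parent
  name     = "test.com"
}

resource "aws_route53_zone" "delegate" {
  provider = aws.child
  name     = "delegate.test.com"
}

# NS record in the parent zone hands the whole subdomain to the child account
resource "aws_route53_record" "delegation" {
  provider = aws.parent
  zone_id  = data.aws_route53_zone.parent.zone_id
  name     = "delegate.test.com"
  type     = "NS"
  ttl      = 300
  records  = aws_route53_zone.delegate.name_servers
}

# the apex A record for delegate.test.com then lives in the child zone
resource "aws_route53_record" "apex" {
  provider = aws.child
  zone_id  = aws_route53_zone.delegate.zone_id
  name     = "delegate.test.com"
  type     = "A"
  ttl      = 300
  records  = ["203.0.113.10"] # placeholder
}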
Thanks Maciek!
2020-11-24
using: https://github.com/cloudposse/terraform-aws-efs is it possible to turn off the creation of a security group and pass one to it instead?
Terraform Module to define an EFS Filesystem (aka NFS) - cloudposse/terraform-aws-efs
try #terraform
thanks
Hi All, we have an application in Lambda; is there any way we can back it up for disaster recovery, or does Amazon do this for us? Thank you.
Hey @Shreyank Sharma, I’m not sure what you mean by backing up a Lambda function for DR - I would just keep the infra configuration and the Lambda application + CI/CD code in a git repo?
The backup is simply your code in your repo, the ‘lambda’ in AWS is just the runtime configuration
Help, is there any way to list all services that I’m using on my AWS account?
You can use CostExplorer to get the list of AWS resources/services that you are being charged for
my attempt at lifecycle hook niche awesome list https://github.com/nitrocode/awesome-aws-lifecycle-hooks
Awesome aws autoscaling lifecycle hooks for ECS, EKS - nitrocode/awesome-aws-lifecycle-hooks
i couldn’t find a resource that aggregated all of it. it was a bit disparate. i tried to put it all together.
if anyone knows of additional lifecycle hooks for other services or whatever, please let me know
im still trying to figure out the difference between using only CloudWatch (ecs-drain-lambda), using SQS (Claranet method), and using SNS
2020-11-25
and this is why I tell people DO NOT DEPLOY in us-east-1
I know better and yet one of my larger clients is running in us-east-1
Do you mean us-east-1 is the aws staging environment?
tbh for me it’s the 2nd major outage that I’ve been affected by in roughly 7 years of running stuff in us-east-1, the first one being the S3 outage (happily we lived through it without downtime) a few years back.
and currently i see longer response times from firehose endpoint (800-1000ms instead of normal 50-80ms) and cognito endpoint is completely down
We’re having troubles with ECS auto scaling and scaling out. Desired tasks threshold is not triggering new task creation and some tasks are failing to be placed due to “limited capacity”. I think it’d all tied to CloudWatch problems.
yeah, and CW is failing because of Kinesis streams, which IMO started the dominoes falling. Happily for me I’ve got a quite good understanding of our CPU needs, and considering the few k rpm I’m receiving on a constant basis, if no instance gets replaced I should be fine
@rei I always thought that
when AWS has issues, a lot have too
still down…..shit tomorrow 3000 people from amazon fired
so now that shit hit the fan people will start migration to us-east-2 and then it will go down
hey guys, just dipping my toe in with https://registry.terraform.io/modules/cloudposse/tfstate-backend/aws/latest and I can’t get it to create the bucket, lol
Initializing the backend...
Successfully configured the backend "s3"!
...
Error: Failed to get existing workspaces: S3 bucket does not exist. <--
The referenced S3 bucket must have been previously created. If the S3 bucket <-- uhhh, what?
was created within the last minute, please wait for a minute or two and try
again.
Error: NoSuchBucket: The specified bucket does not exist
status code: 404, request id: 08C447AD410DF430, host id: a6btOtHixZYbFcNJ8E+gpLoFP9vw4MIvFfibWxHrtQwB+tf2HSDJ9bbMvmGRBDt9BmqW/XoZUzY=
You need to first create the resources with a local backend and then transition over.
Usually you should do this through having a separate root module for your tfstate-backend which bootstraps your backend
@voidSurfr did you see these? I think the above is your immediate problem.
sorry, I was attending office hours. I’ll double-check your notes against my issue.
you do something like : terraform apply -var-file='production.tfvars' -target module.terraform_state_backend
with local state defined
once the s3 bucket is created , hen you can define the backend-config on a file or main.tf and then
terraform init -var-file='production.tfvars' -backend-config=production-backend-us-east-2.tfvars
and it will ask you to move the local state over
this is all on the README
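A minimal sketch of that bootstrap sequence, using placeholder null-label inputs; the backend block starts out commented (or absent) so the first apply runs against local state:
module "terraform_state_backend" {
  source    = "cloudposse/tfstate-backend/aws"
  namespace = "acme"      # hypothetical
  stage     = "prod"
  name      = "terraform"
}

# step 1: terraform apply -target module.terraform_state_backend   (local state)
# step 2: uncomment the backend, then terraform init -backend-config=production-backend.tfvars
#         and answer "yes" when asked to migrate the local state into the bucket
# terraform {
#   backend "s3" {}
# }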
can someone help me to understand this?
2020-11-27
Hi guys! QQ: we are in AWS, and we are looking for some global cron scheduler to run tasks. I know Rundeck, but something more SaaS? Something so that dev teams do not depend on IT operations to create things in Terraform like Lambdas triggered by cloud events, or ECS tasks that run based on events…. need something easy and fast for devs. Thx!!!!!
Cloudwatch events has cron support for scheduled triggers, codebuild works as a target and can run arbitrary commands
Or cloudwatch events to lambda works fine
I think ssm automation is also an option
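A minimal sketch of the CloudWatch Events (EventBridge) cron option, assuming a hypothetical Fargate task and an events role that is allowed to run it:
resource "aws_cloudwatch_event_rule" "nightly" {
  name                = "nightly-job"          # hypothetical
  schedule_expression = "cron(0 3 * * ? *)"    # 03:00 UTC daily
}

resource "aws_cloudwatch_event_target" "run_task" {
  rule     = aws_cloudwatch_event_rule.nightly.name
  arn      = aws_ecs_cluster.main.arn
  role_arn = aws_iam_role.events_run_task.arn

  ecs_target {
    task_definition_arn = aws_ecs_task_definition.job.arn
    launch_type         = "FARGATE"
    network_configuration {
      subnets = var.private_subnet_ids
    }
  }
}
The same rule can target a Lambda function or a CodeBuild project instead of ECS.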
yeah thx! I have found as well this https://github.com/blinkist/terraform-aws-airship-ecs-service
Terraform module which creates an ECS Service, IAM roles, Scaling, ALB listener rules.. Fargate & AWSVPC compatible - blinkist/terraform-aws-airship-ecs-service
We even have an #airship channel :-)
cool!
Sounds like you want something managed? But how about Serverless Framework? It satisfies your Lambda requirement easily. Defining events is really easy in the serverless.yml file.
Not sure how well it’ll fit with ECS but found this: https://www.serverless.com/plugins/serverless-fargate-tasks
Part of the challenge here is coming up with sufficiently restrictive policies to allow developers to deploy without being able to change things they shouldn’t.
Also managing everything for cost and tracking purposes will require some thought. Serverless.com paid offering might help with that but I’ve not used it yet.
A plugin to run fargate tasks as part of your Serverless project
also, since you mentioned something easy one thing we use a lot of is Zapier and Integromat. Both of them have cron like functionality.
Oh yeah, ofc! @Erik Osterman (Cloud Posse) just curious - would you choose one over the other? I was looking at doing something with Xero the other day, and for some reason the Integromat connector was more full-featured.
Yea, we started with Zapier so we have ~300 zaps there. However, if we were starting from scratch would 100% use Integromat. Easy things are a little bit harder to do in integromat (compared to Zapier), but it more than makes up for it with all the advanced features. I feel like Integromat is more geared towards advanced users / programmers. It has iterators, routers, variables, etc. If their integration doesn’t provide the API endpoint you want, they usually provide a raw method to use the integration. Integromat supports handling errors, while Zapier doesn’t.
Ah thank you for the insight! Very useful.
Integromat’s API looks pretty good too
very valuable info, thx all!
2020-11-28
AWS Kinesis incident details: https://aws.amazon.com/message/11201/
Good weekend reading. In short, use just EC2 instances and don’t use new and fancy services, because everything is interconnected internally.
I deeply disagree with that. The best example against it is the experience iRobot had: https://twitter.com/ben11kehoe/status/1332391783829913600
Yes, they had a bad incident, but it was: 1) rare, and by managing EC2 themselves they would’ve definitely had more incidents + higher cost + huge opportunity cost 2) relaxed, as in they had to wait for AWS to recover. No stress, no fighting to figure out the root cause, no marketing hit of “Roombas are down” but “A bunch of IoT things are down”
I want to talk a bit about what this was like. tl;dr: it was long and inconvenient timing but, as an operations team, not particularly stressful. Questions of “when”, not how or if systems would come back. A lot of waiting and watching—and that’s desirable. https://twitter.com/ben11kehoe/status/1332028868740354048
I’m still working with my team to mitigate the after effects of the day-long AWS outage yesterday, including dealing with follow-on AWS issues. I’ve gotten three hours of sleep and it’s ruining my Thanksgiving day. Hot take: I am thankful we have built serverless on AWS.
… it was trolling from my side. I understand that such incidents happen all the time. Smaller or larger, and it is our job to prepare and minimize impact as much as we can.
welp, I did not catch that. Sorry!
Some people posted on linkedin during an incident that their EC2+RDS services were 100% up during that period, so they are lucky that they don’t use fancy services… and then it started a “discussion” ec2 VS serverless :)))
The discussion we’re having is: “should we replace AZ-level redundancy (within us-east-1), with the more resource intensive region-level redundancy”.
2020-11-29
Is there a simple explanation of minimum security group configuration for an internal load balancer, target groups, and ec2 instances ? I’m a bit confused
This explains the load balancer sg rules but not the ec2 instances
Learn how to update the security groups for your Application Load Balancer.
The security group on the EC2 instances should allow ingress and egress to the load balancer security group, and vice versa.
I would suggest adding allow all out egress rule to both groups as well though.
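A minimal sketch of that pairing, assuming hypothetical aws_security_group.alb and aws_security_group.app groups and an app port of 8080:
# instances only accept app traffic from the ALB's security group
resource "aws_security_group_rule" "app_from_alb" {
  type                     = "ingress"
  security_group_id        = aws_security_group.app.id
  from_port                = 8080
  to_port                  = 8080
  protocol                 = "tcp"
  source_security_group_id = aws_security_group.alb.id
}

# ALB is allowed to send traffic out to the instance group
resource "aws_security_group_rule" "alb_to_app" {
  type                     = "egress"
  security_group_id        = aws_security_group.alb.id
  from_port                = 8080
  to_port                  = 8080
  protocol                 = "tcp"
  source_security_group_id = aws_security_group.app.id # destination group for egress rules
}
Plus whatever ingress the ALB itself needs (e.g. 443 from clients).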
Anyone know if the a) asg instance refresh or b) manual termination invokes the lifecycle hooks?
There are only 2 references to lifecycle hooks here https://docs.aws.amazon.com/autoscaling/ec2/userguide/asg-instance-refresh.html
Describes the scenario of replacing the Amazon EC2 instances in your Auto Scaling group, based on an instance refresh.
It does yes
Well, I can say for sure it certainly invokes the create hook, we don’t make use of the termination one
You’re saying only the instance refresh will invoke the lifecycle hook?
No I’m just confirming that instance-refresh definitely does. I assume it triggers in both cases that you’re asking though
we make use of it regularly
from my tests, it looks like manual termination of the ec2 instance does not invoke the lifecycle hook
ah and it looks like an instance refresh does invoke the lifecycle hook. thank you.
interesting. I guess they take the stance that if you manually asked for a termination that you want it done right now
weird. i spoke with support and they said that manual terminating should invoke the hook
perhaps the https://github.com/getsocial-rnd/ecs-drain-lambda just doesn’t work for the manual termination
Automation of Draining ECS instances with Lambda, based on Autoscaling Group Lifecycle hooks or Spot Instance Interruption Notices - getsocial-rnd/ecs-drain-lambda
ah ok it was a miscommunication. Confirmed with support that manual termination from the UI does not run the lifecycle hook but it does if the aws autoscaling terminate-instance-in-auto-scaling-group
cli command is run
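For reference, a minimal sketch of the kind of termination hook being discussed (names and the SNS wiring are placeholders; drain Lambdas of the ecs-drain-lambda style listen on the notification target):
resource "aws_autoscaling_lifecycle_hook" "drain" {
  name                    = "ecs-drain" # hypothetical
  autoscaling_group_name  = aws_autoscaling_group.ecs.name
  lifecycle_transition    = "autoscaling:EC2_INSTANCE_TERMINATING"
  default_result          = "CONTINUE"
  heartbeat_timeout       = 900
  notification_target_arn = aws_sns_topic.asg_events.arn
  role_arn                = aws_iam_role.asg_notifications.arn
}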
oh and what about asg max instance lifetime settings ? does it invoke the lifecycle hook too ?
Describes the scenario of replacing the Amazon EC2 instances in your Auto Scaling group, based on a maximum instance lifetime limit.
2020-11-30
is anyone able to help me understand VPC peering, I have created a terraform module and applied it but am struggling to understand one thing ….
I want to peer the private subnets from account X with the database subnets from account Y
Where do I need to enable dns resolution to allow the instances in the private subnet to use internal hosted zones in account Y ?
@Steve Wade (swade1987) you need to enable DNS resolution at the VPC peering config
Let me show you
thanks man your help is much appreciated
We enabled it for some of our VPC peerings.
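For concreteness, a minimal Terraform sketch of that setting, with hypothetical provider aliases for the two accounts; each side toggles DNS resolution on its own half of the peering:
resource "aws_vpc_peering_connection_options" "requester" {
  provider                  = aws.account_x
  vpc_peering_connection_id = aws_vpc_peering_connection.x_to_y.id

  requester {
    allow_remote_vpc_dns_resolution = true
  }
}

resource "aws_vpc_peering_connection_options" "accepter" {
  provider                  = aws.account_y
  vpc_peering_connection_id = aws_vpc_peering_connection.x_to_y.id

  accepter {
    allow_remote_vpc_dns_resolution = true
  }
}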
so i need to enable in the VPC connection in account X, right?
Yep//// you’re right !
let me try that
it doesn’t seem to be working with nslookup
Wait… I think you should do it the other way….
ok let me try that
I read your config again
trying …
Cool… let me know how that goes
will do
nope
could it be security group related?
Wait…weird…. are you having issues with DNS resolution or with TCP connectivity ?
root@test-shell:/# nslookup elastic.logging.de.qa.underwriteme.internal
Server: 172.20.0.10
Address: 172.20.0.10#53
** server can't find elastic.logging.de.qa.underwriteme.internal: NXDOMAIN
Is that K8S by any chance ?
100% its EKS
Are you doing that from a POD ?
yes i am
I’m wondering if that could be an issue… would you be able to do that from a regular EC2 instance ?
we are using bottlerocket so going onto the node isn’t the easiest for me unfortunately
Hmmm gotcha
i might just try dns resolution from both sides
oh shoot is it because i am peering with the database subnets from the upstream VPC module which deny inbound, don’t they?
What do you mean. When you configure VPC peering, you peer the entire VPC
And then you configure SGs and Routing Tables to allow certain traffic
Have you configured both ?
let me create a gist it might be easier
i tried traceroute
as well to no avail
i think my routing logic is correct
@Santiago Campuzano i have an update, i can nslookup
from account X to an instance in the database
subnet in account Y without a problem, the issue is just DNS resolution is not working
root@test-shell:/# nslookup redacted.cssoccfscup8.eu-west-1.rds.amazonaws.com
Server: 172.20.0.10
Address: 172.20.0.10#53
Non-authoritative answer:
Name: redacted.cssoccfscup8.eu-west-1.rds.amazonaws.com
Address: 10.60.26.86
@Steve Wade (swade1987) What do you mean by you can nslookup but DNS resolution is not working. ?
That is like contradictory
I mean… nslookup is pretty much DNS resolution
i can nslookup
to an AWS known DNS record (see above)
Ok… and you’re able to get the private IP address from that RDS instance
i can’t resolve custom private DNS addresses
e.g. yellowfin.rds.de.qa.underwriteme.internal
which is a CNAME to an RDS endpoint
Hmmm got it… you’re resolving the public IP, not the private one
currently i have dns resolution setup on both sides
maybe i need to turn off account X like you said before?
I would give it a try….
made no difference
i wonder if this has something to do with coredns configuration?
or maybe i need to create a route53 resolver?
I was checking my config, and I don’t have the R53 resolver configured
but are you peering the whole VPCs together?
or subnets in account X to others in account Y ?
The whole VPCs together
yeh i think that might be the difference
Well…. wait.. is it possible to peer certain subnets ?
AFAIR you peer the entire VPC
not certain subnets/CIDRs
well the route tables are subnets
Ok… The route tables indicate what traffic to send through the VPC peering…
yeh mine looks right
the issue is just the private dns resolution is not working
I would discard. the EKS component of this equation
I’d create a simple Ec2 instance and try from there
The DNS resolution
EKS/K8S DNS resolution is weird sometimes
Update – December 2019 Amazon EKS now supports automatic DNS resolution for private cluster endpoints. This feature works automatically for all EKS clusters. You can still implement the solution described below, but this is not required for the majority of use cases. Learn more in the What’s New post or Amazon EKS documentation. This post […]
Morning @Steve Wade (swade1987) !
How this ended up for you ?
The DNS resolution issue ?
after talking to AWS you can do this via the command line
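The CLI route is create-vpc-association-authorization / associate-vpc-with-hosted-zone; a hedged Terraform equivalent for associating account Y’s private zone with account X’s VPC (provider aliases and zone/VPC references are placeholders):
# in the account that owns the private hosted zone (account Y)
resource "aws_route53_vpc_association_authorization" "account_x" {
  provider = aws.account_y
  zone_id  = aws_route53_zone.internal.zone_id
  vpc_id   = var.account_x_vpc_id
}

# in the account that owns the VPC (account X)
resource "aws_route53_zone_association" "account_x" {
  provider = aws.account_x
  zone_id  = aws_route53_vpc_association_authorization.account_x.zone_id
  vpc_id   = var.account_x_vpc_id
}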
Anyone using ACM private CA (PCA) with root CA and subordinate CAs in separate accounts (as recommended by the best practice section of the PCA docs)? I’m bashing my head against the wall trying to work out how to have the root CA sign the subordinate’s certificate and import the certificate to get it into an active state.
If I try a RAM share of the subordinate into the root CA’s account (which uses the RAM-managed permissions) trying to sign the subordinate from the root account fails with 1 validation error detected: Value at 'csr' failed to satisfy constraint: Member must satisfy regular expression pattern: -----BEGIN CERTIFICATE REQUEST-----...
which makes me suspect that the root CA/account can’t read the CSR for the subordinate, but of course I can’t change the permissions since RAM manages them. Can’t see what the alternative workflow would look like - I don’t see anything in the docs.
Oh and yes, trying to view the CSR of the subordinate as shared by RAM in the root CA account just beachballs, which is another hint that it’s on the permissions side of things.
Going the other way, and sharing the root CA into the account with the subordinate doesn’t work with the “Import CA” -> “ACM private CA” flow in the console, as the root isn’t presented as an option.
…And Terraform doesn’t support most of the PCA operations.
Hey folks, what’s the recommended way to partition environments and related infra? We want to keep the prod env as isolated as possible, but still want to maintain pace of development. We are a relatively new startup (~6 eng), so I’m just setting things and process up.
either separate AWS accounts or separate VPCs is what we do (depending on scale)
@Erik Osterman (Cloud Posse) did a good overview in office hours a few weeks ago
ah thanks, do we have recording of those? Till now I have decided to go with separate vpc, not sure where shared services like logs/metrics etc should go. I have read that they should generally go into a different “shared” environment, but not sure of how their config etc would be tested. Like where should staging version of such shared services go?
We’re a DevOps accelerator. That means we help companies own their infrastructure in record time by building it with you and then showing you the ropes. If t…
I couldn’t find the exact video (there’s a lot there). A “Cloud Posse Explains: AWS Account/Environment Laydown” video would be helpful
Who the hell is that ugly guy @Erik Osterman (Cloud Posse)?? You really should keep out the rifraf
ah thanks for the help and the direction.
We’re actively releasing stuff every week as part of our reference architecture. It’s too early to point you to something concrete. But by end of december it will be much closer. Documentation will begin in earnest Q1 2021.
All of our top-level AWS components are here: https://github.com/cloudposse/terraform-aws-components
Catalog of reusable Terraform components and blueprints for provisioning reference architectures - cloudposse/terraform-aws-components
this is the future home of our reference-architectures https://github.com/cloudposse/reference-architectures but its documentation is stale and examples incomplete.
[WIP] Get up and running quickly with one of our reference architecture using our fully automated cold-start process. - cloudposse/reference-architectures
we’ll definitely be recording lots of videos, demos, walkthrus in addition to more step by step documentation.
re:Invent gets extended: https://twitter.com/txase/status/1333559564998828032
Looks like three weeks of @awscloud #reInvent2020 may not be enough for folks. Ready for a round 2 in January? https://reinvent.awsevents.com/agenda/ https://pbs.twimg.com/media/EoHBPa9VgAAo9Si.jpg
what, they do not have enough seats?????????
scalpers probably going to sell seats on ebay
HAHAHA lol