#aws (2020-08)

aws Discussion related to Amazon Web Services (AWS)

Archive: https://archive.sweetops.com/aws/

2020-08-03

RB avatar

Just released 2 container modules (datadog and fluentbit) on the tf registry to make fargate and datadog integration easier.

2020-08-04

contact871 avatar
contact871

Is it possible to use "PropagateTags": "TASK_DEFINITION" when triggering an ECS task with CloudWatch Event rule?

contact871 avatar
contact871
[Fargate, ECS] [Tagging]: Support tagging when starting a task from CWE · Issue #89 · aws/containers-roadmap

Tell us about your request Support for tagging a task started through CloudWatch Events. Which service(s) is this request for? Fargate, ECS Tell us about the problem you're trying to solve. Wha…

Luis avatar

Hi! About https://github.com/cloudposse/terraform-aws-eks-cluster/ and https://github.com/cloudposse/terraform-aws-eks-node-group. I am currently testing the bugfix implemented in 0.22.0 : https://github.com/cloudposse/terraform-aws-eks-cluster/releases/tag/0.22.0 In the example, https://github.com/cloudposse/terraform-aws-eks-cluster/blob/master/examples/complete/main.tf

data "null_data_source" "wait_for_cluster_and_kubernetes_configmap" { module "eks_node_group" { cluster_name = data.null_data_source.wait_for_cluster_and_kubernetes_configmap.outputs["cluster_name"]

I have this in my “main.tf”, but when I apply Terraform I get the following error: Error: Cycle: module.eks_cluster.kubernetes_config_map.aws_auth, module.eks_node_group.module.label.output.tags, module.eks_node_group.aws_iam_role.default, module.eks_node_group.output.eks_node_group_role_arn, module.eks_cluster.var.workers_role_arns, module.eks_cluster.local.map_worker_roles, module.eks_cluster.kubernetes_config_map.aws_auth_ignore_changes, module.eks_cluster.output.kubernetes_config_map_id, data.null_data_source.wait_for_cluster_and_kubernetes_configmap, module.eks_node_group.var.cluster_name, module.eks_node_group.local.tags, module.eks_node_group.module.label.var.tags, module.eks_node_group.module.label.local.tags

Has this been tested like in the example? Thanks!

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Every pull request runs automated tests using terratest against examples/complete

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

tests are in test/src

Pedro Henriques avatar
Pedro Henriques

Hello everyone Do you mind taking a look into this PR please? https://github.com/cloudposse/terraform-aws-elasticsearch/pull/63

Dynamic cognito_options inner block to avoid permission problems in AWS China by brdasilva · Pull Request #63 · cloudposse/terraform-aws-elasticsearch

what Amazon Cognito authentication for Kibana is not supported on AWS China. Therefore we need to have a way to avoid setting the cognito options inner block on the aws_elasticsearch_domain terrafo…

1
jose.amengual avatar
jose.amengual
Release 0.19.0: Transformed cognito_options inner block into a dynamic block to avoid… · cloudposse/terraform-aws-elasticsearch

what Amazon Cognito authentication for Kibana is not supported on AWS China. Therefore we need to have a way to avoid setting the cognito options inner block on the aws_elasticsearch_domain terraf…

2020-08-05

Milosb avatar

Hi all, Do you know if I can share Transit Gateway between regions with RAM in same account?

Alan Kis avatar
Alan Kis

You can’t share the resource, at least not this particular one between regions. You can share it between different accounts, but for building a cross-region network using Transit Gateway, you need to create peering connections between Transit Gateways in different AWS regions.
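
For reference, a minimal boto3 sketch of the peering approach Alan describes; the transit gateway IDs, account ID, and regions are placeholders, not values from the thread:

import boto3

# requester side: create the peering attachment from the local TGW to the peer TGW
ec2_us = boto3.client("ec2", region_name="us-east-1")
attachment = ec2_us.create_transit_gateway_peering_attachment(
    TransitGatewayId="tgw-0123456789abcdef0",      # local TGW (placeholder)
    PeerTransitGatewayId="tgw-0fedcba9876543210",  # TGW in the other region (placeholder)
    PeerAccountId="123456789012",                  # same account in this scenario
    PeerRegion="eu-west-1",
)["TransitGatewayPeeringAttachment"]

# accepter side: once the attachment reaches pendingAcceptance, accept it in the
# peer region; routes pointing at the attachment then go in both TGW route tables
ec2_eu = boto3.client("ec2", region_name="eu-west-1")
ec2_eu.accept_transit_gateway_peering_attachment(
    TransitGatewayPeeringAttachmentId=attachment["TransitGatewayPeeringAttachmentId"]
)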

Milosb avatar

i saw that peering attachment option

Milosb avatar

but that means i need transit gateway in each region

Alan Kis avatar
Alan Kis

Exactly. Or you can use other ways to create connection over regions, like using IPSec tunnels

Milosb avatar

I have vpc peering, but i wanted to use something to rule them all

Milosb avatar

thanks

raghu avatar

You should do tgw peering across region

Milosb avatar

should is hard word I wanted to avoid that

1
Milosb avatar

if i see it right there will be at least one additional tg-attachment in that case ( more if you want to connect more regions ) edit: actually its x2

raghu avatar

Without peering, i dont think you can connect cross region

Zach avatar

I was looking at the ASG max instance lifetime setting … the units are seconds but it has a minimum value of 604800

2020-08-06

Prasad avatar

hello all, the Application Load Balancer documentation says SSL termination happens at the LB level… if we configure HTTPS listeners for the target… how does the traffic flow from the ALB to the target servers? Is it not encrypted again from the ALB to the target servers?

Zach avatar

Yes it re-encrypts, although they don’t do any cert validation on the backend, its fire & forget

pjaudiomv avatar
pjaudiomv

Anybody play with the python CDK

Jaeson avatar

Just played with the TF CDK for Python yesterday. My experience was pretty awful. What is installed and used for python CDK is actually a skeleton that converts javascript to python and runs it. It was pretty slow, and difficult to find what I was looking for. I use tfswitch to manage the TF version in a container, which requires the TF version to be pinned, but couldn’t figure out how to pin the TF version with the CDK. So my experience was a short but painful one.

I’ve used CDK for AWS as well, and the experience was better, though I ran into CFN limitations, which is one of the reasons why TF interests me.

So, from my perspective, CDK is still a ways out from being very useful.

pjaudiomv avatar
pjaudiomv

Thank you I shall wait to even go as far as you did then

1
Steen avatar

Have been using the CDK (sans TF) with Python for a couple of weeks in a real-world scenario. I really like the programmatic feel of the setup, although having Python convert to Typescript behind the scenes ruins the usual developer toolchain for me (i.e. especially not being able to simply throw pdb into the mix); I could of course just go with Typescript but, aaaah. The documentation is autogenerated and sucks big time. They have taken idioms from other languages and pulled them down over Python and the result is not very pythonic. But for me, it sure beats HCL, which is an abhorrent nightmare of the NIH syndrome backed by naïve enthusiasm and silicon capital. In the words of L Peter Deutsch: “Every now and then I feel a temptation to design a programming language but then I just lie down until it goes away.”. Once you accept the tie-in to AWS and their stack concept, accept that CDK uses Python in name only and that there is no real state tracking, CDK feels welcoming especially for people with developer backgrounds (caveat: that’s me)

pjaudiomv avatar
pjaudiomv

Or the CDK in general

pjaudiomv avatar
pjaudiomv

I’m interested to see what the terraform CDK adoption is gonna be like

pjaudiomv avatar
pjaudiomv

Why would one use the terraform CDK over aws one if only using aws

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

I’ll bring this up in #office-hours today

1
Prasad avatar

@pjaudiomv maybe we would want to migrate to a different cloud down the line. we never know :)

1
RB avatar

i think it’s to write the terraform code programmatically without having to write terraform manually

2
RB avatar

if i understand it correctly

  1. write cdk in coding language of your choice like python (similar to pulumi)
  2. run cdk to generate terraform
  3. terraform apply
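
A minimal sketch of that workflow with the Terraform CDK’s Python bindings, assuming a project initialized with cdktf init --template=python; resource classes would come from provider bindings generated by cdktf get, which are not shown here:

#!/usr/bin/env python
from constructs import Construct
from cdktf import App, TerraformStack


class DemoStack(TerraformStack):
    def __init__(self, scope: Construct, ns: str):
        super().__init__(scope, ns)
        # step 1: declare providers/resources here using the generated binding classes


app = App()
DemoStack(app, "demo")
app.synth()  # step 2: emits Terraform JSON under cdktf.out/, ready for `terraform apply` (step 3)
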
loren avatar

i’ve also had use cases where i needed to generate/template the terraform hcl, to workaround some limitation of terraform. that particular use case was addressed by for_each, but i expect other similar cases where generating the hcl from a more expressive language has advantages

1
loren avatar

maybe also as a different abstraction layer for vars/inputs, a wrapper that takes inputs in your form of choice, and writes the values into the hcl. something of a workaround for the annoying decision to warn (and maybe error) when a tfvars file has undeclared vars
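
One lightweight version of that wrapper idea, sketched in Python: gather inputs however you like and emit terraform.tfvars.json, which Terraform loads automatically. The keys below are hypothetical and should match declared variables to avoid the undeclared-variable warning:

import json

# hypothetical inputs, gathered from YAML, a CLI prompt, an API, etc.
inputs = {
    "environment": "dev",
    "instance_count": 2,
    "tags": {"team": "platform"},
}

# Terraform automatically reads terraform.tfvars.json from the working directory
with open("terraform.tfvars.json", "w") as f:
    json.dump(inputs, f, indent=2)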

Zach avatar

Is there some cloudformation juju to lookup an existing aws resource that is not part of a stack? ie I have a kms key alias and I need the arn

loren avatar

terraform

Zach avatar

yah thats the model I’m coming from

Zach avatar

but I’m trying to deploy a lambda in this new fangled Serverless model that uses cloudformation

loren avatar

right, so, no. huge limitation of cfn. you can construct the arn, since the format is known. or you can write a custom resource that basically runs describekey (or whatever the api call is) and returns the result
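
A rough sketch of that custom-resource idea in Python: a Lambda that resolves a KMS alias to its key ARN and hands it back to CloudFormation. The AliasName property and the Arn attribute are hypothetical names, and cfnresponse assumes the handler is defined inline (ZipFile) in the template:

import boto3
import cfnresponse  # available to inline (ZipFile) Lambda code in CloudFormation

kms = boto3.client("kms")

def handler(event, context):
    try:
        if event["RequestType"] in ("Create", "Update"):
            alias = event["ResourceProperties"]["AliasName"]  # e.g. "alias/my-key"
            arn = kms.describe_key(KeyId=alias)["KeyMetadata"]["Arn"]
            cfnresponse.send(event, context, cfnresponse.SUCCESS, {"Arn": arn}, arn)
        else:
            # nothing to clean up for a read-only lookup
            cfnresponse.send(event, context, cfnresponse.SUCCESS, {})
    except Exception as exc:
        cfnresponse.send(event, context, cfnresponse.FAILED, {"Error": str(exc)})

The template would then read the looked-up value with Fn::GetAtt on the custom resource.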

Zach avatar

wow thats awful

loren avatar

or you can create the key in the stack, so you have access to its attributes

loren avatar

or just take the arn as a parameter and let the user figure out how to get it to you

Zach avatar

I guess option 3 is that I create the IAM role/policy in terraform and have the serverless config reference that by name to add to the lambda

loren avatar

speaking of cdk, this is why i think cloudformation templates should always themselves be an artifact generated by something else. the cdk is great for this

pjaudiomv avatar
pjaudiomv

Yea I suspect that’s most common reason, just limitations of hcl

jose.amengual avatar
jose.amengual

is it possible to use Service Discovery with ECS+EC2 setups ?

jose.amengual avatar
jose.amengual

the examples and docs point to fargate

jose.amengual avatar
jose.amengual

I’m trying to use AppMesh with ECS+EC2

Matt Gowie avatar
Matt Gowie

I’ve done it.

Matt Gowie avatar
Matt Gowie

Got some code — stand by.

Matt Gowie avatar
Matt Gowie

I setup Service Discovery as part of a talk I did on ECS. Here is the repo that went with the talk:

https://github.com/masterpointio/ecs-101-demo

masterpointio/ecs-101-demo

A small demo application for an ECS 101 talk I’m giving @ AWSMeetupGroup - masterpointio/ecs-101-demo

Matt Gowie avatar
Matt Gowie

Though this doesn’t include App Mesh, so maybe it won’t fit what you’re after. But that service / DB task that I wire up to Service Discovery is running on EC2 and not Fargate.

jose.amengual avatar
jose.amengual

Awesome. It is not clear in the docs if you can use your current ALB as the endpoint for App Mesh instead of service discovery, but I found one of the AWS training labs where they show it using an internal ALB instead of a service discovery endpoint

jose.amengual avatar
jose.amengual

As always very confusing

Matt Gowie avatar
Matt Gowie

Yeah… you gotta dig deep in the docs for the more complicated shit in AWS.

jose.amengual avatar
jose.amengual

I’m going to write a module

2020-08-07

2020-08-08

RB avatar

Anyone here set up Netflix’s Repokid or Aardvark? Would love to know your deployment, caveats, and ways to simplify getting it set up

jose.amengual avatar
jose.amengual

I’m very interested in this too

2020-08-09

2020-08-10

contact871 avatar
contact871

Can I track EFS costs per Access Point? In other words when I set an Access Point tag will I be able to see the EFS cost for this tag in Cost Explorer?

Phuc avatar

Only if you enable cost allocation on that specific tag; only after that can the cost be sorted out in Cost Explorer by tag

dalekurt avatar
dalekurt

Has anyone had issues cloning a git repo over SSH while connected to AWS VPN?

Eric Berg avatar
Eric Berg

Take a look at how your VPN is configured. Generally, you can set it to route only the traffic to the VPN-connected network or to route all traffic. If it’s all traffic, there may be some network rules in the VPN VPC that you connect to that keep SSH from egressing the network.

jason einon avatar
jason einon

hey, what error are you getting ?

2020-08-11

dalekurt avatar
dalekurt

@jason einon I will have to get the exact error, but what happens is that once I’m connected to VPN I’m unable to git clone or git push over SSH

jason einon avatar
jason einon

is this for any git repo? it’s very possible that the vpn connection does not have the correct port open for ssh (tcp 22, usually)

2020-08-12

RB avatar

anyone know of an ssm command line tool where you can specify the command and list of instance ids to run the command ?

Issif avatar

I don’t have this, but if you use ssm, you might find useful a tool I made last year: https://github.com/claranet/sshm

2
rajeshb avatar
rajeshb

i haven’t tried the ssm command line to filter instances, but i have created a doc and an association with tags and applied that association using tf:

resource "aws_ssm_association" "config-files-load" {
  depends_on = [aws_s3_bucket_object.monitoring-config-files-upload]
  name       = aws_ssm_document.shell-config-update-doc.name
  targets {
    key    = local.monitoring_identifation_tag_name
    values = [local.monitoring_identifation_tag_value]
  }
  association_name = "${aws_ssm_document.shell-config-update-doc.name}-association"
}

resource "null_resource" "example2" {
  depends_on = [aws_ssm_association.config-files-load]
  provisioner "local-exec" {
    command = "aws ssm start-associations-once --association-id ${aws_ssm_association.config-files-load.association_id}  --region ${local.region}"
  }
}
jose.amengual avatar
jose.amengual

I use sshm

1
jose.amengual avatar
jose.amengual

if you have ssh over ssm working you can use cssh

jose.amengual avatar
jose.amengual

and SSM RunCommand does not do this already?

sheldonh avatar
sheldonh

This is something Run Command can do via PowerShell or any other SDK/CLI. Do you need a specific example?
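
For reference, a hedged sketch of what that looks like with boto3 and SSM Run Command; the instance IDs and the command are placeholders:

import boto3

ssm = boto3.client("ssm")
instance_ids = ["i-0123456789abcdef0", "i-0fedcba9876543210"]  # placeholders

resp = ssm.send_command(
    InstanceIds=instance_ids,
    DocumentName="AWS-RunShellScript",
    Parameters={"commands": ["uptime"]},
)
command_id = resp["Command"]["CommandId"]

# results are per-instance; invocations can take a moment to register before polling
for instance_id in instance_ids:
    out = ssm.get_command_invocation(CommandId=command_id, InstanceId=instance_id)
    print(instance_id, out["Status"], out.get("StandardOutputContent", ""))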

Juan Soto avatar
Juan Soto

Looking like wafv2 doesn’t allow geoblocking for all the evil countries. Is the easiest way to fix this to apply geolocation routing in r53? Where would you route the bad traffic to? An S3 bucket that says “you are not allowed”? or what?

Issif avatar

who agrees that the new EC2 console is ugly and really inconvenient?

5
kskewes avatar
kskewes

Especially route53!!

vFondevilla avatar
vFondevilla

@Juan Soto 127.0.0.1

vFondevilla avatar
vFondevilla

If you send them to S3, it will cost you money. If you send them to 127.0.0.1 it will be free

1
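
A sketch of that Route 53 geolocation idea in boto3: answer queries from a blocked country with 127.0.0.1 and give everyone else the real target. The zone ID, record name, country code, and IP addresses are placeholders:

import boto3

r53 = boto3.client("route53")

r53.change_resource_record_sets(
    HostedZoneId="Z0HYPOTHETICAL",
    ChangeBatch={
        "Changes": [
            {   # queries from the blocked country get a dead-end answer
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": "app.example.com",
                    "Type": "A",
                    "SetIdentifier": "blocked-XX",
                    "GeoLocation": {"CountryCode": "XX"},
                    "TTL": 60,
                    "ResourceRecords": [{"Value": "127.0.0.1"}],
                },
            },
            {   # default geolocation record so everyone else still resolves
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": "app.example.com",
                    "Type": "A",
                    "SetIdentifier": "default",
                    "GeoLocation": {"CountryCode": "*"},
                    "TTL": 60,
                    "ResourceRecords": [{"Value": "203.0.113.10"}],
                },
            },
        ]
    },
)
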
Juan Soto avatar
Juan Soto

good idea, let me check it

2020-08-13

Suresh avatar

Hello guys,

quick AWS query: I have a use case of hitting a private hosted zone domain from a public API gateway, and the HTTP integration request of the API gateway is not happy with the private hosted zone domain name. Has anyone tried this before?

Issif avatar

Have you declared your VPC in the allowed VPCs for the domain?

Suresh avatar

Hey, thanks for the reply, sorted this out with the VPC Link.

1
RB avatar

can i use the same security group in different vpcs ? or do i need to recreate the security group ?

if i have to recreate the security group per vpc, is there a cool aws way to reuse the security group rules (already reusing them at the moment using tf but wondering if there is a better way)

RB avatar

optional thread

loren avatar

If the VPCs are peered, you can use peered vpc security group references. Otherwise, I don’t think there is a way to use the same security group, nor share the rules

loren avatar

Still probably not quite what you’re looking for though… https://docs.aws.amazon.com/vpc/latest/peering/vpc-peering-security-groups.html

Updating your security groups to reference peer VPC groups - Amazon Virtual Private Cloud

Update your security group rules to reference security groups in the peer VPC.

vFondevilla avatar
vFondevilla

This is not working for the Transit Gateway connected VPCs.

loren avatar

No, it does not, it’s specific to vpc peering (at the moment, anyway)

vFondevilla avatar
vFondevilla

This week I tried to reference one SG from another VPC and it didn’t work.

RB avatar

interesting. i should have mentioned the vpcs are in different regions
You cannot reference the security group of a peer VPC that’s in a different region. Instead, use the CIDR block of the peer VPC.

RB avatar

thanks for clearing that up folks!

RB avatar

i guess i’ll just stick with the module that duplicates the same sgs per vpc-region

1
jose.amengual avatar
jose.amengual

if you are in the same region and you use Shared VPCs ( which is newish) you can do that

2020-08-14

MrAtheist avatar
MrAtheist

Anyone have some insight into how to appropriately configure an idle timeout for an ALB? When a request comes in from a client, I have a Rails API, with no nginx involved, that goes to RDS, fetches whatever is needed and serves it back as a CSV (a very typical workflow, I assume…). Are there any downsides to just bumping the idle timeout to, say, 10x the default = 600s? Or should I really be looking at nginx or the like, or re-tweaking my app to make it more async?

I’m currently going thru this blog and hope someone can chime in on this topic.  https://sigopt.com/blog/the-case-of-the-mysterious-aws-elb-504-errors/

maarten avatar
maarten

I think it depends case by case. If it is a backend kind of application with JSON report generation, it doesn’t hurt much to do so. If it’s a critical consumer-facing application, then high idle timeouts make the application easily DoS-able; async is the much nicer option.

1

2020-08-15

Prasad avatar

Hello all, I just wanted to understand the 2 options and how they differ in terms of usage as i’m just not able to differentiate them

1)kms:ViaService

2)kms:GrantIsForAWSResource

Problem: My initial thought of a policy required for user to start ec2 instance which had a CMK key encrypted volume was that i needed to provide decrypt permission with a condition statement for the ec2 instance service so that it can call kms to get the plain text data key on to the memory.

"Action": [
    "kms:Decrypt",

"Condition": {
    "StringEquals": {
        "kms:ViaService": [
            "ec2.us-west-2.amazonaws.com",

The AWS documentation and a Google search show using kms:CreateGrant with kms:GrantIsForAWSResource = true to allow a user to start EC2 with a KMS CMK-encrypted volume

"Action": [
    "kms:CreateGrant"
],
"Resource": "*",
"Condition": {"Bool": {"kms:GrantIsForAWSResource": true}}

Juan Soto avatar
Juan Soto
Amazon S3 Path Deprecation Plan – The Rest of the Story | Amazon Web Services

Last week we made a fairly quiet (too quiet, in fact) announcement of our plan to slowly and carefully deprecate the path-based access model that is used to specify the address of an object in an S3 bucket. I spent some time talking to the S3 team in order to get a better understanding of […]

RB avatar

Sigh. They still don’t have a plan for bucket names that include dots
Bucket Names with Dots – It is important to note that bucket names with “.” characters are perfectly valid for website hosting and other use cases. However, there are some known issues with TLS and with SSL certificates. We are hard at work on a plan to support virtual-host requests to these buckets, and will share the details well ahead of September 30, 2020.

2020-08-16

2020-08-17

walicolc avatar
walicolc

Ello peoples, anyone faced an issue where their user_data script wasn’t executed on startup ?

walicolc avatar
walicolc

Via the console that is

walicolc avatar
walicolc

fixed, forgot to include #!/bin/bash

2
roth.andy avatar
roth.andy

LPT: Use #!/usr/bin/env bash. It is far more universally compatible

roth.andy avatar
roth.andy
What is the preferred Bash shebang?

Is there any Bash shebang objectively better than the others for most uses? #!/usr/bin/env bash #!/bin/bash #!/bin/sh #!/bin/sh - etc I vaguely recall a long time ago hearing that adding a dash t…

1
walicolc avatar
walicolc

Thanks Andrew!

Zach avatar

classic problem, I have that happen so frequently

2
RB avatar

Hi All. Anyone know of any tool that accepts multiple IP/CIDRs and creates a map of used and unused IP ranges ?

2020-08-18

Karoline Pauls avatar
Karoline Pauls

Are there any implications of which “direction” a peering connection goes within a single region and a single AWS account?

bradym avatar

I have several peering connections like what you describe and I’ve never come across any implications or issues related to the direction of a peering connection.

Satish avatar

Hello, we have EKS workloads running in separate AWS accounts for non-prod and prod environments. I’m thinking of creating a “SharedServices” AWS account and setting up ECR repositories that can be used by both non-prod and prod environments. Any downsides with this approach? Other recommendations?

Steven avatar

That is what I do. But I do 2 ECR. 1 in dev account for CI builds and 1 in shared account for candidates that have passed testing. Reduces risk of really bad code being able to get to most environments

1
Eric Berg avatar
Eric Berg

We have a single ECR for multiple environments. We grant access to all of the accounts for each repo.

2020-08-19

Karoline Pauls avatar
Karoline Pauls


AWS currently does not support unicast reverse path forwarding in VPC peering connections that checks the source IP of packets and routes reply packets back to the source.
https://docs.aws.amazon.com/vpc/latest/peering/peering-configurations-partial-access.html#peering-incorrect-response-routing This means that in a “star” peering configuration (multiple side VPCs to one central), side VPCs in practice simply cannot share subnet ranges, even though it is theoretically possible.

It could work if one picked non-overlapping subnets from each “side” VPC and a routing table was defined for that. But that’s impractical.

Am I right?

loren avatar

correct, cidrs are not allowed to overlap

Karoline Pauls avatar
Karoline Pauls

thanks

Karoline Pauls avatar
Karoline Pauls

though i edited it to clarify, because I think peering can be established, but routing will not work well

loren avatar

when i’ve tried it in the past, i received errors when the cidrs overlapped that did not allow the peering connection to be created

Karoline Pauls avatar
Karoline Pauls

even when they transitively overlapped? (i’m not trying to do that, just wondering)

loren avatar

yes

loren avatar

it seems it no longer errors when creating the peering connection, that’s interesting

walicolc avatar
walicolc

ello peoples, anyone know of any good resources on implementing ci/cd on aws with terraform. In particular best practices on managing the plan and apply commands in the build phase using codebuild and interacting with s3 state files?

rajeshb avatar
rajeshb

GOCD?

walicolc avatar
walicolc

i’d like to stick with aws products, so codebuild, and codedeploy

1
RB avatar

We use an office security group to allow ingress into our vpc. We’re approaching the 60 security group rule limit. What’s a good way to scale past this limit ?

RB avatar

looking at waf as an option but it’s expensive

RB avatar

we were thinking about perhaps splitting the security groups up but that kind of kicks the can down the road. once we’re at 60 ipcidrs for office ips, what do we do next ?

Steven avatar

You can do up to 5 security groups, move to WAF, or create your own solution

RB avatar

for now i asked aws to increase the limit from 60 to 100 per sg per vpc in 1 region and they complied which is nice. i guess in the future, we’ll have to consider a waf or get more creative with a solution

Steven avatar

Unless things have changed (I did this 2 years ago). There is a max of 300 rules. So, they would have increased the number of rules per group and reduced the number of groups you can have.

RB avatar

yep. the default is 60 rules per sg and 5 sgs per networking interface so 60*5 is 300. they have a hard limit of 1000 so either the 60 or 5 can increase but that 1000 cannot be exceeded.

https://aws.amazon.com/premiumsupport/knowledge-center/increase-security-group-rule-limit/

Steven avatar

That’s an improvement. Good to know.

rms1000watt avatar
rms1000watt

Does anyone use Aurora (postgres) and love their experiences with it? Considering migrating from RDS (postgres) to it. I know a few years ago there was reliability concerns, but not sure about in 2020 at scale.

jose.amengual avatar
jose.amengual

MMMM I guess the question is more like, Do you like Aurora in general?

jose.amengual avatar
jose.amengual

Aurora and aws have way better support for mysql than postgres

jose.amengual avatar
jose.amengual

aurora storage has some limitations, and with heavy write workloads you can easily kill a cluster by writing too fast

1
jose.amengual avatar
jose.amengual

if you do not have any of those problems and you do not need to tune up mysql or postgres aurora is great

rms1000watt avatar
rms1000watt

i think we’re not write heavy

jose.amengual avatar
jose.amengual

if you had Aurora you can check Performance Insights to answer that question lol

rms1000watt avatar
rms1000watt

hahaha nice. got me there

rms1000watt avatar
rms1000watt

basically i’m asking the super n00b question of is there a point to cut over to Aurora if RDS is working OK?

jose.amengual avatar
jose.amengual

I guess it depends on what you want, if you are looking for automatic failover, updates, elastic storage and HA aurora is nice

rms1000watt avatar
rms1000watt

and in your experience, in prod, has it been reliable for ya?

jose.amengual avatar
jose.amengual

yes, no issues, but again we did hit the underlying storage problem I described earlier because we write a huge amount of stuff

jose.amengual avatar
jose.amengual

apart from that issue it works just fine

1
rms1000watt avatar
rms1000watt

so was it like, a lot of writes, caused replication lag, and caused requests to return slowly?

rms1000watt avatar
rms1000watt

or like, data was lost?

jose.amengual avatar
jose.amengual

no, we basically found a way to fail over the cluster at will by doing specific operations

sheldonh avatar
sheldonh

Is there a full fledged project like a terraform module or something I can use to establish a home account for IAM users + define groups/roles to assume for all users across my accounts? I see a lot of pieces in github, but before i mess around, was wondering if anyone/or other project/ has a “best practice complete layout for home account user provisioning” so I can implement a pull request driven workflow for users provisioning.

Again, I’ve seen pieces, but a full fledged “best practice” layout or service is what I’m wanting to explore tomorrow

2020-08-20

vFondevilla avatar
vFondevilla

I had some issues with lockups in Aurora under stress, leaving the database a zombie. From the AWS perspective the database is alive, as their user (run locally for monitoring) is able to do stuff, but the cluster stops answering connections until we reboot it. This happened 2 times in 6 months, but apart from that it’s pretty smooth.

Darren Cunningham avatar
Darren Cunningham

sounds like the cluster isn’t CPU pegged; have you already validated that you’re not hitting a max connection limits?

Darren Cunningham avatar
Darren Cunningham

highly recommend AWS Support if you’re not paying for it. We have Business level in our Production account. We’ve had a few instances similar to this and they were able to dig into the logs/configuration and determine root cause for us. Worth every penny.

vFondevilla avatar
vFondevilla

In our case business support couldn’t find a root cause apart from us not being on the latest patch level. After enabling detailed logging (every request) the issue didn’t happen again.

Darren Cunningham avatar
Darren Cunningham

could try opening a new case and see if the next engineer has better luck

vFondevilla avatar
vFondevilla

Just to be completely sure I deployed a new database from an snapshot and nuked the database cluster

Darren Cunningham avatar
Darren Cunningham

if you can associate metrics to “stress moments” probably worth setting up an alert that the team can get notified that things may be going sideways in the future

vFondevilla avatar
vFondevilla

now we have an automated probe testing the connection every minute, so I’m pretty confident about that. It opens a mysql connection, queries a table with a SELECT … LIMIT 1, and if everything is ok it closes the mysql connection.

1
jose.amengual avatar
jose.amengual

mysql ? version ? serverless ? workflow is write heavy ?

jose.amengual avatar
jose.amengual

how many replicas?

vFondevilla avatar
vFondevilla

MySQL Aurora with MySQL 5.7, single master and one replica. Workflow is primarily read as it’s a drupal website. Sometimes (with cache expirations), every node will launch a pool of connections against the MySQL server (every node at the same time as the cache was located in the database), and in that moment, when receiving about 1500 new connections (the instance size it’s an r5.2xlarge with max_connections default at 3000 connections), the mysql became zombie. After the second time happening the same, we did a change in the Drupal cache expiration and it never happened again.

vFondevilla avatar
vFondevilla

Support was completely clueless and with the Drupal changes on the cache we couldn’t replicate the issue anymore.

jose.amengual avatar
jose.amengual

we had a weird writing pattern issue that will trigger a failover immediately

jose.amengual avatar
jose.amengual

I think in the end, read or write, the underlying storage can’t keep up and it kills the writer

jose.amengual avatar
jose.amengual

in your case you could leverage the RDS Proxy

jose.amengual avatar
jose.amengual

which is now GA

jose.amengual avatar
jose.amengual

and same as you Support did not have a clue

jose.amengual avatar
jose.amengual

and when I was at re:Invent I asked this question and they did not answer

vFondevilla avatar
vFondevilla

(Running Aurora MySQL, for more information)

Darren Cunningham avatar
Darren Cunningham

When using a Lambda to process SQS, are you always using batch size 1 or do you handle failures of messages individually? if the latter, how?

2020-08-21

RB avatar

for the people who are using https://github.com/Nike-Inc/gimme-aws-creds

Nike-Inc/gimme-aws-creds

A CLI that utilizes Okta IdP via SAML to acquire temporary AWS credentials - Nike-Inc/gimme-aws-creds

RB avatar

how do you use the same app to login to aws console ui ?

RB avatar

i know gimme-aws-creds will dump out the access key id and secret access key which can then be used to hit aws’ federated endpoint to create a session in aws console.

https://stackoverflow.com/questions/59952757/how-to-login-to-aws-console-using-access-key-secret-key-and-session-token

im wondering if there is an easier, more integrated way
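
For what it’s worth, the federation-endpoint flow from that Stack Overflow question boils down to roughly this (a sketch using only the standard library; it assumes the temporary credentials came from gimme-aws-creds or similar):

import json
import urllib.parse
import urllib.request

def console_login_url(access_key, secret_key, session_token):
    session = json.dumps({
        "sessionId": access_key,
        "sessionKey": secret_key,
        "sessionToken": session_token,
    })
    # the federation endpoint returns {"SigninToken": "..."}
    get_token = "https://signin.aws.amazon.com/federation?" + urllib.parse.urlencode(
        {"Action": "getSigninToken", "Session": session}
    )
    signin_token = json.loads(urllib.request.urlopen(get_token).read())["SigninToken"]
    # build the URL that opens a console session in the browser
    return "https://signin.aws.amazon.com/federation?" + urllib.parse.urlencode({
        "Action": "login",
        "Issuer": "gimme-aws-creds",
        "Destination": "https://console.aws.amazon.com/",
        "SigninToken": signin_token,
    })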

RB avatar

for example, using aws-vault, it has a login command that will open aws console for you.

# open a browser window and login to the AWS Console
$ aws-vault login jonsmith

but to use this tool with nike’s gimme-aws-creds, i’d have to do the following.

  1. get creds from gimme-aws-creds
  2. enter them into aws-vault which is a pain since these creds are temporary
  3. then run aws-vault login
jose.amengual avatar
jose.amengual

so you want to trigger a console login from cli ?

RB avatar

i suppose https://github.com/versent/saml2aws has a console arg so maybe that tool would be better

Versent/saml2aws

CLI tool which enables you to login and retrieve AWS temporary credentials using a SAML IDP - Versent/saml2aws

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

We use saml2aws

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

We previously used aws-okta which was literally a fork of aws-vault, but support has dropped

Zach avatar

yah we have okta federation. You log into the Console with an okta saml redirect. You log into the CLI with gimme-aws which authenticates you and plops creds into the shell env

RB avatar

@Erik Osterman (Cloud Posse) ah ok so you have okta saml setup with saml2aws so you can get keys and use the saml2aws console to quickly login to the aws console

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Yep and it works inside a container like geodesic

RB avatar

that makes a lot of sense. it’s too bad it doesn’t use oidc but i guess it doesn’t matter what kind of auth, as long as it works.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

We use it with okta and gsuite depending on the customer

1
RB avatar

@Zach you use gimme-aws to get aws creds and then you have an okta app that allows you login to aws console ? could you explain more about that ?

Zach avatar

it’s 2 different means of doing the same thing

Zach avatar

If you want to go to the AWS Console, you click the AWS ‘app’ in Okta (the plugin). You auth, and then you assume an IAM role that your okta group maps to or allows.

Zach avatar

At the CLI the gimme-aws-creds does basically the same thing, you tell it what account and role in AWS you want to assume

RB avatar

ah ok that makes sense. so i imagine the cli method and the okta app use the same client_id / client_secrets ?

Zach avatar

the AWS keys? doubt it

RB avatar

no the oidc creds

RB avatar

oh i see, so probably different oidc client id and client secret

Zach avatar

Hmmm not sure - gimme-aws is configured with an okta secret. The aws side is … complicated? the okta docs aren’t great but if you follow them to the letter it all works

1
RB avatar

i may have to ping you and others more about this. IT at our place still holds the whole sso thing close to their hearts so waiting on them before i can configure

Zach avatar

thats funny, our actual IT maintains the Azure AD and we just bypass all that with Okta for our engineering team

zeid.derhally avatar
zeid.derhally

I’m switching away from aws-okta and was wondering if anyone has thoughts on aws-okta-processor? I’ve used it and like it.

https://github.com/godaddy/aws-okta-processor

godaddy/aws-okta-processor

Okta credential processor for AWS CLI. Contribute to godaddy/aws-okta-processor development by creating an account on GitHub.

loren avatar

i like it a lot. one of the few that manage to handle the credential cache both for the sso session and for the aws sts session

loren avatar

project is well structured also, and the maintainers are responsive and accepting of prs

2020-08-22

2020-08-23

Igor avatar

Does anyone know of a way to set up AWS-VAULT so CloudTrail recognizes that the login is with MFA?

roth.andy avatar
roth.andy

https://github.com/99designs/aws-vault#roles-and-mfa

Add mfa_serial to the profile in $HOME/.aws/config

99designs/aws-vault

A vault for securely storing and accessing AWS credentials in development environments - 99designs/aws-vault

Igor avatar

I have the login working with MFA

Igor avatar

But CloudTrail says additionalEventData.MFAUsed

Igor avatar

As No

Igor avatar

Let me try with –no-session

Igor avatar

Still “No”

ismail yenigul avatar
ismail yenigul

@Igor what is your ~/.aws/config for that profile and What is the full error? additionalEventData.MFAUsed is not an error

Igor avatar

There is no error. That’s a property in the CloudTrail event log that states that MFA wasn’t used on login.

Igor avatar

config is as linked. I have role_arn and mfa_serial on the profile

ismail yenigul avatar
ismail yenigul

is this aws console login or aws-vault ? can you paste the full event log

ismail yenigul avatar
ismail yenigul

and did you enforce MFA in your assume role?

2020-08-24

Zach avatar

^ similar question but I’ll fork off for gimme-aws-creds if anyone knows how to make CloudTrail recognize that I have an MFA in a session

RB avatar
Netflix/dispatch

All of the ad-hoc things you’re doing to manage incidents today, done for you, and much more! - Netflix/dispatch

kskewes avatar
kskewes

Someone at ours had a look and said too early/raw and went to zapier instead.

RB avatar

interesting. someone here had the same thought cause it had bugs.

RB avatar

i guess we wait

Brij S avatar

I was wondering if there were any jmespath gurus here, I’ve got the following command

 aws s3api list-buckets --query "Buckets[?starts_with(Name, \`tf-app-npd\`)]|[?contains(Name, \`state\`)].Name"

This works just fine, however it returns the following list

[
    "tf-app-npd-kmstest-state",
    "tf-app-npd-pr1-state",
    "tf-app-npd-shared-state",
    "tf-app-npd-stage-state",
    "tf-app-npd-state"
]

I’d like to exclude buckets such as tf-app-npd-shared-state or tf-app-npd-state, but I’m stuck - any ideas?

bradym avatar

Something like this should work:

aws s3api list-buckets --query "Buckets[?starts_with(Name, \`tf-app-npd\`)]|[?contains(Name, \`state\`)]|[?!(contains(Name, \`tf-app-npd-shared-state\`))].Name"
bradym avatar

jmespath has an or operator so theoretically you could use that to exclude multiple buckets

Brij S avatar

ohh interesting! How would I use the or operator with

[?!(contains(Name, \`tf-app-npd-shared-state\`))]
Brij S avatar

to exclude tf-app-npd-state

bradym avatar

I’m not entirely sure

bradym avatar

Looks like this works:

[?!(contains(Name, \`tf-app-npd-shared-state\`) || (contains(Name, \`tf-app-npd-state\`)))]
Brij S avatar

works like a charm! Thank you!

bradym avatar

happy to help

Brij S avatar

you wouldn’t happen to be a pro with sed would you?

bradym avatar

Not sure I’d call myself a pro, but I use it quite often. Happy to look at whatever you’re trying to do.

Brij S avatar

the result of the command above has the following output

[
    "tf-app-npd-kmstest-state",
    "tf-app-npd-pr16-state",
    "tf-app-npd-stage-state"
]

essentially, I’d like to ‘remove’ the tf-app-npd- and -state parts. So for the first bucket I’d be left with kmstest

bradym avatar

sed 's/tf-app-npd-\|-state//g'

Brij S avatar

hmm, I piped that to the end of the awscli command but the result is the same

| sed 's/tf-app-npd-\|-state//g'
bradym avatar

just to confirm, the pipe to sed is outside the "" on the aws command right?

Brij S avatar

yup, the whole command is

aws s3api list-buckets --output text --query "Buckets[?starts_with(Name, \`tf-app-npd\`)]|[?contains(Name, \`state\`)]|[?!(contains(Name, \`tf-app-npd-shared-state\`) || (contains(Name, \`tf-app-npd-state\`)))].[Name]" | sed 's/tf-app-npd-\|-state//g'
bradym avatar

Odd… it worked for me when I copied your output

$ OUT='[                        
    "tf-app-npd-kmstest-state",
    "tf-app-npd-pr16-state",
    "tf-app-npd-stage-state"
]'

$ echo $OUT
[ "tf-app-npd-kmstest-state", "tf-app-npd-pr16-state", "tf-app-npd-stage-state" ]

$ echo $OUT | sed 's/tf-app-npd-\|-state//g'
[ "kmstest", "pr16", "stage" ]
Brij S avatar

hmm, are you using a macbook? I know sed on macos is slightly different

bradym avatar

nope

bradym avatar

ubuntu 18

bradym avatar

and the command without sed still works right?

Brij S avatar

yes

bradym avatar

Experimenting with my own buckets, looks like that sed or isn’t working right.

Brij S avatar

sed or isnt?

bradym avatar

nevermind, I had a typo

bradym avatar

what’s your sed --version?

Brij S avatar
How can I check the version of sed in OS X?

I know if sed is GNU version, version check can be done like $ sed –version But this doesn’t work in OS X. How can I do that?

Brij S avatar

got it!

Brij S avatar

so, on macos you need to run brew install gnu-sed

Brij S avatar

that gives me the version of sed that you’re probably using

bradym avatar

or close enough to it

Brij S avatar

yep! thanks again! Appreciate the help

bradym avatar

np

Brij S avatar

have you ever looped through a json list with bash

bradym avatar

You’re gonna want jq for that - https://stedolan.github.io/jq/

Brij S avatar

Hey @bradym - you around for a quick bash/awscli question?

bradym avatar

Sure, what’s up?

Brij S avatar

:slightly_smiling_face: I had an old awscli command I used to delete a versioned object in an S3 bucket like this

aws s3api delete-objects --bucket ${REMOTE_STATE_BUCKET} --delete "$(aws s3api list-object-versions --bucket ${REMOTE_STATE_BUCKET} --query='{Objects:Versions[].{Key:Key,VersionId:VersionId}}')"

This worked fine, however we decided to store more in this bucket so I wanted to delete only objects with a certain key, I ended up with this

aws s3api delete-objects --bucket ${REMOTE_STATE_BUCKET} --delete "$(aws s3api list-object-versions --bucket ${REMOTE_STATE_BUCKET} --output=json --query="Versions[?starts_with(Key,\`${STAGE}\`)].{Key:Key,VersionId:VersionId}")"

but now I get the following error

Error parsing parameter '--delete': Invalid JSON:
[
    {
        "Key": "stage/terraform.tfstate",
        "VersionId": ".oKrS6dg8TJGGjaDGeAvF7RryDqok.wy"
    }
]
Brij S avatar

any idea what its complaining about

bradym avatar

Take a look at aws s3api list-object-versions help – there’s an example of what the JSON syntax for that command should be, and it looks like yours is not formatted quite right

msharma24 avatar
msharma24

Hello - I would like to keep 100s of GBs of files in sync between two cross-account, same-region S3 buckets, with the ability to delete files from the destination bucket when I delete or replace files in the source bucket. The S3 replication feature does not solve this, as S3 does not replicate deletes, and aws s3 sync also won’t help here since it would not delete the files from the remote bucket?

Do I need to build some kind of manifest to keep a log of the files, which would dictate what files remain in sync?

roth.andy avatar
roth.andy

Check out rclone

msharma24 avatar
msharma24

Thanks

Steven avatar

aws s3 sync will delete. You just need to add the --delete option. But for speed, consider using s3 replication for the copy. s3 replication can also do deletes. https://aws.amazon.com/blogs/storage/managing-delete-marker-replication-in-amazon-s3/

Managing delete marker replication in Amazon S3 | Amazon Web Services

Customers use Amazon S3 Replication to create a copy of their data within the same AWS Region or in another AWS Region for compliance, lower latency, or sharing data across accounts. In environments where data is constantly changing, customers have different replication needs for objects that have been, or will be, deleted. For some use cases, […]

2020-08-25

walicolc avatar
walicolc

Looks like RDS is down for those using AWS Europe - London Region https://downdetector.co.uk/status/aws-amazon-web-services/

Amazon Web Services down? Realtime overview of AWS status, issues and outages

Real-time overview of issues with Amazon Web Services. Is your service not functioning properly? Here you learn whats is going on.

walicolc avatar
walicolc

AWS have yet to report on it , status checks still indicate all green

walicolc avatar
walicolc

AWS have now reported on this

Karoline Pauls avatar
Karoline Pauls

AWS VPC DNS resolution is so dumb.

Do they seriously think it is OK to implicitly resolve to private addresses when peering/transit gateway is set up? What if it’s set up badly?

RB avatar

what’s a good minimum_protocol_version to set a cloudfront distribution to if it has an acm cert for a static s3 site. I’m currently using TLSv1.1_2016 but I think I should go to TLSv1.2_2019

https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/cloudfront_distribution#minimum_protocol_version

Thoughts?

RB avatar


When you create a new distribution using a custom SSL certificate, TLSv1.2_2019 will be the default policy option selected. You may use the AWS Management Console, Amazon CloudFront APIs, or AWS CloudFormation to update your existing distribution configuration to use this new security policy.

2
loren avatar

nice, i’ve wondered the same thing several times

RB avatar

anyone got oidc working with atlantis on an aws alb ? we’re using okta with the following settings

    issuer                              = "https://company.okta.com/"
    token_endpoint                      = "https://company.okta.com/oauth2/default/v1/token"
    user_info_endpoint                  = "https://company.okta.com/oauth2/default/v1/userinfo"
    authorization_endpoint              = "https://company.okta.com/oauth2/default/v1/authorize"
    authentication_request_extra_params = {}

is this correct ? we created a Web integration with OpenID Connect to get a client_id and client_secret

2020-08-26

mfridh avatar

Do you know? Having an imported certificate in ACM, assigned to some ALB listeners - when updating said imported certificate by uploading a new one to ACM - are the load balancer listeners all supposed to propagate to use that new certificate?

mfridh avatar

Ok… Then we have something odd going on…

RB avatar

id create a support ticket on this just in case

RB avatar

you could check and create a new load balancer with the same acm cert and see if that works as expected

RB avatar

but the new acm cert should propagate to all load balancers /listeners that already use it.

mfridh avatar

things are fine, nothing to see here. Someone was trolling with adding an additional listener certificate.

1
walicolc avatar
walicolc

Is there a way of using logical OR in IAMs instead of implementing it by writing separate blocks ?

walicolc avatar
walicolc

Solved.

RB avatar

whats the solution ?

walicolc avatar
walicolc

AWS treats this as OR

"Condition": {
         "StringEquals": {
           "aws:sourceVpc": ["vpc-111bbccc", "vpc-111bbddd"]
         }
       }
1
walicolc avatar
walicolc

Stupid bc it’s not obvious at first but hey if it works it works

walicolc avatar
walicolc

Also can use ForAnyValue

RB avatar

cool, i did not know you can do that

1
loren avatar

anyone have experience using aws session manager with a .ssh/config, such that a git-over-ssh connection would utilize session manager? we have gitlab running in a private subnet, and would like to support an ssh remote without opening ssh via an ELB in a public subnet…

loren avatar

i’m guessing something like this, just based on some googling…

Host bastion
  ProxyCommand sh -c "aws ssm start-session --target <bastion-instance-id> --document-name AWS-StartSSHSession --parameters 'portNumber=%p' --region <region>"

Host <gitlab.remote>
  IdentityFile  <my key>
  User git
  ProxyCommand ssh -W %h:%p  ec2-user@bastion
loren avatar

or perhaps with ProxyJump?

Host bastion
  User ec2-user
  ProxyCommand sh -c "aws ssm start-session --target <bastion-instance-id> --document-name AWS-StartSSHSession --parameters 'portNumber=%p' --region <region>"

Host <gitlab.remote>
  IdentityFile  <my key>
  User git
  ProxyJump bastion
jose.amengual avatar
jose.amengual

like this :

host i-* mi-*
    ProxyCommand sh -c "aws ssm start-session --target %h --document-name AWS-StartSSHSession --parameters 'portNumber=%p' --profile production"
jose.amengual avatar
jose.amengual

that works for me

loren avatar

roger, but that works where the host in the ssh command is the instance. in this case, the host is the gitlab remote (and the user does not have ssm:StartSshSession permissions to the gitlab host. but they do have a user in gitlab and an ssh key loaded in their gitlab profile)

maarten avatar
maarten
Flaconi/terraform-aws-bastion-ssm-iam

AWS Bastion server which can reside in the private subnet utilizing Systems Manager Sessions - Flaconi/terraform-aws-bastion-ssm-iam

1
jose.amengual avatar
jose.amengual

Mmm but ssm does not require ssh keys to connect; in that case you might want to check EC2 Instance Connect

2020-08-27

walicolc avatar
walicolc

Does one know how to get codebuild to git clone my codecommit repo instead of zipping it up? I’m unable to execute git commands bc it isn’t a git repo

loren avatar

Is codepipeline involved, by any chance?

walicolc avatar
walicolc

yes I’m using codepipeline

loren avatar

Then no, I was never able to work this out. It’s just what codepipeline does… Takes your source repo, exports it to a zip hosted in s3, then codebuild retrieves the zip. I stopped using codepipeline as a result

loren avatar

If you create a codebuild job with the source as your codecommit repo, and trigger the job directly, without codepipeline, then it works as you expect

walicolc avatar
walicolc

Thank you Loren!

Adrian avatar

With git as a workaround (I don’t remember why it’s like this):

      - git init
      - git remote add origin https://${GITHUB_TOKEN}@github.com/owner/repo
      - git fetch --tags
      - git reset --hard origin/master
walicolc avatar
walicolc

would this work for codecommit as well

Adrian avatar

I don’t use CodeCommit

walicolc avatar
walicolc

OK - I’ll see if we can transition to anything but codecommit, there’s another functionality that doesn’t seem to work on codecommit which I spotted earlier. For now that medium blog will do. Thank you!

loren avatar

yeah, if you clone the repo as a codebuild step, then the “source” step of codepipeline is rather pointless

walicolc avatar
walicolc

It works for us for now - so it’s OK. I’ll end up moving to github at a later stage.

tomv avatar

Is it just me or is the EMR spot market in us-west-2 for the past week.. non-existent? we’re having capacity trouble for all sorts of instance types

RB avatar

anyone done a cost benefit analysis of migrating ECS to EKS ?

RB avatar

thread

sheldonh avatar
sheldonh

Ok…. I’m done with AWS SSM as my long-term plan. Too slow to iterate and lots of edge bugs for my use.

I want to bring a company wide consistency to config tooling, no more Choco only for windows and Linux left out to dry :-)

Best in class for cross platform and ease of maintenance I’m leaning towards is AWS opsworks puppet enterprise. While we have some ansible already I want state to be checked + run through ssm when possible. Folks here don’t use Ruby but lots have dabbled in python

The key requirement is simplify runs when possible by using AWS ssm associations and running through that. Winrm seems problematic in comparison for 200+ instances.

Puppet?

2020-08-28

drexler avatar
drexler

Hi anyone encountered this issue before: UnsupportedAvailabilityZoneException: Cannot create cluster 'eks-cluster-platform' because us-east-1e, the targeted availability zone, does not currently have sufficient capacity to support the cluster ??

pjaudiomv avatar
pjaudiomv

I have, in my case I just tried again later and it worked

1
RB avatar

is there an easy way to see when a new ecs service is being deployed ? if there is an event, i’d like to be able to hit up a slack channel so we can keep track of production deployments

Igor avatar

I asked this before.. I was told to use lambda… so looks like no events out-of-the-box

2
pjaudiomv avatar
pjaudiomv

I have used cloudwatch events and lambdas for that in the past

Igor avatar
bitflight-public/terraform-aws-ecs-events

Add on an SNS topic for capturing ECS events. Contribute to bitflight-public/terraform-aws-ecs-events development by creating an account on GitHub.

Igor avatar
Tutorial: Listening for Amazon ECS CloudWatch Events - Amazon Elastic Container Service

In this tutorial, you set up a simple AWS Lambda function that listens for Amazon ECS task events and writes them out to a CloudWatch Logs log stream.

pjaudiomv avatar
pjaudiomv

nooice

sheldonh avatar
sheldonh

Help. Just need to know how to get past this failed helm release. Brand new to this and using a docker release library for gitpod.

Error: cannot re-use a name that is still in use

  on modules/gitpod/main.tf line 9, in resource "helm_release" "gitpod":
   9: resource "helm_release" "gitpod" {

I have no idea how to get it removed or whatever as I don’t see anything successful yet in AWS EKS

zidan avatar

#aws 6 tips that I apply to optimize our cost in AWS, check them out and let me know how many of them you apply: https://www.dailytask.co/task/6-tips-that-you-should-think-about-them-to-optimize-your-costs-in-aws-ahmed-zidan

6 tips that you should think about them to optimize your costs in AWS.

6 tips that you should think about them to optimize your costs in AWS. written by Ahmed Zidan

1

2020-08-31

RB avatar

what’s a good way for the container to know if it has been deployed in fargate or ecs ?

RB avatar

i know about the ECS_CONTAINER_METADATA_URI which is handy but this env variable is set in both ecs ec2 and fargate

RB avatar
Task Metadata Endpoint version 3 - Amazon Elastic Container Service

Beginning with version 1.21.0 of the Amazon ECS container agent, the agent injects an environment variable called ECS_CONTAINER_METADATA_URI into each container in a task. When you query the task metadata version 3 endpoint, various task metadata and

jose.amengual avatar
jose.amengual

we subscribe to a sns topic that then notify on slack

RB avatar

thats a good way for the humans to know

jose.amengual avatar
jose.amengual

with some lambda and that aws bot thingy

RB avatar

what’s a good way for the container itself to know

jose.amengual avatar
jose.amengual

ahhhh

jose.amengual avatar
jose.amengual

we use local healthcheck as like a readiness test

jose.amengual avatar
jose.amengual

to curl itself basically

RB avatar

that’s a good way to get the current health of the container

RB avatar

but what’s a good way to determine, for the container, if it’s an ecs ec2 task vs an ecs fargate task ?

RB avatar

the only thing i can think of is if it hits the task metadata, gets the task arn, describes the arn, and determined the launch type from that.
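
That approach might look roughly like this (a sketch; it assumes the task role allows ecs:DescribeTasks and that the v3 metadata endpoint exposes Cluster and TaskARN, per the task metadata docs linked earlier in the thread):

import json
import os
import urllib.request

import boto3

# read this task's identity from the v3 metadata endpoint
meta_uri = os.environ["ECS_CONTAINER_METADATA_URI"]
task_meta = json.loads(urllib.request.urlopen(meta_uri + "/task").read())

ecs = boto3.client("ecs")  # task role needs ecs:DescribeTasks
task = ecs.describe_tasks(
    cluster=task_meta["Cluster"],
    tasks=[task_meta["TaskARN"]],
)["tasks"][0]

print(task.get("launchType"))  # "FARGATE" or "EC2"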

jose.amengual avatar
jose.amengual

OMG I had to read 3 time you question to realize it was something else

jose.amengual avatar
jose.amengual

can you just add a ENV variable that the deployment set?

RB avatar

the other way we’re thinking is that the ip address looks different betw ecs and ec2, where for ecs ec2 the ip is an ip in our vpc, whereas the fargate ip is the docker bridge ip that is not in our vpc

RB avatar

ya we could probably add an env variable to all the tasks. that would be one solution.

RB avatar

i was hoping for something more dynamic

jose.amengual avatar
jose.amengual

is there a metadata endpoint that you can curl and set a ENV variable?

jose.amengual avatar
jose.amengual

you could use the same local healthcheck to actually set it

RB avatar

we’d have to set the env variable in the task definition and it would be a lot to update all of our tds.

the metadata endpoint is something we can curl from the container

RB avatar

can’t believe amazon doesn’t deliver the launch type in the metadata

RB avatar

a label may be a better option than an env variable

RB avatar

since the labels can be queried from the /tasks endpoint

RB avatar

@jose.amengual check this out

https://github.com/mackerelio/mackerel-container-agent/blob/c70de86ba1256fb0bfadba0a98237bf91a75b5db/platform/ecs/ecs.go#L21

basically we can query off of the env variable AWS_EXECUTION_ENV which can either be AWS_ECS_FARGATE or AWS_ECS_EC2
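
So the check can stay entirely local to the container, something like this minimal sketch:

import os

def ecs_launch_type():
    # AWS_EXECUTION_ENV is set by the platform; values observed are
    # "AWS_ECS_FARGATE" and "AWS_ECS_EC2" (treat anything else as unknown)
    env = os.environ.get("AWS_EXECUTION_ENV", "")
    if env == "AWS_ECS_FARGATE":
        return "FARGATE"
    if env == "AWS_ECS_EC2":
        return "EC2"
    return "UNKNOWN"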

RB avatar
Is it possible to detect Fargate without trying the metadata API

Is there a possibility for an application that is launched as Fargate task to determine if it runs inside Amazon ECS without trying the task metadata endpoint? It would be great if there are envir…

jose.amengual avatar
jose.amengual

ohhhh cool

Zach avatar

I’m curious what the use-case is for this

2
jose.amengual avatar
jose.amengual

imagine you have tasks on fargate that you are migrating to ecs+ec2 and they set certain cpu and memory attributes based on available memory; you could use this to set different defaults so the tasks can run without issues

Zach avatar

So you’d allow the container, once already launched, to modify its own task definition?

RB avatar

We use it for a fatlib that gets the ddagent hostname ip

jose.amengual avatar
jose.amengual

ddagent as datadog agent?

RB avatar

yessir

jose.amengual avatar
jose.amengual

for APM tracing you need that?

RB avatar

yessir

jose.amengual avatar
jose.amengual

but I thought running as a daemon you could use the hostname?

RB avatar

how so ?

RB avatar

what host name ?

RB avatar

for ecs, we’ve been using the hostname as the ip of the ec2 via the metadata url, for fargate, we’ve been doing something similar

RB avatar

is it best to use a different string ?

jose.amengual avatar
jose.amengual

is this for this bit of the dd docs :

Assign the private IP address for each underlying instance your containers are running on in your application container to the DD_AGENT_HOST environment variable. This allows your application traces to be shipped to the Agent. The Amazon's EC2 metadata endpoint allows discovery of the private IP address. To get the private IP address for each host, curl the following URL:

curl http://169.254.169.254/latest/meta-data/local-ipv4

and set the result as your Trace Agent Hostname environment variable for each application container shipping to APM:

os.environ['DD_AGENT_HOST'] = <EC2_PRIVATE_IP>

In cases where variables on your ECS application are set at launch time, you must set the hostname as an environment variable with DD_AGENT_HOST. Otherwise, you can set the hostname in your application's source code for Python, Javascript, or Ruby. For Java and .NET you can set the hostname in the ECS task. For example:
RB avatar

yep so thats how we do it for ecs ec2
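
That ECS-on-EC2 pattern, sketched in Python along the lines of the Datadog docs quoted above (timeout and error handling kept minimal):

import os
import urllib.request

# resolve the host's private IP from the EC2 instance metadata endpoint
private_ip = urllib.request.urlopen(
    "http://169.254.169.254/latest/meta-data/local-ipv4", timeout=2
).read().decode()

# point the Datadog tracer at the agent running on the host
os.environ["DD_AGENT_HOST"] = private_ip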

RB avatar

how do you do it for fargate

jose.amengual avatar
jose.amengual

we do not use fargate for the stuff I manage

jose.amengual avatar
jose.amengual

but since it is in daemon mode the hostname of the datadog task can be passed as a parameter to the collector

jose.amengual avatar
jose.amengual

like :

jose.amengual avatar
jose.amengual

dd.agent.host

jose.amengual avatar
jose.amengual

which will be the hostname of the container running in daemon mode

jose.amengual avatar
jose.amengual

by default is datadog

jose.amengual avatar
jose.amengual

I think

RB avatar

thats for java tho. we use python. would it be the same config

jose.amengual avatar
jose.amengual

same

Yoni Leitersdorf (Indeni Cloudrail) avatar
Yoni Leitersdorf (Indeni Cloudrail)

Something I’d like to verify: The public/private of an Aurora cluster is dependent on the public flag on the instances within the cluster (and of course, routing to the IGW).

Is that correct?

I’m looking here.

Cameron Boulton avatar
Cameron Boulton

Yea, mostly. That config controls whether the instance has a publicly routable Internet IP address or a private one from your subnet(s) space, if that makes sense.

Cameron Boulton avatar
Cameron Boulton

As you say, that has no bearing on whether the path is routable (route tables/IGW and/or network ACL) or firewalled (security group).

Yoni Leitersdorf (Indeni Cloudrail) avatar
Yoni Leitersdorf (Indeni Cloudrail)

So the cluster “itself” doesn’t have its own public/private setting right?

Cameron Boulton avatar
Cameron Boulton

Yea, I think that’s right; seems like it’s implied by the instance level setting.

Yoni Leitersdorf (Indeni Cloudrail) avatar
Yoni Leitersdorf (Indeni Cloudrail)

OK thanks

Cameron Boulton avatar
Cameron Boulton

I’ve never tried adding one instance public and one instance private; not sure what would happen there. Probably an error.

Yoni Leitersdorf (Indeni Cloudrail) avatar
Yoni Leitersdorf (Indeni Cloudrail)

I just tried that. not an error. The public/private ended up being set by the first instance in the cluster.

1
Cameron Boulton avatar
Cameron Boulton

Wonder if the second (private) would be accessible by the cluster’s reader endpoint then (assuming it resolves to public IPs).

Yoni Leitersdorf (Indeni Cloudrail) avatar
Yoni Leitersdorf (Indeni Cloudrail)

Happy to give you my TF code to check for yourself

Cameron Boulton avatar
Cameron Boulton

All good; just wondering out loud
