#aws (2020-08)

aws Discussion related to Amazon Web Services (AWS)

Archive: https://archive.sweetops.com/aws/

2020-08-03

RB avatar

Just released 2 container modules (datadog and fluentbit) on the tf registry to make fargate and datadog integration easier.

2020-08-04

contact871 avatar
contact871

Is it possible to use "PropagateTags": "TASK_DEFINITION" when triggering an ECS task with CloudWatch Event rule?

contact871 avatar
contact871
[Fargate, ECS] [Tagging]: Support tagging when starting a task from CWE · Issue #89 · aws/containers-roadmap

Tell us about your request Support for tagging a task started through CloudWatch Events. Which service(s) is this request for? Fargate, ECS Tell us about the problem you're trying to solve. Wha…

Luis avatar

Hi! About https://github.com/cloudposse/terraform-aws-eks-cluster/ and https://github.com/cloudposse/terraform-aws-eks-node-group. I am currently testing the bugfix implemented in 0.22.0 : https://github.com/cloudposse/terraform-aws-eks-cluster/releases/tag/0.22.0 In the example, https://github.com/cloudposse/terraform-aws-eks-cluster/blob/master/examples/complete/main.tf

data "null_data_source" "wait_for_cluster_and_kubernetes_configmap" { module "eks_node_group" { cluster_name = data.null_data_source.wait_for_cluster_and_kubernetes_configmap.outputs["cluster_name"]

I have this in my “main.tf”, but when I apply Terraform I get the following error: Error: Cycle: module.eks_cluster.kubernetes_config_map.aws_auth, module.eks_node_group.module.label.output.tags, module.eks_node_group.aws_iam_role.default, module.eks_node_group.output.eks_node_group_role_arn, module.eks_cluster.var.workers_role_arns, module.eks_cluster.local.map_worker_roles, module.eks_cluster.kubernetes_config_map.aws_auth_ignore_changes, module.eks_cluster.output.kubernetes_config_map_id, data.null_data_source.wait_for_cluster_and_kubernetes_configmap, module.eks_node_group.var.cluster_name, module.eks_node_group.local.tags, module.eks_node_group.module.label.var.tags, module.eks_node_group.module.label.local.tags

Has this been tested like in the example? Thanks!

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Every pull request runs automated tests using terratest against examples/complete

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

tests are in test/src

Pedro Henriques avatar
Pedro Henriques

Hello everyone Do you mind taking a look into this PR please? https://github.com/cloudposse/terraform-aws-elasticsearch/pull/63

Dynamic cognito_options inner block to avoid permission problems in AWS China by brdasilva · Pull Request #63 · cloudposse/terraform-aws-elasticsearch

what Amazon Cognito authentication for Kibana is not supported on AWS China. Therefore we need to have a way to avoid setting the cognito options inner block on the aws_elasticsearch_domain terrafo…

1
jose.amengual avatar
jose.amengual
Release 0.19.0: Transformed cognito_options inner block into a dynamic block to avoid… · cloudposse/terraform-aws-elasticsearch

what Amazon Cognito authentication for Kibana is not supported on AWS China. Therefore we need to have a way to avoid setting the cognito options inner block on the aws_elasticsearch_domain terraf…

2020-08-05

Milosb avatar

Hi all, Do you know if I can share Transit Gateway between regions with RAM in same account?

Alan Kis avatar
Alan Kis

You can’t share the resource, at least not this particular one between regions. You can share it between different accounts, but for building a cross-region network using Transit Gateway, you need to create peering connections between Transit Gateways in different AWS regions.
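
For reference, a minimal boto3 sketch of the peering approach Alan describes; the transit gateway IDs, account ID, and regions are placeholders, not values from the thread:

import boto3

# requester side: create the peering attachment from the local TGW to the peer TGW
ec2_us = boto3.client("ec2", region_name="us-east-1")
attachment = ec2_us.create_transit_gateway_peering_attachment(
    TransitGatewayId="tgw-0123456789abcdef0",      # local TGW (placeholder)
    PeerTransitGatewayId="tgw-0fedcba9876543210",  # TGW in the other region (placeholder)
    PeerAccountId="123456789012",                  # same account in this scenario
    PeerRegion="eu-west-1",
)["TransitGatewayPeeringAttachment"]

# accepter side: once the attachment reaches pendingAcceptance, accept it in the
# peer region; routes pointing at the attachment then go in both TGW route tables
ec2_eu = boto3.client("ec2", region_name="eu-west-1")
ec2_eu.accept_transit_gateway_peering_attachment(
    TransitGatewayPeeringAttachmentId=attachment["TransitGatewayPeeringAttachmentId"]
)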

Milosb avatar

i saw that peering attachment option

Milosb avatar

but that means i need transit gateway in each region

Alan Kis avatar
Alan Kis

Exactly. Or you can use other ways to create connection over regions, like using IPSec tunnels

Milosb avatar

I have vpc peering, but i wanted to use something to rule them all

Milosb avatar

thanks

raghu avatar

You should do tgw peering across region

Milosb avatar

should is hard word I wanted to avoid that

1
Milosb avatar

if i see it right there will be at least one additional tg-attachment in that case ( more if you want to connect more regions ) edit: actually its x2

raghu avatar

Without peering, i dont think you can connect cross region

Zach avatar

I was looking at the ASG max instance lifetime setting … the units are seconds but it has a minimum value of 604800

2020-08-06

Prasad avatar

hello all, the Application Load Balancer documentation says SSL termination happens at the LB level… if we configure HTTPS listeners for the target… how does the traffic flow from the ALB to the target servers? Is it not encrypted again from the ALB to the target servers?

Zach avatar

Yes it re-encrypts, although they don’t do any cert validation on the backend, its fire & forget

pjaudiomv avatar
pjaudiomv

Anybody play with the python CDK

Jaeson avatar

Just played with the TF CDK for Python yesterday. My experience was pretty awful. What is installed and used for python CDK is actually a skeleton that converts javascript to python and runs it. It was pretty slow, and difficult to find what I was looking for. I use tfswitch to manage the TF version in a container, which requires the TF version to be pinned, but couldn’t figure out how to pin the TF version with the CDK. So my experience was a short but painful one.

I’ve used CDK for AWS as well, and the experience was better, though I ran into CFN limitations, which is one of the reasons why TF interests me.

So, from my perspective, CDK is still a ways out from being very useful.

pjaudiomv avatar
pjaudiomv

Thank you I shall wait to even go as far as you did then

1
Steen avatar

Have been using the CDK (sans TF) with Python for a couple of weeks in a real-world scenario. I really like the programmatic feel of the setup, although having Python convert to Typescript behind the scenes ruins the usual developer toolchain for me (i.e. especially not being able to simply throw pdb into the mix); I could of course just go with Typescript but, aaaah. The documentation is autogenerated and sucks big time. They have taken idioms from other languages and pulled them down over Python and the result is not very pythonic. But for me, it sure beats HCL, which is an abhorrent nightmare of the NIH syndrome backed by naïve enthusiasm and silicon capital. In the words of L Peter Deutsch: “Every now and then I feel a temptation to design a programming language but then I just lie down until it goes away.”. Once you accept the tie-in to AWS and their stack concept, accept that CDK uses Python in name only and that there is no real state tracking, CDK feels welcoming especially for people with developer backgrounds (caveat: that’s me)

pjaudiomv avatar
pjaudiomv

Or the CDK in general

pjaudiomv avatar
pjaudiomv

I’m interested to see what the terraform CDK adoption is gonna be like

pjaudiomv avatar
pjaudiomv

Why would one use the terraform CDK over aws one if only using aws

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

I’ll bring this up in #office-hours today

1
Prasad avatar

@pjaudiomv maybe we would want to migrate to a different cloud down the line. we never know :)

1
RB avatar

i think it’s to write the terraform code programmatically without having to write terraform manually

2
RB avatar

if i understand it correctly

  1. write cdk in coding language of your choice like python (similar to pulumi)
  2. run cdk to generate terraform
  3. terraform apply
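
A minimal sketch of that workflow with the Terraform CDK’s Python bindings, assuming a project initialized with cdktf init --template=python; resource classes would come from provider bindings generated by cdktf get, which are not shown here:

#!/usr/bin/env python
from constructs import Construct
from cdktf import App, TerraformStack


class DemoStack(TerraformStack):
    def __init__(self, scope: Construct, ns: str):
        super().__init__(scope, ns)
        # step 1: declare providers/resources here using the generated binding classes


app = App()
DemoStack(app, "demo")
app.synth()  # step 2: emits Terraform JSON under cdktf.out/, ready for `terraform apply` (step 3)
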
loren avatar

i’ve also had use cases where i needed to generate/template the terraform hcl, to workaround some limitation of terraform. that particular use case was addressed by for_each, but i expect other similar cases where generating the hcl from a more expressive language has advantages

1
loren avatar

maybe also as a different abstraction layer for vars/inputs, a wrapper that takes inputs in your form of choice, and writes the values into the hcl. something of a workaround for the annoying decision to warn (and maybe error) when a tfvars file has undeclared vars
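
One lightweight version of that wrapper idea, sketched in Python: gather inputs however you like and emit terraform.tfvars.json, which Terraform loads automatically. The keys below are hypothetical and should match declared variables to avoid the undeclared-variable warning:

import json

# hypothetical inputs, gathered from YAML, a CLI prompt, an API, etc.
inputs = {
    "environment": "dev",
    "instance_count": 2,
    "tags": {"team": "platform"},
}

# Terraform automatically reads terraform.tfvars.json from the working directory
with open("terraform.tfvars.json", "w") as f:
    json.dump(inputs, f, indent=2)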

Zach avatar

Is there some cloudformation juju to lookup an existing aws resource that is not part of a stack? ie I have a kms key alias and I need the arn

loren avatar

terraform

Zach avatar

yah thats the model I’m coming from

Zach avatar

but I’m trying to deploy a lambda in this new fangled Serverless model that uses cloudformation

loren avatar

right, so, no. huge limitation of cfn. you can construct the arn, since the format is known. or you can write a custom resource that basically runs describekey (or whatever the api call is) and returns the result
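
A rough sketch of that custom-resource idea in Python: a Lambda that resolves a KMS alias to its key ARN and hands it back to CloudFormation. The AliasName property and the Arn attribute are hypothetical names, and cfnresponse assumes the handler is defined inline (ZipFile) in the template:

import boto3
import cfnresponse  # available to inline (ZipFile) Lambda code in CloudFormation

kms = boto3.client("kms")

def handler(event, context):
    try:
        if event["RequestType"] in ("Create", "Update"):
            alias = event["ResourceProperties"]["AliasName"]  # e.g. "alias/my-key"
            arn = kms.describe_key(KeyId=alias)["KeyMetadata"]["Arn"]
            cfnresponse.send(event, context, cfnresponse.SUCCESS, {"Arn": arn}, arn)
        else:
            # nothing to clean up for a read-only lookup
            cfnresponse.send(event, context, cfnresponse.SUCCESS, {})
    except Exception as exc:
        cfnresponse.send(event, context, cfnresponse.FAILED, {"Error": str(exc)})

The template would then read the looked-up value with Fn::GetAtt on the custom resource.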

Zach avatar

wow thats awful

loren avatar

or you can create the key in the stack, so you have access to its attributes

loren avatar

or just take the arn as a parameter and let the user figure out how to get it to you

Zach avatar

I guess option 3 is that I create the IAM role/policy in terraform and have the serverless config reference that by name to add to the lambda

loren avatar

speaking of cdk, this is why i think cloudformation templates should always themselves be an artifact generated by something else. the cdk is great for this

pjaudiomv avatar
pjaudiomv

Yea I suspect that’s most common reason, just limitations of hcl

jose.amengual avatar
jose.amengual

is it possible to use Service Discovery with ECS+EC2 setups ?

jose.amengual avatar
jose.amengual

the examples and docs point to fargate

jose.amengual avatar
jose.amengual

I’m trying to use AppMesh with ECS+EC2

Matt Gowie avatar
Matt Gowie

I’ve done it.

Matt Gowie avatar
Matt Gowie

Got some code — stand by.

Matt Gowie avatar
Matt Gowie

I setup Service Discovery as part of a talk I did on ECS. Here is the repo that went with the talk:

https://github.com/masterpointio/ecs-101-demo

masterpointio/ecs-101-demo

A small demo application for an ECS 101 talk I’m giving @ AWSMeetupGroup - masterpointio/ecs-101-demo

Matt Gowie avatar
Matt Gowie

Though this doesn’t include App Mesh, so maybe it won’t fit what you’re after. But that service / DB task that I wire up to Service Discovery is running on EC2 and not Fargate.

jose.amengual avatar
jose.amengual

Awesome. It is not clear in the docs if you can use your current ALB as the endpoint for App Mesh instead of service discovery, but I found one of the AWS training labs where they show it using an internal ALB instead of a service discovery endpoint

jose.amengual avatar
jose.amengual

As always very confusing

Matt Gowie avatar
Matt Gowie

Yeah… you gotta dig deep in the docs for the more complicated shit in AWS.

jose.amengual avatar
jose.amengual

I’m going to write a module

2020-08-07

2020-08-08

RB avatar

Anyone here set up Netflix’s Repokid or Aardvark? Would love to know your deployment, caveats, and ways to simplify getting it set up

jose.amengual avatar
jose.amengual

I’m very interested in this too

2020-08-09

2020-08-10

contact871 avatar
contact871

Can I track EFS costs per Access Point? In other words when I set an Access Point tag will I be able to see the EFS cost for this tag in Cost Explorer?

Phuc avatar

Only if you enable cost allocation on that specific tag; only after that can the cost be sorted out in Cost Explorer by tag

dalekurt avatar
dalekurt

Has anyone had issues cloning a git repo over SSH while connected to AWS VPN?

Eric Berg avatar
Eric Berg

Take a look at how your VPN is configured. Generally, you can set it to route only the traffic to the VPN-connected network or to route all traffic. If it’s all traffic, there may be some network rules in the VPN VPC that you connect to that keep SSH from egressing the network.

jason einon avatar
jason einon

hey, what error are you getting ?

2020-08-11

dalekurt avatar
dalekurt

@jason einon I will have to get the exact error, but what happens is that once I’m connected to VPN I’m unable to git clone or git push over SSH

jason einon avatar
jason einon

is this for any git repo? it’s very possible that the vpn connection does not have the correct port open for ssh (tcp 22, usually)

2020-08-12

RB avatar

anyone know of an ssm command line tool where you can specify the command and list of instance ids to run the command ?

Issif avatar

I don’t have this, but if you use ssm, you might find useful a tool I made last year: https://github.com/claranet/sshm

2
rajeshb avatar
rajeshb

i haven’t tried the ssm command line to filter instances, but i have created a doc and an association with tags and applied that association using tf:

resource "aws_ssm_association" "config-files-load" {
  depends_on = [aws_s3_bucket_object.monitoring-config-files-upload]
  name       = aws_ssm_document.shell-config-update-doc.name
  targets {
    key    = local.monitoring_identifation_tag_name
    values = [local.monitoring_identifation_tag_value]
  }
  association_name = "${aws_ssm_document.shell-config-update-doc.name}-association"
}

resource "null_resource" "example2" {
  depends_on = [aws_ssm_association.config-files-load]
  provisioner "local-exec" {
    command = "aws ssm start-associations-once --association-id ${aws_ssm_association.config-files-load.association_id}  --region ${local.region}"
  }
}
jose.amengual avatar
jose.amengual

I use sshm

1
jose.amengual avatar
jose.amengual

if you have ssh over ssm working you can use cssh

jose.amengual avatar
jose.amengual

and SSM RunCommand does not do this already?

sheldonh avatar
sheldonh

This is something Run Command can do via PowerShell or any other SDK/CLI. Do you need a specific example?
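
For reference, a hedged sketch of what that looks like with boto3 and SSM Run Command; the instance IDs and the command are placeholders:

import boto3

ssm = boto3.client("ssm")
instance_ids = ["i-0123456789abcdef0", "i-0fedcba9876543210"]  # placeholders

resp = ssm.send_command(
    InstanceIds=instance_ids,
    DocumentName="AWS-RunShellScript",
    Parameters={"commands": ["uptime"]},
)
command_id = resp["Command"]["CommandId"]

# results are per-instance; invocations can take a moment to register before polling
for instance_id in instance_ids:
    out = ssm.get_command_invocation(CommandId=command_id, InstanceId=instance_id)
    print(instance_id, out["Status"], out.get("StandardOutputContent", ""))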

Juan Soto avatar
Juan Soto

Looking like wafv2 doesn’t allow geoblocking for all the evil countries. Is the easiest way to fix this to apply geolocation routing in r53? Where would you route the bad traffic to? An S3 bucket that says “you are not allowed”? or what?

Issif avatar

who agrees that the new EC2 console is ugly and really inconvenient?

5
kskewes avatar
kskewes

Especially route53!!

vFondevilla avatar
vFondevilla

@Juan Soto 127.0.0.1

vFondevilla avatar
vFondevilla

If you send them to S3, it will cost you money. If you send them to 127.0.0.1 it will be free

1
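
A sketch of that Route 53 geolocation idea in boto3: answer queries from a blocked country with 127.0.0.1 and give everyone else the real target. The zone ID, record name, country code, and IP addresses are placeholders:

import boto3

r53 = boto3.client("route53")

r53.change_resource_record_sets(
    HostedZoneId="Z0HYPOTHETICAL",
    ChangeBatch={
        "Changes": [
            {   # queries from the blocked country get a dead-end answer
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": "app.example.com",
                    "Type": "A",
                    "SetIdentifier": "blocked-XX",
                    "GeoLocation": {"CountryCode": "XX"},
                    "TTL": 60,
                    "ResourceRecords": [{"Value": "127.0.0.1"}],
                },
            },
            {   # default geolocation record so everyone else still resolves
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": "app.example.com",
                    "Type": "A",
                    "SetIdentifier": "default",
                    "GeoLocation": {"CountryCode": "*"},
                    "TTL": 60,
                    "ResourceRecords": [{"Value": "203.0.113.10"}],
                },
            },
        ]
    },
)
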
Juan Soto avatar
Juan Soto

good idea, let me check it

2020-08-13

Suresh avatar

Hello guys,

quick AWS query: I have a use case of hitting a private hosted zone domain from a public API gateway, and the HTTP integration request of the API gateway is not happy with the private hosted zone domain name. Has anyone tried this before?

Issif avatar

Have you declared your VPC in the allowed VPCs for the domain?

Suresh avatar

Hey, thanks for the reply, sorted this out with the VPC Link.

1
RB avatar

can i use the same security group in different vpcs ? or do i need to recreate the security group ?

if i have to recreate the security group per vpc, is there a cool aws way to reuse the security group rules (already reusing them at the moment using tf but wondering if there is a better way)

RB avatar

optional thread

loren avatar

If the VPCs are peered, you can use peered vpc security group references. Otherwise, I don’t think there is a way to use the same security group, nor share the rules

loren avatar

Still probably not quite what you’re looking for though… https://docs.aws.amazon.com/vpc/latest/peering/vpc-peering-security-groups.html

Updating your security groups to reference peer VPC groups - Amazon Virtual Private Cloud

Update your security group rules to reference security groups in the peer VPC.

vFondevilla avatar
vFondevilla

This is not working for the Transit Gateway connected VPCs.

loren avatar

No, it does not, it’s specific to vpc peering (at the moment, anyway)

vFondevilla avatar
vFondevilla

This week I tried to reference one SG from another VPC and it didn’t work.

RB avatar

interesting. i should have mentioned the vpcs are in different regions
You cannot reference the security group of a peer VPC that’s in a different region. Instead, use the CIDR block of the peer VPC.

RB avatar

thanks for clearing that up folks!

RB avatar

i guess i’ll just stick with the module that duplicates the same sgs per vpc-region

1
jose.amengual avatar
jose.amengual

if you are in the same region and you use Shared VPCs ( which is newish) you can do that

2020-08-14

MrAtheist avatar
MrAtheist

Anyone have some insight into how to appropriately configure an idle timeout for an ALB? When a request comes in from a client, I have a Rails API, with no nginx involved, that goes to RDS, fetches whatever is needed and serves it back as a CSV (a very typical workflow, I assume…). Are there any downsides to just bumping the idle timeout to, say, 10x the default = 600s? Or should I really be looking at nginx or the like, or re-tweaking my app to make it more async?

I’m currently going thru this blog and hope someone can chime in on this topic.  https://sigopt.com/blog/the-case-of-the-mysterious-aws-elb-504-errors/

maarten avatar
maarten

I think it depends case by case. If it is a backend kind of application with JSON report generation, it doesn’t hurt much to do so. If it’s a critical consumer-facing application, then high idle timeouts make the application easily DoS-able; async is the much nicer option.

1

2020-08-15

Prasad avatar

Hello all, I just wanted to understand the 2 options and how they differ in terms of usage as i’m just not able to differentiate them

1)kms:ViaService

2)kms:GrantIsForAWSResource

Problem: My initial thought of a policy required for user to start ec2 instance which had a CMK key encrypted volume was that i needed to provide decrypt permission with a condition statement for the ec2 instance service so that it can call kms to get the plain text data key on to the memory.

"Action": [
    "kms:Decrypt",

"Condition": {
    "StringEquals": {
        "kms:ViaService": [
            "ec2.us-west-2.amazonaws.com",

The AWS documentation and a Google search show using kms:CreateGrant with kms:GrantIsForAWSResource = true to allow a user to start EC2 with a KMS CMK-encrypted volume

"Action": [
    "kms:CreateGrant"
],
"Resource": "*",
"Condition": {"Bool": {"kms:GrantIsForAWSResource": true}}

Juan Soto avatar
Juan Soto
Amazon S3 Path Deprecation Plan – The Rest of the Story | Amazon Web Services

Last week we made a fairly quiet (too quiet, in fact) announcement of our plan to slowly and carefully deprecate the path-based access model that is used to specify the address of an object in an S3 bucket. I spent some time talking to the S3 team in order to get a better understanding of […]

RB avatar

Sigh. They still don’t have a plan for bucket names that include dots
Bucket Names with Dots – It is important to note that bucket names with “.” characters are perfectly valid for website hosting and other use cases. However, there are some known issues with TLS and with SSL certificates. We are hard at work on a plan to support virtual-host requests to these buckets, and will share the details well ahead of September 30, 2020.

2020-08-16

2020-08-17

walicolc avatar
walicolc

Ello peoples, anyone faced an issue where their user_data script wasn’t executed on startup ?

walicolc avatar
walicolc

Via the console that is

walicolc avatar
walicolc

fixed, forgot to include #!/bin/bash

2
roth.andy avatar
roth.andy

LPT: Use #!/usr/bin/env bash. It is far more universally compatible

roth.andy avatar
roth.andy
What is the preferred Bash shebang?

Is there any Bash shebang objectively better than the others for most uses? #!/usr/bin/env bash #!/bin/bash #!/bin/sh #!/bin/sh - etc I vaguely recall a long time ago hearing that adding a dash t…

1
walicolc avatar
walicolc

Thanks Andrew!

Zach avatar

classic problem, I have that happen so frequently

2
RB avatar

Hi All. Anyone know of any tool that accepts multiple IP/CIDRs and creates a map of used and unused IP ranges ?

2020-08-18

Karoline Pauls avatar
Karoline Pauls

Are there any implications of which “direction” a peering connection goes within a single region and a single AWS account?

bradym avatar

I have several peering connections like what you describe and I’ve never come across any implications or issues related to the direction of a peering connection.

Satish avatar

Hello, we have EKS workloads running in separate AWS accounts for non-prod and prod environments. I’m thinking of creating a “SharedServices” AWS account and setting up ECR repositories that can be used by both non-prod and prod environments. Any downsides with this approach? Other recommendations?

Steven avatar

That is what I do. But I do 2 ECR. 1 in dev account for CI builds and 1 in shared account for candidates that have passed testing. Reduces risk of really bad code being able to get to most environments

1
Eric Berg avatar
Eric Berg

We have a single ECR for multiple environments. We grant access to all of the accounts for each repo.

2020-08-19

Karoline Pauls avatar
Karoline Pauls


AWS currently does not support unicast reverse path forwarding in VPC peering connections that checks the source IP of packets and routes reply packets back to the source.
https://docs.aws.amazon.com/vpc/latest/peering/peering-configurations-partial-access.html#peering-incorrect-response-routing This means that in a “star” peering configuration (multiple side VPCs to one central), side VPCs in practice simply cannot share subnet ranges, even though it is theoretically possible.

It could work if one picked non-overlapping subnets from each “side” VPC and a routing table was defined for that. But that’s impractical.

Am I right?

loren avatar

correct, cidrs are not allowed to overlap

Karoline Pauls avatar
Karoline Pauls

thanks

Karoline Pauls avatar
Karoline Pauls

though i edited it to clarify, because I think peering can be established, but routing will not work well

loren avatar

when i’ve tried it in the past, i received errors when the cidrs overlapped that did not allow the peering connection to be created

Karoline Pauls avatar
Karoline Pauls

even when they transitively overlapped? (i’m not trying to do that, just wondering)

loren avatar

yes

loren avatar

it seems it no longer errors when creating the peering connection, that’s interesting

walicolc avatar
walicolc

ello peoples, anyone know of any good resources on implementing ci/cd on aws with terraform. In particular best practices on managing the plan and apply commands in the build phase using codebuild and interacting with s3 state files?

rajeshb avatar
rajeshb

GOCD?

walicolc avatar
walicolc

i’d like to stick with aws products, so codebuild, and codedeploy

1
RB avatar

We use an office security group to allow ingress into our vpc. We’re approaching the 60 security group rule limit. What’s a good way to scale past this limit ?

RB avatar

looking at waf as an option but it’s expensive

RB avatar

we were thinking about perhaps splitting the security groups up but that kind of kicks the can down the road. once we’re at 60 ipcidrs for office ips, what do we do next ?

Steven avatar

You can do up to 5 security groups, move to WAF, or create your own solution

RB avatar

for now i asked aws to increase the limit from 60 to 100 per sg per vpc in 1 region and they complied which is nice. i guess in the future, we’ll have to consider a waf or get more creative with a solution

Steven avatar

Unless things have changed (I did this 2 years ago). There is a max of 300 rules. So, they would have increased the number of rules per group and reduced the number of groups you can have.

RB avatar

yep. the default is 60 rules per sg and 5 sgs per networking interface so 60*5 is 300. they have a hard limit of 1000 so either the 60 or 5 can increase but that 1000 cannot be exceeded.

https://aws.amazon.com/premiumsupport/knowledge-center/increase-security-group-rule-limit/

Steven avatar

That’s an improvement. Good to know.

rms1000watt avatar
rms1000watt

Does anyone use Aurora (postgres) and love their experiences with it? Considering migrating from RDS (postgres) to it. I know a few years ago there was reliability concerns, but not sure about in 2020 at scale.

jose.amengual avatar
jose.amengual

MMMM I guess the question is more like, Do you like Aurora in general?

jose.amengual avatar
jose.amengual

Aurora and aws have way better support for mysql than postgres

jose.amengual avatar
jose.amengual

aurora storage has some limitations, and with heavy write workloads you can easily kill a cluster by writing too fast

1
jose.amengual avatar
jose.amengual

if you do not have any of those problems and you do not need to tune up mysql or postgres aurora is great

rms1000watt avatar
rms1000watt

i think we’re not write heavy

jose.amengual avatar
jose.amengual

if you had Aurora you can check Performance Insights to answer that question lol

rms1000watt avatar
rms1000watt

hahaha nice. got me there

rms1000watt avatar
rms1000watt

basically i’m asking the super n00b question of is there a point to cut over to Aurora if RDS is working OK?

jose.amengual avatar
jose.amengual

I guess it depends on what you want, if you are looking for automatic failover, updates, elastic storage and HA aurora is nice

rms1000watt avatar
rms1000watt

and in your experience, in prod, has it been reliable for ya?

jose.amengual avatar
jose.amengual

yes, no issues, but again we did hit the underlying storage problem I described earlier because we write a huge amount of stuff

jose.amengual avatar
jose.amengual

apart from that issue it works just fine

1
rms1000watt avatar
rms1000watt

so was it like, a lot of writes, caused replication lag, and caused requests to return slowly?

rms1000watt avatar
rms1000watt

or like, data was lost?

jose.amengual avatar
jose.amengual

no, we basically found a way to fail over the cluster at will by doing specific operations

sheldonh avatar
sheldonh

Is there a full fledged project like a terraform module or something I can use to establish a home account for IAM users + define groups/roles to assume for all users across my accounts? I see a lot of pieces in github, but before i mess around, was wondering if anyone/or other project/ has a “best practice complete layout for home account user provisioning” so I can implement a pull request driven workflow for users provisioning.

Again, I’ve seen pieces, but a full fledged “best practice” layout or service is what I’m wanting to explore tomorrow

2020-08-20

vFondevilla avatar
vFondevilla

I had some issues with lockups in Aurora under stress, leaving the database a zombie. From the AWS perspective the database is alive, as their user (run locally for monitoring) is able to do stuff, but the cluster stops answering connections until we reboot it. This happened 2 times in 6 months, but apart from that it’s pretty smooth.

Darren Cunningham avatar
Darren Cunningham

sounds like the cluster isn’t CPU pegged; have you already validated that you’re not hitting a max connection limits?

Darren Cunningham avatar
Darren Cunningham

highly recommend AWS Support if you’re not paying for it. We have Business level in our Production account. We’ve had a few instances similar to this and they were able to dig into the logs/configuration and determine root cause for us. Worth every penny.

vFondevilla avatar
vFondevilla

In our case business support couldn’t find a root cause apart from us not being on the latest patch level. After enabling detailed logging (every request) the issue didn’t happen again.

Darren Cunningham avatar
Darren Cunningham

could try opening a new case and see if the next engineer has better luck

vFondevilla avatar
vFondevilla

Just to be completely sure I deployed a new database from an snapshot and nuked the database cluster

Darren Cunningham avatar
Darren Cunningham

if you can associate metrics to “stress moments” probably worth setting up an alert that the team can get notified that things may be going sideways in the future

vFondevilla avatar
vFondevilla

now we have an automated probe testing the connection every minute, so I’m pretty confident about that. It opens a mysql connection, queries a table with a SELECT … LIMIT 1, and if everything is ok it closes the mysql connection.

1
jose.amengual avatar
jose.amengual

mysql ? version ? serverless ? workflow is write heavy ?

jose.amengual avatar
jose.amengual

how many replicas?

vFondevilla avatar
vFondevilla

MySQL Aurora with MySQL 5.7, single master and one replica. Workflow is primarily read as it’s a drupal website. Sometimes (with cache expirations), every node will launch a pool of connections against the MySQL server (every node at the same time as the cache was located in the database), and in that moment, when receiving about 1500 new connections (the instance size it’s an r5.2xlarge with max_connections default at 3000 connections), the mysql became zombie. After the second time happening the same, we did a change in the Drupal cache expiration and it never happened again.

vFondevilla avatar
vFondevilla

Support was completely clueless and with the Drupal changes on the cache we couldn’t replicate the issue anymore.

jose.amengual avatar
jose.amengual

we had a weird writing pattern issue that will trigger a failover immediately

jose.amengual avatar
jose.amengual

I think in the end, read or write, the underlying storage can’t keep up and it kills the writer

jose.amengual avatar
jose.amengual

in your case you could leverage the RDS Proxy

jose.amengual avatar
jose.amengual

which is now GA

jose.amengual avatar
jose.amengual

and same as you Support did not have a clue

jose.amengual avatar
jose.amengual

and when I was at re:Invent I asked this question and they did not answer

vFondevilla avatar
vFondevilla

(Running Aurora MySQL, for more information)

Darren Cunningham avatar
Darren Cunningham

When using a Lambda to process SQS, are you always using batch size 1 or do you handle failures of messages individually? if the latter, how?

2020-08-21

RB avatar

for the people who are using https://github.com/Nike-Inc/gimme-aws-creds

Nike-Inc/gimme-aws-creds

A CLI that utilizes Okta IdP via SAML to acquire temporary AWS credentials - Nike-Inc/gimme-aws-creds

RB avatar

how do you use the same app to login to aws console ui ?

RB avatar

i know gimme-aws-creds will dump out the access key id and secret access key which can then be used to hit aws’ federated endpoint to create a session in aws console.

https://stackoverflow.com/questions/59952757/how-to-login-to-aws-console-using-access-key-secret-key-and-session-token

im wondering if there is an easier, more integrated way
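
For what it’s worth, the federation-endpoint flow from that Stack Overflow question boils down to roughly this (a sketch using only the standard library; it assumes the temporary credentials came from gimme-aws-creds or similar):

import json
import urllib.parse
import urllib.request

def console_login_url(access_key, secret_key, session_token):
    session = json.dumps({
        "sessionId": access_key,
        "sessionKey": secret_key,
        "sessionToken": session_token,
    })
    # the federation endpoint returns {"SigninToken": "..."}
    get_token = "https://signin.aws.amazon.com/federation?" + urllib.parse.urlencode(
        {"Action": "getSigninToken", "Session": session}
    )
    signin_token = json.loads(urllib.request.urlopen(get_token).read())["SigninToken"]
    # build the URL that opens a console session in the browser
    return "https://signin.aws.amazon.com/federation?" + urllib.parse.urlencode({
        "Action": "login",
        "Issuer": "gimme-aws-creds",
        "Destination": "https://console.aws.amazon.com/",
        "SigninToken": signin_token,
    })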

RB avatar

for example, using aws-vault, it has a login command that will open aws console for you.

# open a browser window and login to the AWS Console
$ aws-vault login jonsmith

but to use this tool with nike’s gimme-aws-creds, i’d have to do the following.

  1. get creds from gimme-aws-creds
  2. enter them into aws-vault which is a pain since these creds are temporary
  3. then run aws-vault login
jose.amengual avatar
jose.amengual

so you want to trigger a console login from cli ?

RB avatar

i suppose https://github.com/versent/saml2aws has a console arg so maybe that tool would be better

Versent/saml2aws

CLI tool which enables you to login and retrieve AWS temporary credentials using a SAML IDP - Versent/saml2aws

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

We use saml2aws

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

We previously used aws-okta which was literally a fork of aws-vault, but support has dropped

Zach avatar

yah we have okta federation. You log into the Console with an okta saml redirect. You log into the CLI with gimme-aws which authenticates you and plops creds into the shell env

RB avatar

@Erik Osterman (Cloud Posse) ah ok so you have okta saml setup with saml2aws so you can get keys and use the saml2aws console to quickly login to the aws console

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Yep and it works inside a container like geodesic

RB avatar

that makes a lot of sense. it’s too bad it doesn’t use oidc but i guess it doesn’t matter what kind of auth, as long as it works.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

We use it with okta and gsuite depending on the customer

1
RB avatar

@Zach you use gimme-aws to get aws creds and then you have an okta app that allows you login to aws console ? could you explain more about that ?

Zach avatar

it’s 2 different means of doing the same thing

Zach avatar

If you want to go to the AWS Console, you click the AWS ‘app’ in Okta (the plugin). You auth, and then you assume an IAM role that your okta group maps to or allows.

Zach avatar

At the CLI the gimme-aws-creds does basically the same thing, you tell it what account and role in AWS you want to assume

RB avatar

ah ok that makes sense. so i imagine the cli method and the okta app use the same client_id / client_secrets ?

Zach avatar

the AWS keys? doubt it

RB avatar

no the oidc creds

RB avatar

oh i see, so probably different oidc client id and client secret

Zach avatar

Hmmm not sure - gimme-aws is configured with an okta secret. The aws side is … complicated? the okta docs aren’t great but if you follow them to the letter it all works

1
RB avatar

i may have to ping you and others more about this. IT at our place still holds the whole sso thing close to their hearts so waiting on them before i can configure

Zach avatar

thats funny, our actual IT maintains the Azure AD and we just bypass all that with Okta for our engineering team

zeid.derhally avatar
zeid.derhally

I’m switching away from aws-okta and was wondering if anyone has thoughts on aws-okta-processor? I’ve used it and like it.

https://github.com/godaddy/aws-okta-processor

godaddy/aws-okta-processor

Okta credential processor for AWS CLI. Contribute to godaddy/aws-okta-processor development by creating an account on GitHub.

loren avatar

i like it a lot. one of the few that manage to handle the credential cache both for the sso session and for the aws sts session

loren avatar

project is well structured also, and the maintainers are responsive and accepting of prs

2020-08-22

2020-08-23

Igor avatar

Does anyone know of a way to set up AWS-VAULT so CloudTrail recognizes that the login is with MFA?

roth.andy avatar
roth.andy

https://github.com/99designs/aws-vault#roles-and-mfa

Add mfa_serial to the profile in $HOME/.aws/config

99designs/aws-vault

A vault for securely storing and accessing AWS credentials in development environments - 99designs/aws-vault

Igor avatar

I have the login working with MFA

Igor avatar

But CloudTrail says additionalEventData.MFAUsed

Igor avatar

As No

Igor avatar

Let me try with –no-session

Igor avatar

Still “No”

ismail yenigul avatar
ismail yenigul

@Igor what is your ~/.aws/config for that profile and What is the full error? additionalEventData.MFAUsed is not an error

Igor avatar

There is no error. That’s a property in the CloudTrail event log that states that MFA wasn’t used on login.

Igor avatar

config is as linked. I have role_arn and mfa_serial on the profile

ismail yenigul avatar
ismail yenigul

is this aws console login or aws-vault ? can you paste the full event log

ismail yenigul avatar
ismail yenigul

and did you enforce MFA in your assume role?

2020-08-24

Zach avatar

^ similar question but I’ll fork off for gimme-aws-creds if anyone knows how to make CloudTrail recognize that I have an MFA in a session

RB avatar
Netflix/dispatch

All of the ad-hoc things you’re doing to manage incidents today, done for you, and much more! - Netflix/dispatch

kskewes avatar
kskewes

Someone at ours had a look and said too early/raw and went to zapier instead.

RB avatar

interesting. someone here had the same thought cause it had bugs.

RB avatar

i guess we wait

Brij S avatar

I was wondering if there were any jmespath gurus here, I’ve got the following command

 aws s3api list-buckets --query "Buckets[?starts_with(Name, \`tf-app-npd\`)]|[?contains(Name, \`state\`)].Name"

This works just fine, however it returns the following list

[
    "tf-app-npd-kmstest-state",
    "tf-app-npd-pr1-state",
    "tf-app-npd-shared-state",
    "tf-app-npd-stage-state",
    "tf-app-npd-state"
]

I’d like to exclude buckets such as tf-app-npd-shared-state or tf-app-npd-state, but I’m stuck - any ideas?

bradym avatar

Something like this should work:

aws s3api list-buckets --query "Buckets[?starts_with(Name, \`tf-app-npd\`)]|[?contains(Name, \`state\`)]|[?!(contains(Name, \`tf-app-npd-shared-state\`))].Name"
bradym avatar

jmespath has an or operator so theoretically you could use that to exclude multiple buckets

Brij S avatar

ohh interesting! How would I use the or operator with

[?!(contains(Name, \`tf-app-npd-shared-state\`))]
Brij S avatar

to exclude tf-app-npd-state

bradym avatar

I’m not entirely sure

bradym avatar

Looks like this works:

[?!(contains(Name, \`tf-app-npd-shared-state\`) || (contains(Name, \`tf-app-npd-state\`)))]
Brij S avatar

works like a charm! Thank you!

bradym avatar

happy to help

Brij S avatar

you wouldn’t happen to be a pro with sed would you?

bradym avatar

Not sure I’d call myself a pro, but I use it quite often. Happy to look at whatever you’re trying to do.

Brij S avatar

the result of the command above has the following output

[
    "tf-app-npd-kmstest-state",
    "tf-app-npd-pr16-state",
    "tf-app-npd-stage-state"
]

essentially, I’d like to ‘remove’ the tf-app-npd- and -state parts. So for the first bucket I’d be left with kmstest

bradym avatar

sed 's/tf-app-npd-\|-state//g'

Brij S avatar

hmm, I piped that to the end of the awscli command but the result is the same

| sed 's/tf-app-npd-\|-state//g'
bradym avatar

just to confirm, the pipe to sed is outside the "" on the aws command right?

Brij S avatar

yup, the whole command is

aws s3api list-buckets --output text --query "Buckets[?starts_with(Name, \`tf-app-npd\`)]|[?contains(Name, \`state\`)]|[?!(contains(Name, \`tf-app-npd-shared-state\`) || (contains(Name, \`tf-app-npd-state\`)))].[Name]" | sed 's/tf-app-npd-\|-state//g'
bradym avatar

Odd… it worked for me when I copied your output

$ OUT='[                        
    "tf-app-npd-kmstest-state",
    "tf-app-npd-pr16-state",
    "tf-app-npd-stage-state"
]'

$ echo $OUT
[ "tf-app-npd-kmstest-state", "tf-app-npd-pr16-state", "tf-app-npd-stage-state" ]

$ echo $OUT | sed 's/tf-app-npd-\|-state//g'
[ "kmstest", "pr16", "stage" ]
Brij S avatar

hmm, are you using a macbook? I know sed on macos is slightly different

bradym avatar

nope

bradym avatar

ubuntu 18

bradym avatar

and the command without sed still works right?

Brij S avatar

yes

bradym avatar

Experimenting with my own buckets, looks like that sed or isn’t working right.

Brij S avatar

sed or isnt?

bradym avatar

nevermind, I had a typo

bradym avatar

what’s your sed --version?

Brij S avatar
How can I check the version of sed in OS X?

I know if sed is GNU version, version check can be done like $ sed –version But this doesn’t work in OS X. How can I do that?

Brij S avatar

got it!

Brij S avatar

so, on macos you need to run brew install gnu-sed

Brij S avatar

that gives me the version of sed that you’re probably using

bradym avatar

or close enough to it

Brij S avatar

yep! thanks again! Appreciate the help

bradym avatar

np

Brij S avatar

have you ever looped through a json list with bash

bradym avatar

You’re gonna want jq for that - https://stedolan.github.io/jq/

Brij S avatar

Hey @bradym - you around for a quick bash/awscli question?

bradym avatar

Sure, what’s up?

Brij S avatar

:slightly_smiling_face: I had an old awscli command I used to delete a versioned object in an S3 bucket like this

aws s3api delete-objects --bucket ${REMOTE_STATE_BUCKET} --delete "$(aws s3api list-object-versions --bucket ${REMOTE_STATE_BUCKET} --query='{Objects:Versions[].{Key:Key,VersionId:VersionId}}')"

This worked fine, however we decided to store more in this bucket so I wanted to delete only objects with a certain key, I ended up with this

aws s3api delete-objects --bucket ${REMOTE_STATE_BUCKET} --delete "$(aws s3api list-object-versions --bucket ${REMOTE_STATE_BUCKET} --output=json --query="Versions[?starts_with(Key,\`${STAGE}\`)].{Key:Key,VersionId:VersionId}")"

but now I get the following error

Error parsing parameter '--delete': Invalid JSON:
[
    {
        "Key": "stage/terraform.tfstate",
        "VersionId": ".oKrS6dg8TJGGjaDGeAvF7RryDqok.wy"
    }
]
Brij S avatar

any idea what its complaining about

bradym avatar

Take a look at aws s3api list-object-versions help – there’s an example of what the JSON syntax for that command should be, and it looks like yours is not formatted quite right

msharma24 avatar
msharma24

Hello - I would like to keep 100s of GBs of files in sync between two cross-account, same-region S3 buckets, with the ability to delete files from the destination bucket when I delete or replace files in the source bucket. The S3 replication feature does not solve this, as S3 does not replicate deletes, and aws s3 sync also won’t help here since it would not delete the files from the remote bucket?

Do I need to build some kind of manifest to keep a log of the files, which would dictate what files remain in sync?

roth.andy avatar
roth.andy

Check out rclone

msharma24 avatar
msharma24

Thanks

Steven avatar

aws s3 sync will delete. You just need to add the --delete option. But for speed, consider using s3 replication for the copy. s3 replication can also do deletes. https://aws.amazon.com/blogs/storage/managing-delete-marker-replication-in-amazon-s3/

Managing delete marker replication in Amazon S3 | Amazon Web Services

Customers use Amazon S3 Replication to create a copy of their data within the same AWS Region or in another AWS Region for compliance, lower latency, or sharing data across accounts. In environments where data is constantly changing, customers have different replication needs for objects that have been, or will be, deleted. For some use cases, […]

2020-08-25

walicolc avatar
walicolc

Looks like RDS is down for those using AWS Europe - London Region https://downdetector.co.uk/status/aws-amazon-web-services/

Amazon Web Services down? Realtime overview of AWS status, issues and outages

Real-time overview of issues with Amazon Web Services. Is your service not functioning properly? Here you learn whats is going on.

walicolc avatar
walicolc

AWS have yet to report on it , status checks still indicate all green

walicolc avatar
walicolc

AWS have now reported on this

Karoline Pauls avatar
Karoline Pauls

AWS VPC DNS resolution is so dumb.

Do they seriously think it is OK to implicitly resolve to private addresses when peering/transit gateway is set up? What if it’s set up badly?

RB avatar

what’s a good minimum_protocol_version to set a cloudfront distribution to if it has an acm cert for a static s3 site. I’m currently using TLSv1.1_2016 but I think I should go to TLSv1.2_2019

https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/cloudfront_distribution#minimum_protocol_version

Thoughts?

RB avatar


When you create a new distribution using a custom SSL certificate, TLSv1.2_2019 will be the default policy option selected. You may use the AWS Management Console, Amazon CloudFront APIs, or AWS CloudFormation to update your existing distribution configuration to use this new security policy.

2
loren avatar

nice, i’ve wondered the same thing several times

RB avatar

anyone got oidc working with atlantis on an aws alb ? we’re using okta with the following settings

    issuer                              = "https://company.okta.com/"
    token_endpoint                      = "https://company.okta.com/oauth2/default/v1/token"
    user_info_endpoint                  = "https://company.okta.com/oauth2/default/v1/userinfo"
    authorization_endpoint              = "https://company.okta.com/oauth2/default/v1/authorize"
    authentication_request_extra_params = {}

is this correct ? we created a Web integration with OpenID Connect to get a client_id and client_secret

2020-08-26

mfridh avatar

Do you know? Having an imported certificate in ACM, assigned to some ALB listeners - when updating said imported certificate by uploading a new one to ACM - are the load balancer listeners all supposed to propagate to use that new certificate?

mfridh avatar

Ok… Then we have something odd going on…

RB avatar

id create a support ticket on this just in case

RB avatar

you could check and create a new load balancer with the same acm cert and see if that works as expected

RB avatar

but the new acm cert should propagate to all load balancers /listeners that already use it.

mfridh avatar

things are fine, nothing to see here. Someone was trolling with adding an additional listener certificate.

1
walicolc avatar
walicolc

Is there a way of using logical OR in IAMs instead of implementing it by writing separate blocks ?

walicolc avatar
walicolc

Solved.

RB avatar

whats the solution ?

walicolc avatar
walicolc

AWS treats this as OR

"Condition": {
         "StringEquals": {
           "aws:sourceVpc": ["vpc-111bbccc", "vpc-111bbddd"]
         }
       }
1
walicolc avatar
walicolc

Stupid bc it’s not obvious at first but hey if it works it works

walicolc avatar
walicolc

Also can use ForAnyValue

RB avatar

cool, i did not know you can do that

1
loren avatar

anyone have experience using aws session manager with a .ssh/config, such that a git-over-ssh connection would utilize session manager? we have gitlab running in a private subnet, and would like to support an ssh remote without opening ssh via an ELB in a public subnet…

loren avatar

i’m guessing something like this, just based on some googling…

Host bastion
  ProxyCommand sh -c "aws ssm start-session --target <bastion-instance-id> --document-name AWS-StartSSHSession --parameters 'portNumber=%p' --region <region>"

Host <gitlab.remote>
  IdentityFile  <my key>
  User git
  ProxyCommand ssh -W %h:%p  ec2-user@bastion
loren avatar

or perhaps with ProxyJump?

Host bastion
  User ec2-user
  ProxyCommand sh -c "aws ssm start-session --target <bastion-instance-id> --document-name AWS-StartSSHSession --parameters 'portNumber=%p' --region <region>"

Host <gitlab.remote>
  IdentityFile  <my key>
  User git
  ProxyJump bastion
jose.amengual avatar
jose.amengual

like this :

host i-* mi-*
    ProxyCommand sh -c "aws ssm start-session --target %h --document-name AWS-StartSSHSession --parameters 'portNumber=%p' --profile production"
jose.amengual avatar
jose.amengual

that works for me

loren avatar

roger, but that works where the host in the ssh command is the instance. in this case, the host is the gitlab remote (and the user does not have ssm:StartSshSession permissions to the gitlab host. but they do have a user in gitlab and an ssh key loaded in their gitlab profile)

maarten avatar
maarten
Flaconi/terraform-aws-bastion-ssm-iam

AWS Bastion server which can reside in the private subnet utilizing Systems Manager Sessions - Flaconi/terraform-aws-bastion-ssm-iam

1
jose.amengual avatar
jose.amengual

Mmm but ssm does not require ssh keys to connect; in that case you might want to check EC2 Instance Connect

2020-08-27

walicolc avatar
walicolc

Does one know how to get codebuild to git clone my codecommit repo instead of zipping it up? I’m unable to execute git commands bc it isn’t a git repo

loren avatar

Is codepipeline involved, by any chance?

walicolc avatar
walicolc

yes I’m using codepipeline

loren avatar

Then no, I was never able to work this out. It’s just what codepipeline does… Takes your source repo, exports it to a zip hosted in s3, then codebuild retrieves the zip. I stopped using codepipeline as a result

loren avatar

If you create a codebuild job with the source as your codecommit repo, and trigger the job directly, without codepipeline, then it works as you expect

walicolc avatar
walicolc

Thank you Loren!

Adrian avatar

With git as a workaround (I don’t remember why it’s like this):

      - git init
      - git remote add origin https://${GITHUB_TOKEN}@github.com/owner/repo
      - git fetch --tags
      - git reset --hard origin/master
walicolc avatar
walicolc

would this work for codecommit as well

Adrian avatar

I don’t use CodeCommit

walicolc avatar
walicolc

OK - I’ll see if we can transition to anything but codecommit, there’s another functionality that doesn’t seem to work on codecommit which I spotted earlier. For now that medium blog will do. Thank you!

loren avatar

yeah, if you clone the repo as a codebuild step, then the “source” step of codepipeline is rather pointless

walicolc avatar
walicolc

It works for us for now - so it’s OK. I’ll end up moving to github at a later stage.

tomv avatar

Is it just me or is the EMR spot market in us-west-2 for the past week.. non-existent? we’re having capacity trouble for all sorts of instance types

RB avatar

anyone done a cost benefit analysis of migrating ECS to EKS ?

RB avatar

thread

sheldonh avatar
sheldonh

Ok…. I’m done with AWS SSM as my long-term plan. Too slow to iterate and lots of edge bugs for my use.

I want to bring a company wide consistency to config tooling, no more Choco only for windows and Linux left out to dry :-)

Best in class for cross platform and ease of maintenance I’m leaning towards is AWS opsworks puppet enterprise. While we have some ansible already I want state to be checked + run through ssm when possible. Folks here don’t use Ruby but lots have dabbled in python

The key requirement is simplify runs when possible by using AWS ssm associations and running through that. Winrm seems problematic in comparison for 200+ instances.

Puppet?

2020-08-28

drexler avatar
drexler

Hi anyone encountered this issue before: UnsupportedAvailabilityZoneException: Cannot create cluster 'eks-cluster-platform' because us-east-1e, the targeted availability zone, does not currently have sufficient capacity to support the cluster ??

pjaudiomv avatar
pjaudiomv

I have, in my case I just tried again later and it worked

1
RB avatar

is there an easy way to see when a new ecs service is being deployed ? if there is an event, i’d like to be able to hit up a slack channel so we can keep track of production deployments

Igor avatar

I asked this before.. I was told to use lambda… so looks like no events out-of-the-box

2
pjaudiomv avatar
pjaudiomv

I have used cloudwatch events and lambdas for that in the past

Igor avatar
bitflight-public/terraform-aws-ecs-events

Add on an SNS topic for capturing ECS events. Contribute to bitflight-public/terraform-aws-ecs-events development by creating an account on GitHub.

Igor avatar
Tutorial: Listening for Amazon ECS CloudWatch Events - Amazon Elastic Container Service

In this tutorial, you set up a simple AWS Lambda function that listens for Amazon ECS task events and writes them out to a CloudWatch Logs log stream.

pjaudiomv avatar
pjaudiomv

nooice

sheldonh avatar
sheldonh

Help. Just need to know how to get past this failed helm release. Brand new to this and using a docker release library for gitpod.

Error: cannot re-use a name that is still in use

  on modules/gitpod/main.tf line 9, in resource "helm_release" "gitpod":
   9: resource "helm_release" "gitpod" {

I have no idea how to get it removed or whatever as I don’t see anything successful yet in AWS EKS

zidan avatar

#aws 6 tips that I apply to optimize our cost in AWS, check them out and let me know how many of them you apply: https://www.dailytask.co/task/6-tips-that-you-should-think-about-them-to-optimize-your-costs-in-aws-ahmed-zidan

6 tips that you should think about them to optimize your costs in AWS.

6 tips that you should think about them to optimize your costs in AWS. written by Ahmed Zidan

1

2020-08-31

RB avatar

what’s a good way for the container to know if it has been deployed in fargate or ecs ?

RB avatar

i know about the ECS_CONTAINER_METADATA_URI which is handy but this env variable is set in both ecs ec2 and fargate

RB avatar
Task Metadata Endpoint version 3 - Amazon Elastic Container Service

Beginning with version 1.21.0 of the Amazon ECS container agent, the agent injects an environment variable called ECS_CONTAINER_METADATA_URI into each container in a task. When you query the task metadata version 3 endpoint, various task metadata and

jose.amengual avatar
jose.amengual

we subscribe to a sns topic that then notify on slack

RB avatar

thats a good way for the humans to know

jose.amengual avatar
jose.amengual

with some lambda and that aws bot thingy

RB avatar

what’s a good way for the container itself to know

jose.amengual avatar
jose.amengual

ahhhh

jose.amengual avatar
jose.amengual

we use local healthcheck as like a readiness test

jose.amengual avatar
jose.amengual

to curl itself basically

RB avatar

that’s a good way to get the current health of the container

RB avatar

but what’s a good way to determine, for the container, if it’s an ecs ec2 task vs an ecs fargate task ?

RB avatar

the only thing i can think of is if it hits the task metadata, gets the task arn, describes the arn, and determined the launch type from that.
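
That approach might look roughly like this (a sketch; it assumes the task role allows ecs:DescribeTasks and that the v3 metadata endpoint exposes Cluster and TaskARN, per the task metadata docs linked earlier in the thread):

import json
import os
import urllib.request

import boto3

# read this task's identity from the v3 metadata endpoint
meta_uri = os.environ["ECS_CONTAINER_METADATA_URI"]
task_meta = json.loads(urllib.request.urlopen(meta_uri + "/task").read())

ecs = boto3.client("ecs")  # task role needs ecs:DescribeTasks
task = ecs.describe_tasks(
    cluster=task_meta["Cluster"],
    tasks=[task_meta["TaskARN"]],
)["tasks"][0]

print(task.get("launchType"))  # "FARGATE" or "EC2"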

jose.amengual avatar
jose.amengual

OMG I had to read 3 time you question to realize it was something else

jose.amengual avatar
jose.amengual

can you just add a ENV variable that the deployment set?

RB avatar

the other way we’re thinking is that the ip address looks different betw ecs and ec2, where for ecs ec2 the ip is an ip in our vpc, whereas the fargate ip is the docker bridge ip that is not in our vpc

RB avatar

ya we could probably add an env variable to all the tasks. that would be one solution.

RB avatar

i was hoping for something more dynamic

jose.amengual avatar
jose.amengual

is there a metadata endpoint that you can curl and set a ENV variable?

jose.amengual avatar
jose.amengual

you could use the same local healthcheck to actually set it

RB avatar

we’d have to set the env variable in the task definition and it would be a lot to update all of our tds.

the metadata endpoint is something we can curl from the container

RB avatar

can’t believe amazon doesn’t deliver the launch type in the metadata

RB avatar

a label may be a better option than an env variable

RB avatar

since the labels can be queried from the /tasks endpoint

RB avatar

@jose.amengual check this out

https://github.com/mackerelio/mackerel-container-agent/blob/c70de86ba1256fb0bfadba0a98237bf91a75b5db/platform/ecs/ecs.go#L21

basically we can query off of the env variable AWS_EXECUTION_ENV which can either be AWS_ECS_FARGATE or AWS_ECS_EC2
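
So the check can stay entirely local to the container, something like this minimal sketch:

import os

def ecs_launch_type():
    # AWS_EXECUTION_ENV is set by the platform; values observed are
    # "AWS_ECS_FARGATE" and "AWS_ECS_EC2" (treat anything else as unknown)
    env = os.environ.get("AWS_EXECUTION_ENV", "")
    if env == "AWS_ECS_FARGATE":
        return "FARGATE"
    if env == "AWS_ECS_EC2":
        return "EC2"
    return "UNKNOWN"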

RB avatar
Is it possible to detect Fargate without trying the metadata API

Is there a possibility for an application that is launched as Fargate task to determine if it runs inside Amazon ECS without trying the task metadata endpoint? It would be great if there are envir…

jose.amengual avatar
jose.amengual

ohhhh cool

Zach avatar

I’m curious what the use-case is for this

2
jose.amengual avatar
jose.amengual

imagine you have tasks on fargate that you are migrating to ecs+ec2 and they set certain cpu and memory attributes based on available memory; you could use this to set different defaults so the tasks can run without issues

Zach avatar

So you’d allow the container, once already launched, to modify its own task definition?

RB avatar

We use it for a fatlib that gets the ddagent hostname ip

jose.amengual avatar
jose.amengual

ddagent as datadog agent?

RB avatar

yessir

jose.amengual avatar
jose.amengual

for APM tracing you need that?

RB avatar

yessir

jose.amengual avatar
jose.amengual

but I thought running as a daemon you could use the hostname?

RB avatar

how so ?

RB avatar

what host name ?

RB avatar

for ecs, we’ve been using the hostname as the ip of the ec2 via the metadata url, for fargate, we’ve been doing something similar

RB avatar

is it best to use a different string ?

jose.amengual avatar
jose.amengual

is this for this bit of the dd docs :

Assign the private IP address for each underlying instance your containers are running on in your application container to the DD_AGENT_HOST environment variable. This allows your application traces to be shipped to the Agent. The Amazon's EC2 metadata endpoint allows discovery of the private IP address. To get the private IP address for each host, curl the following URL:

curl http://169.254.169.254/latest/meta-data/local-ipv4

and set the result as your Trace Agent Hostname environment variable for each application container shipping to APM:

os.environ['DD_AGENT_HOST'] = <EC2_PRIVATE_IP>

In cases where variables on your ECS application are set at launch time, you must set the hostname as an environment variable with DD_AGENT_HOST. Otherwise, you can set the hostname in your application's source code for Python, Javascript, or Ruby. For Java and .NET you can set the hostname in the ECS task. For example:
RB avatar

yep so thats how we do it for ecs ec2
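
That ECS-on-EC2 pattern, sketched in Python along the lines of the Datadog docs quoted above (timeout and error handling kept minimal):

import os
import urllib.request

# resolve the host's private IP from the EC2 instance metadata endpoint
private_ip = urllib.request.urlopen(
    "http://169.254.169.254/latest/meta-data/local-ipv4", timeout=2
).read().decode()

# point the Datadog tracer at the agent running on the host
os.environ["DD_AGENT_HOST"] = private_ip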

RB avatar

how do you do it for fargate

jose.amengual avatar
jose.amengual

we do not use fargate for the stuff I manage

jose.amengual avatar
jose.amengual

but since it is in daemon mode the hostname of the datadog task can be passed as a parameter to the collector

jose.amengual avatar
jose.amengual

like :

jose.amengual avatar
jose.amengual

dd.agent.host

jose.amengual avatar
jose.amengual

which will be the hostname of the container running in daemon mode

jose.amengual avatar
jose.amengual

by default is datadog

jose.amengual avatar
jose.amengual

I think

RB avatar

thats for java tho. we use python. would it be the same config

jose.amengual avatar
jose.amengual

same

Yoni Leitersdorf (Indeni Cloudrail) avatar
Yoni Leitersdorf (Indeni Cloudrail)

Something I’d like to verify: The public/private of an Aurora cluster is dependent on the public flag on the instances within the cluster (and of course, routing to the IGW).

Is that correct?

I’m looking here.

Cameron Boulton avatar
Cameron Boulton

Yea, mostly. That config controls whether the instance has a publicly routable Internet IP address or a private one from your subnet(s) space, if that makes sense.

Cameron Boulton avatar
Cameron Boulton

As you say, that has no bearing on whether the path is routable (route tables/IGW and/or network ACL) or firewalled (security group).

Yoni Leitersdorf (Indeni Cloudrail) avatar
Yoni Leitersdorf (Indeni Cloudrail)

So the cluster “itself” doesn’t have its own public/private setting right?

Cameron Boulton avatar
Cameron Boulton

Yea, I think that’s right; seems like it’s implied by the instance level setting.

Yoni Leitersdorf (Indeni Cloudrail) avatar
Yoni Leitersdorf (Indeni Cloudrail)

OK thanks

Cameron Boulton avatar
Cameron Boulton

I’ve never tried adding one instance public and one instance private; not sure what would happen there. Probably an error.

Yoni Leitersdorf (Indeni Cloudrail) avatar
Yoni Leitersdorf (Indeni Cloudrail)

I just tried that. not an error. The public/private ended up being set by the first instance in the cluster.

1
Cameron Boulton avatar
Cameron Boulton

Wonder if the second (private) would be accessible by the cluster’s reader endpoint then (assuming it resolves to public IPs).

Yoni Leitersdorf (Indeni Cloudrail) avatar
Yoni Leitersdorf (Indeni Cloudrail)

Happy to give you my TF code to check for yourself

Cameron Boulton avatar
Cameron Boulton

All good; just wondering out loud
