#aws (2020-03)
Discussion related to Amazon Web Services (AWS)
Discussion related to Amazon Web Services (AWS)
Archive: https://archive.sweetops.com/aws/
2020-03-01
2020-03-02
![Manuel Pirez avatar](https://secure.gravatar.com/avatar/3e12e3458ec1e4647adc58e71f82eadd.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0018-72.png)
Hi, I would like to know if someone have an EKS cluster in any environment with only fargate implemented with the terraform-aws-eks-fargate-profile module. I have tried to deploy a cluster with these module and the pods always remain pending. I even deployed with eksctl and I was comparing permissions of IAM, VPC, Tags etc … and apparently it is the same on the AWS side, but with eksctl the pods work and with the terrafom modules not.
Terraform module to provision an EKS Fargate Profile - cloudposse/terraform-aws-eks-fargate-profile
![Andriy Knysh (Cloud Posse) avatar](https://avatars.slack-edge.com/2018-06-13/382332470551_54ed1a5d986e2068fd9c_72.jpg)
did you see the complete working example https://github.com/cloudposse/terraform-aws-eks-fargate-profile/tree/master/examples/complete
Terraform module to provision an EKS Fargate Profile - cloudposse/terraform-aws-eks-fargate-profile
![Andriy Knysh (Cloud Posse) avatar](https://avatars.slack-edge.com/2018-06-13/382332470551_54ed1a5d986e2068fd9c_72.jpg)
Terraform module to provision an EKS Fargate Profile - cloudposse/terraform-aws-eks-fargate-profile
![Andriy Knysh (Cloud Posse) avatar](https://avatars.slack-edge.com/2018-06-13/382332470551_54ed1a5d986e2068fd9c_72.jpg)
take a look at this https://github.com/cloudposse/terraform-aws-eks-fargate-profile/blob/master/test/src/examples_complete_test.go#L109
Terraform module to provision an EKS Fargate Profile - cloudposse/terraform-aws-eks-fargate-profile
![Andriy Knysh (Cloud Posse) avatar](https://avatars.slack-edge.com/2018-06-13/382332470551_54ed1a5d986e2068fd9c_72.jpg)
before you deploy any k8s resource to the fargate nodes (into the namespace for which the fargate profile was created), the nodes will be in Pending state
![Andriy Knysh (Cloud Posse) avatar](https://avatars.slack-edge.com/2018-06-13/382332470551_54ed1a5d986e2068fd9c_72.jpg)
@Manuel Pirez ^
![Manuel Pirez avatar](https://secure.gravatar.com/avatar/3e12e3458ec1e4647adc58e71f82eadd.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0018-72.png)
Thanks @Andriy Knysh (Cloud Posse)
2020-03-03
![Karoline Pauls avatar](https://secure.gravatar.com/avatar/cc618da2f4549787aad7d1ebaa6ac630.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
So I’m trying to understand what the minimal permissions OpsWorks requires are - https://www.terraform.io/docs/providers/aws/r/opsworks_stack.html.
I’m supposed to give it a service role and give instances a default profile. The only thing I’m supposed to use OpsWorks for are users and SSH keys, I want the rest immutable.
Now, logically this would only require opsworks agents running on instances running on instances to be able to pull keys. (This is VERY simple on Google Cloud… but this is AWS…)
Can I give opsworks an empty policy and the instances - some policy that only allows it to pull keys somehow?
Provides an OpsWorks stack resource.
![Karoline Pauls avatar](https://secure.gravatar.com/avatar/cc618da2f4549787aad7d1ebaa6ac630.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
Does this mean I cannot use OpsWorks to manage SSH keys (and nothing more, ever) on my spot instances in ASGs?
![Maciek Strömich avatar](https://secure.gravatar.com/avatar/98de12365b633b063e208220100d4594.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0002-72.png)
@Karoline Pauls why do you want to use ssh when you have system manager available?
![Karoline Pauls avatar](https://secure.gravatar.com/avatar/cc618da2f4549787aad7d1ebaa6ac630.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
TBH I don’t want to have anything to do with OpsWorks and I don’t know what system manager is.
![Karoline Pauls avatar](https://secure.gravatar.com/avatar/cc618da2f4549787aad7d1ebaa6ac630.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
I just want to get the SSH keys thing out of the way without doing anything stupid in IAM.
![Karoline Pauls avatar](https://secure.gravatar.com/avatar/cc618da2f4549787aad7d1ebaa6ac630.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
I think i’ll just ignore what was done before and will use https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/Connect-using-EC2-Instance-Connect.html
Connect to your Linux instances using EC2 Instance Connect.
![Karoline Pauls avatar](https://secure.gravatar.com/avatar/cc618da2f4549787aad7d1ebaa6ac630.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
Because I don’t have time for learning about yet-another-semiredundant-overengineered-feature-of-aws. Don’t want to sound dismissive but I want working infrastructure with few carefully chosen bits of AWS, not a menagerie of AWS services.
![Karoline Pauls avatar](https://secure.gravatar.com/avatar/cc618da2f4549787aad7d1ebaa6ac630.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
Looks like AWS also doesn’t recommend OpsWorks for what I’m supposed to use it for because stack creation is taking 20 minutes already.
![Nikola Velkovski avatar](https://avatars.slack-edge.com/2018-11-08/474538495603_cc9e62a39b3dbc9d8d65_72.png)
I would avoid OpsWorks competelly, I’ve had the chance of using it and I can only describe it as a mess.
![Maciek Strömich avatar](https://secure.gravatar.com/avatar/98de12365b633b063e208220100d4594.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0002-72.png)
@Nikola Velkovski puppet 4 life! :D
![Karoline Pauls avatar](https://secure.gravatar.com/avatar/cc618da2f4549787aad7d1ebaa6ac630.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
I liked Salt, you could render states with jinja (unlike Ansible when i used it) and since Jinja can almost run like normal Python code and definitely can build data structures, for advanced uses you could use the json
filter to output some dynamically built code that would parse as YAML.
Of course, as someone who dabbled with Clojure, I find this whole “let’s render data with text interpolation” thing shameful to our industry. Any expression-oriented dynamically typed programming language with rich data literals may be used to render heterogeneous data.
![Karoline Pauls avatar](https://secure.gravatar.com/avatar/cc618da2f4549787aad7d1ebaa6ac630.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
interestingly, even terraform acknowledges that: https://www.terraform.io/docs/configuration/functions/templatefile.html#generating-json-or-yaml-from-a-template
The templatefile function reads the file at the given path and renders its content as a template.
![Nikola Velkovski avatar](https://avatars.slack-edge.com/2018-11-08/474538495603_cc9e62a39b3dbc9d8d65_72.png)
![Karoline Pauls avatar](https://secure.gravatar.com/avatar/cc618da2f4549787aad7d1ebaa6ac630.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
it seems i managed to find the minimal set of permissions to register an instance, now i’m seeing it wants to install Ruby before proceeding (and gets 403)… other than the 403, installing Ruby is exactly something that I want to happen on Spot EC2 Kubernetes workers in an autoscaling group… Where I want startup times to be as fast as possible…
![maarten avatar](https://avatars.slack-edge.com/2020-09-28/1393040065826_b0d13cfde15deff02026_72.png)
Hi, why do you need to install ruby ?
![Karoline Pauls avatar](https://secure.gravatar.com/avatar/cc618da2f4549787aad7d1ebaa6ac630.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
I don’t, OpsWorks agent wants
![joshmyers avatar](https://avatars.slack-edge.com/2018-11-20/483958217281_8117d6f6c62807ce9912_72.jpg)
I’d have thought that would be all done in an omnibus package or similar
![joshmyers avatar](https://avatars.slack-edge.com/2018-11-20/483958217281_8117d6f6c62807ce9912_72.jpg)
Been many years since used Chef (never used opsworks)
![maarten avatar](https://avatars.slack-edge.com/2020-09-28/1393040065826_b0d13cfde15deff02026_72.png)
Opsworks with spot instances for k8s workers, am I reading this right ?
![joshmyers avatar](https://avatars.slack-edge.com/2018-11-20/483958217281_8117d6f6c62807ce9912_72.jpg)
Agree with others here, wouldn’t go near Opsworks especially for your use case. ^^
![maarten avatar](https://avatars.slack-edge.com/2020-09-28/1393040065826_b0d13cfde15deff02026_72.png)
You will loose so much time with opsworks, and it’s from a time docker wasn’t really there yet.
![randomy avatar](https://secure.gravatar.com/avatar/fa1b9e14d93f78f4c21a1238ae7984cb.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0001-72.png)
Why are you using Opsworks when you clearly don’t want to?
![Karoline Pauls avatar](https://secure.gravatar.com/avatar/cc618da2f4549787aad7d1ebaa6ac630.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
I agree with you all, I wouldn’t touch OpsWorks too
![maarten avatar](https://avatars.slack-edge.com/2020-09-28/1393040065826_b0d13cfde15deff02026_72.png)
but ?
![Karoline Pauls avatar](https://secure.gravatar.com/avatar/cc618da2f4549787aad7d1ebaa6ac630.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
But someone decided to use OpsWorks to manage people’s SSH access while I was busy working in another team porting a particularly blobby bit of software to Python 3.
![Karoline Pauls avatar](https://secure.gravatar.com/avatar/cc618da2f4549787aad7d1ebaa6ac630.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
![maarten avatar](https://avatars.slack-edge.com/2020-09-28/1393040065826_b0d13cfde15deff02026_72.png)
Ah ok, but then it’s a good moment to communicate that to this person that this might not be the best way forward, copy paste the slack to this person when needed.
![randomy avatar](https://secure.gravatar.com/avatar/fa1b9e14d93f78f4c21a1238ae7984cb.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0001-72.png)
and start again
![Nikola Velkovski avatar](https://avatars.slack-edge.com/2018-11-08/474538495603_cc9e62a39b3dbc9d8d65_72.png)
include me in the screenshot!
![Nikola Velkovski avatar](https://avatars.slack-edge.com/2018-11-08/474538495603_cc9e62a39b3dbc9d8d65_72.png)
OPSWORKS
![maarten avatar](https://avatars.slack-edge.com/2020-09-28/1393040065826_b0d13cfde15deff02026_72.png)
Also why is ssh access needed to begin with ?
![Karoline Pauls avatar](https://secure.gravatar.com/avatar/cc618da2f4549787aad7d1ebaa6ac630.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
most of the time it’s unnecessary but sometimes it’s useful to SSH to strace something in a container (yes, there is probably a tool for that), or do other sorts of debugging
![Karoline Pauls avatar](https://secure.gravatar.com/avatar/cc618da2f4549787aad7d1ebaa6ac630.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
I don’t really SSH there most of the time
![maarten avatar](https://avatars.slack-edge.com/2020-09-28/1393040065826_b0d13cfde15deff02026_72.png)
For executing inside the container kubectl exec
can be used, ssh is not needed there.
![Karoline Pauls avatar](https://secure.gravatar.com/avatar/cc618da2f4549787aad7d1ebaa6ac630.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
you cannot really strace this or other containers from there
![Karoline Pauls avatar](https://secure.gravatar.com/avatar/cc618da2f4549787aad7d1ebaa6ac630.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
IIRC even if you run as the root, you cannot [normally?] strace from docker because that would circumvent security
![Karoline Pauls avatar](https://secure.gravatar.com/avatar/cc618da2f4549787aad7d1ebaa6ac630.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
yeah, you need --cap-add sys_ptrace
![maarten avatar](https://avatars.slack-edge.com/2020-09-28/1393040065826_b0d13cfde15deff02026_72.png)
Ok but in that case you can also run telepresence
from a host with trace capabilities and do it there.
https://www.telepresence.io
Telepresence: a local development environment for a remote Kubernetes cluster
![maarten avatar](https://avatars.slack-edge.com/2020-09-28/1393040065826_b0d13cfde15deff02026_72.png)
Might look complicated, but it’s far less complicated than maintaining Opsworks
![Karoline Pauls avatar](https://secure.gravatar.com/avatar/cc618da2f4549787aad7d1ebaa6ac630.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
i currently can ssh, except it’s done by a 10 lines long script executed on instance init
![Karoline Pauls avatar](https://secure.gravatar.com/avatar/cc618da2f4549787aad7d1ebaa6ac630.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
looks like it will remain so for now because I’ve got real work to do
![Karoline Pauls avatar](https://secure.gravatar.com/avatar/cc618da2f4549787aad7d1ebaa6ac630.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
and i’ll later probably try out https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-connect-methods.html
The following instructions explain how to connect to your Linux instance using EC2 Instance Connect.
![randomy avatar](https://secure.gravatar.com/avatar/fa1b9e14d93f78f4c21a1238ae7984cb.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0001-72.png)
https://docs.aws.amazon.com/systems-manager/latest/userguide/ssm-agent.html is worth a look too. It can be used for SSH access (look up SSM Session Manager) and also to run commands on multiple instances and get back the results (look up SSM Run Commands)
Install SSM Agent on Amazon EC2 instances, and on-premises servers and virtual machines (VMs), to enable AWS Systems Manager to update, manage,and configure these resources.
![maarten avatar](https://avatars.slack-edge.com/2020-09-28/1393040065826_b0d13cfde15deff02026_72.png)
here an example of a bastion using the ssm agent
AWS Bastion server which can reside in the private subnet utilizing Systems Manager Sessions - Flaconi/terraform-aws-bastion-ssm-iam
![Karoline Pauls avatar](https://secure.gravatar.com/avatar/cc618da2f4549787aad7d1ebaa6ac630.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
looks better
![Chris Fowles avatar](https://avatars.slack-edge.com/2019-10-08/789284772630_caabfcff3b09cf0455ee_72.jpg)
OpWorks is a steaming pile of trash - I’ve had to do demos of it for AWS and even in that tightly constrained scenario it rarely worked
![Chris Fowles avatar](https://avatars.slack-edge.com/2019-10-08/789284772630_caabfcff3b09cf0455ee_72.jpg)
SSM instance connect is actually worth a look though - the rest of the SSM space is pretty questionable as far as I’m concerned (except param store) but the instance connect takes away a lot of pain pretty simply - gives you iam controlled, audited connectivity from either console or cli.
2020-03-04
![Saichovsky avatar](https://secure.gravatar.com/avatar/ab18e1173a03b8e8509206002f0c4717.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
Anyone able to help me with a CloudFormation/LandingZone issue? (description in thread)
![Saichovsky avatar](https://secure.gravatar.com/avatar/ab18e1173a03b8e8509206002f0c4717.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
I have two templates - one defining the execution role and the other the assumed role. I need to link these accounts in a trust relationship. How do I import the value of the execution role into the template for the assumed role? I had created an Outputs section in the executionrole.template, hoping to use SSM but I feel that it will get exported to the SSM in the master account while the value will be needed in the child account. So I need to pass the execution role ARN from one account to a stack running in a different account
![Maciek Strömich avatar](https://secure.gravatar.com/avatar/98de12365b633b063e208220100d4594.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0002-72.png)
not sure 100% but it seems that this CF feature might help you https://aws.amazon.com/about-aws/whats-new/2020/02/aws-cloudformation-stacksets-introduces-automatic-deployments-across-accounts-and-regions-through-aws-organizations/
![RB avatar](https://avatars.slack-edge.com/2020-02-26/958727689603_86844033e59114029b3c_72.png)
has anyone got any direction on the tls issues surrounding bucket names with periods ?
the S3 deprecation plan https://aws.amazon.com/blogs/aws/amazon-s3-path-deprecation-plan-the-rest-of-the-story/
Bucket Names with Dots – It is important to note that bucket names with “.” characters are perfectly valid for website hosting and other use cases. However, there are some known issues with TLS and with SSL certificates. We are hard at work on a plan to support virtual-host requests to these buckets, and will share the details well ahead of September 30, 2020.
For example, we cannot use the <https://company.bucket-name.s3.amazonaws.com/mypath>
url if our bucket name containers a period. In this case the bucket name is company.bucket-name
. If we use curl and its https link, it will fail unless we skip certificate validation using --insecure
![loren avatar](https://secure.gravatar.com/avatar/d1e25dcfbc68a0857a04dd78c9afe952.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0003-72.png)
don’t use periods in bucket names?
![RB avatar](https://avatars.slack-edge.com/2020-02-26/958727689603_86844033e59114029b3c_72.png)
lol. let me just get in my time machine
![loren avatar](https://secure.gravatar.com/avatar/d1e25dcfbc68a0857a04dd78c9afe952.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0003-72.png)
have you tried using cloudfront as a workaround? you can use cloudfront to front an s3 bucket for https access
![RB avatar](https://avatars.slack-edge.com/2020-02-26/958727689603_86844033e59114029b3c_72.png)
we haven’t. for now, we were waiting to hear back from amazon to see what they say while we use the old url
![loren avatar](https://secure.gravatar.com/avatar/d1e25dcfbc68a0857a04dd78c9afe952.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0003-72.png)
though the article says they are trying to come up with a solution for bucket names with periods…
Bucket Names with Dots – It is important to note that bucket names with “.” characters are perfectly valid for website hosting and other use cases. However, there are some known issues with TLS and with SSL certificates. We are hard at work on a plan to support virtual-host requests to these buckets, and will share the details well ahead of September 30, 2020.
Host a static website on Amazon S3 by configuring your bucket for website hosting and then uploading your content to the bucket.
![RB avatar](https://avatars.slack-edge.com/2020-02-26/958727689603_86844033e59114029b3c_72.png)
they did say they would share the details well ahead of the sep 30, 2020 timeframe so looking for the official reply.
![RB avatar](https://avatars.slack-edge.com/2020-02-26/958727689603_86844033e59114029b3c_72.png)
i quoted that same bit in the OP
![loren avatar](https://secure.gravatar.com/avatar/d1e25dcfbc68a0857a04dd78c9afe952.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0003-72.png)
![RB avatar](https://avatars.slack-edge.com/2020-02-26/958727689603_86844033e59114029b3c_72.png)
ive asked my TAM and am waiting to hear back
![loren avatar](https://secure.gravatar.com/avatar/d1e25dcfbc68a0857a04dd78c9afe952.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0003-72.png)
they’ve adjusted the plan once already. i’m half-expecting the date to push out. with these blog posts being a scare-tactic to get customers to come forward with issues and use cases, and start folks changing behavior earlier so there are fewer issues after the real, eventual change
![loren avatar](https://secure.gravatar.com/avatar/d1e25dcfbc68a0857a04dd78c9afe952.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0003-72.png)
my brain just exploded
![RB avatar](https://avatars.slack-edge.com/2020-02-26/958727689603_86844033e59114029b3c_72.png)
lol
![RB avatar](https://avatars.slack-edge.com/2020-02-26/958727689603_86844033e59114029b3c_72.png)
we also depend heavily on other company’s s3 urls and they use periods in their buckets
![RB avatar](https://avatars.slack-edge.com/2020-02-26/958727689603_86844033e59114029b3c_72.png)
now we have to nag them to get them to add an access point? ridiculous…
![loren avatar](https://secure.gravatar.com/avatar/d1e25dcfbc68a0857a04dd78c9afe952.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0003-72.png)
yep
![loren avatar](https://secure.gravatar.com/avatar/d1e25dcfbc68a0857a04dd78c9afe952.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0003-72.png)
well, it sounded from the blog that if the bucket already exists then you’ll be fine. just new buckets after 30 september get impacted
Revised Plan – Support for the path-style model continues for buckets created on or before September 30, 2020. Buckets created after that date must be referenced using the virtual-hosted model.
![Tan Quach avatar](https://secure.gravatar.com/avatar/d656c9b04f2e5cdfb64d0662556eaa38.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0006-72.png)
Seems the version for this module went from 0.7.0 to 0.3.2 recently https://github.com/cloudposse/terraform-aws-s3-bucket/releases
Is that the correct next version?
Terraform module that creates an S3 bucket with an optional IAM user for external CI/CD systems - cloudposse/terraform-aws-s3-bucket
![RB avatar](https://avatars.slack-edge.com/2020-02-26/958727689603_86844033e59114029b3c_72.png)
^ maybe in the #terraform channel
Terraform module that creates an S3 bucket with an optional IAM user for external CI/CD systems - cloudposse/terraform-aws-s3-bucket
![Erik Osterman (Cloud Posse) avatar](https://secure.gravatar.com/avatar/88c480d4f73b813904e00a5695a454cb.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0023-72.png)
We sometimes release a bug fix for earlier versions of the module for 0.11
![omerfsen avatar](https://secure.gravatar.com/avatar/b66c1225c52ce7769292f48c16d03f0f.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0012-72.png)
Issue is the version went backwards I think
![Erik Osterman (Cloud Posse) avatar](https://secure.gravatar.com/avatar/88c480d4f73b813904e00a5695a454cb.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0023-72.png)
Also, one heads up on this module: make sure the lifecycle rules are right for you
![Erik Osterman (Cloud Posse) avatar](https://secure.gravatar.com/avatar/88c480d4f73b813904e00a5695a454cb.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0023-72.png)
we’re going to disable them by default in an upcoming PR, but still expose the functionality
![Tan Quach avatar](https://secure.gravatar.com/avatar/d656c9b04f2e5cdfb64d0662556eaa38.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0006-72.png)
(also not sure where to post that question )
![Erik Osterman (Cloud Posse) avatar](https://secure.gravatar.com/avatar/88c480d4f73b813904e00a5695a454cb.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0023-72.png)
2020-03-05
![omerfsen avatar](https://secure.gravatar.com/avatar/b66c1225c52ce7769292f48c16d03f0f.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0012-72.png)
Hello team. What do you suggest to fine tune permissions on AWS EKS pods ? I am planning to setup an multi-tenant AWS EKS and plan to use Namespaces for seperation but i need some advice I have found https://aws.amazon.com/blogs/opensource/introducing-fine-grained-iam-roles-service-accounts/ . I know AWS EKS does not have built-in fine tuned separation of duties. So what would be your suggestion ? I plan to hold each customer on separate namespace and only allow a service account to accesss to specific namespac
![attachment image](https://d2908q01vomqb2.cloudfront.net/ca3512f4dfa95a03169c5a670a4c91a19b3077b4/2019/09/04/irp-eks-setup-1024x1015-636x630.png)
Here at AWS we focus first and foremost on customer needs. In the context of access control in Amazon EKS, you asked in issue #23 of our public container roadmap for fine-grained IAM roles in EKS. To address this need, the community came up with a number of open source solutions, such as kube2iam, kiam, […]
![omerfsen avatar](https://secure.gravatar.com/avatar/b66c1225c52ce7769292f48c16d03f0f.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0012-72.png)
Also i am planning to use Fargate on this setup.
![omerfsen avatar](https://secure.gravatar.com/avatar/b66c1225c52ce7769292f48c16d03f0f.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0012-72.png)
Or should i use https://docs.aws.amazon.com/eks/latest/userguide/enable-iam-roles-for-service-accounts.html
The IAM roles for service accounts feature is available on new Amazon EKS Kubernetes version 1.14 clusters, and clusters that were updated to versions 1.14 or 1.13 on or after September 3rd, 2019. Existing clusters can update to version 1.13 or 1.14 to take advantage of this feature. For more information, see
![Erik Osterman (Cloud Posse) avatar](https://secure.gravatar.com/avatar/88c480d4f73b813904e00a5695a454cb.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0023-72.png)
We use IAM roles for service accounts, and associate each service account with the pods that need it. E.g. we create an IAM role with the requisite polices for external-dns
then associate that with a service account that we pass to the external-dns
deployment/pods.
The IAM roles for service accounts feature is available on new Amazon EKS Kubernetes version 1.14 clusters, and clusters that were updated to versions 1.14 or 1.13 on or after September 3rd, 2019. Existing clusters can update to version 1.13 or 1.14 to take advantage of this feature. For more information, see
![Erik Osterman (Cloud Posse) avatar](https://secure.gravatar.com/avatar/88c480d4f73b813904e00a5695a454cb.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0023-72.png)
We then do this for every other service we deploy
![Erik Osterman (Cloud Posse) avatar](https://secure.gravatar.com/avatar/88c480d4f73b813904e00a5695a454cb.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0023-72.png)
like cert-manager
![omerfsen avatar](https://secure.gravatar.com/avatar/b66c1225c52ce7769292f48c16d03f0f.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0012-72.png)
I see so you dont use kiam or kube2iam but this newly announced feature of AWS Eks https://aws.amazon.com/blogs/opensource/introducing-fine-grained-iam-roles-service-accounts/
![attachment image](https://d2908q01vomqb2.cloudfront.net/ca3512f4dfa95a03169c5a670a4c91a19b3077b4/2019/09/04/irp-eks-setup-1024x1015-636x630.png)
Here at AWS we focus first and foremost on customer needs. In the context of access control in Amazon EKS, you asked in issue #23 of our public container roadmap for fine-grained IAM roles in EKS. To address this need, the community came up with a number of open source solutions, such as kube2iam, kiam, […]
![Erik Osterman (Cloud Posse) avatar](https://secure.gravatar.com/avatar/88c480d4f73b813904e00a5695a454cb.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0023-72.png)
Yes, for EKS we use the new service accounts. for kops we still use kiam- haven’t explored it yet.
![Erik Osterman (Cloud Posse) avatar](https://secure.gravatar.com/avatar/88c480d4f73b813904e00a5695a454cb.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0023-72.png)
Service accounts are much better once you have it working.
![kskewes avatar](https://avatars.slack-edge.com/2019-04-05/592663364881_098e29e5a0fe63cc7c82_72.jpg)
Yep to service accounts. For multi tenant if you need hard boundaries I’d consider a separate cluster. Otherwise namespace network policy, service accounts, pod security policy, OPA on admission, etc.
![davidvasandani avatar](https://avatars.slack-edge.com/2019-10-02/784259469622_7d9e31719822afd94ef8_72.jpg)
When making requests to a private AWI GW through an internal NLB the request is routed to the correct stage but the stage name is also included in the resourcePath which breaks the request. This only happens when making the request thought the NLB.
2020-03-06
![Maciek Strömich avatar](https://secure.gravatar.com/avatar/98de12365b633b063e208220100d4594.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0002-72.png)
hey, does anyone has ready to use cloudwatch alarms for kinesis firehose in cloudformation format e.g. for Delivery.DataFresshness, or ThrottledRecords or even better for those math expressions specified in https://docs.aws.amazon.com/firehose/latest/dev/monitoring-with-cloudwatch-metrics.html#firehose-metric-dimensions? (my friday lazy ass is asking (-: )
Learn how to use CloudWatch Metrics to monitor delivery streams in Amazon Kinesis Data Firehose.
![joshmyers avatar](https://avatars.slack-edge.com/2018-11-20/483958217281_8117d6f6c62807ce9912_72.jpg)
Not for that exact use but math expressions in CL alarms aye…
![Maciek Strömich avatar](https://secure.gravatar.com/avatar/98de12365b633b063e208220100d4594.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0002-72.png)
I wonder why they don’t include cw alarms in the docs of the resource.
![Maciek Strömich avatar](https://secure.gravatar.com/avatar/98de12365b633b063e208220100d4594.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0002-72.png)
e.g. https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-kinesisfirehose-deliverystream.html has a complete example with bucket, roles, policies and deliverystream
Use the AWS CloudFormation AWS::DeliveryStream resource for KinesisFirehose.
![Maciek Strömich avatar](https://secure.gravatar.com/avatar/98de12365b633b063e208220100d4594.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0002-72.png)
why they wouldn’t just add an example cw alarm in there?
![Maciek Strömich avatar](https://secure.gravatar.com/avatar/98de12365b633b063e208220100d4594.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0002-72.png)
ok, so for data freshness it should alarm only if 900 secs mark is crossed because that’s the maximum firehose limit
![Maciek Strömich avatar](https://secure.gravatar.com/avatar/98de12365b633b063e208220100d4594.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0002-72.png)
for anyone interested ;-))
FirehoseDeliveryDataFreshness:
Type: AWS::CloudWatch::Alarm
Properties:
AlarmDescription: Firehose Delivery Data Freshness Alarm
AlarmName: FirehoseDeliveryDataFreshness
ComparisonOperator: GreaterThanThreshold
Period: 60
EvaluationPeriods: 3
Threshold: 150
Statistic: Maximum
MetricName: DeliveryToS3.DataFreshness
Namespace: AWS/Firehose
Dimensions:
- Name: DeliveryStreamName
Value: !Ref FirehoseDeliveryStream
TreatMissingData: breaching
AlarmActions:
- !Ref 'AlarmTopicArn'
InsufficientDataActions:
- !Ref 'AlarmTopicArn'
![Maciek Strömich avatar](https://secure.gravatar.com/avatar/98de12365b633b063e208220100d4594.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0002-72.png)
my bufferinghints are set to 120 secs which means that on avg kinesis will deliver on the 121 second.
![Maciek Strömich avatar](https://secure.gravatar.com/avatar/98de12365b633b063e208220100d4594.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0002-72.png)
I gave it a littlebit more room
2020-03-10
![kj22594 avatar](https://secure.gravatar.com/avatar/020d15eec6c879209a42d38a5af35b86.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0010-72.png)
Hey all, I have a two questions regarding AWS Parameter Store:
1) Does anyone know of a good way to find when the last time a secret has been accessed? 2) Is there a good way to find how many times a given parameter has been accessed in a given time frame (ideally 24 hours)?
![tomv avatar](https://secure.gravatar.com/avatar/44a7bbeb9f6711e508a0141d8b365470.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0019-72.png)
I think cloudtrail would be best suited to answer both questions
![kj22594 avatar](https://secure.gravatar.com/avatar/020d15eec6c879209a42d38a5af35b86.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0010-72.png)
So i’ve been looking into using cloudtrail for the latter. For the former I was hopeful that I wouldn’t have to do something like query all cloudtrail logs for the last time GetParameter(s) was called on each secret
![Alex Siegman avatar](https://avatars.slack-edge.com/2019-04-10/592429074434_cea95e800f54d8ea3544_72.jpg)
Anyone familiar with Firehose delivery to Redshift? From everything I can tell, it seems like your redshift needs to be publicly addressable (IE in a “public” subnet with an internet gateway) for Firehose to talk to it, complete with a /27 CIDR block range for Firehose to allow through the SG. Seems just weird to me to keep data stores publicly accessible like that. Is there any configurations out there where it can stay in the private subnet, but still get delivered to from firehose?
![Maciek Strömich avatar](https://secure.gravatar.com/avatar/98de12365b633b063e208220100d4594.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0002-72.png)
You can deliver to S3 and pull the data from there into redshift.
![Alex Siegman avatar](https://avatars.slack-edge.com/2019-04-10/592429074434_cea95e800f54d8ea3544_72.jpg)
Yeah, thought of that work around this morning before seeing the message. Seems to be the way to do it.
![David Lozano avatar](https://avatars.slack-edge.com/2020-10-28/1453157962374_67b9b13d23898f6d2fda_72.png)
Hi @Alex Siegman . I think that you can achieve this using aws vpc endpoint. With Redshift in the VPC, it should be able to communicate with Firehose through the endpoint without leaving the vpc(aws backbone). I haven’t used this solution but it may work for you.
https://docs.aws.amazon.com/vpc/latest/userguide/vpc-endpoints.html
Use a VPC endpoint to privately connect your VPC to other AWS services and endpoint services.
![Alex Siegman avatar](https://avatars.slack-edge.com/2019-04-10/592429074434_cea95e800f54d8ea3544_72.jpg)
my impression was that was for outbound connectivity to services, if redshift was “Reading from” firehose, not the public firehose service delivering to redshift. Could be wrong. Might try it in a test account
![casey avatar](https://secure.gravatar.com/avatar/da69ad958719719e3fe921887eb73b1f.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0009-72.png)
anyone have strong opinions on where ETL/ML structure should be set up? i currently have airflow in our staging k8s cluster but its kind of a pain to manage cross account permissions:
• could move airflow to our prod k8s (worried about doing this)
• could create a new k8s cluster in prod account and move it there
• or could create a new aws data
account that has access to everything?
any insights/experiences are appreciated
![Marcin Brański avatar](https://secure.gravatar.com/avatar/7f3c56304d6e3adb7658889af56cd171.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0001-72.png)
Have you tried/seen kubeflow?
![casey avatar](https://secure.gravatar.com/avatar/da69ad958719719e3fe921887eb73b1f.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0009-72.png)
@Marcin Brański does kubeflow do ETL?
![casey avatar](https://secure.gravatar.com/avatar/da69ad958719719e3fe921887eb73b1f.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0009-72.png)
but yeah for ML side at least it seems good, but havent tried
2020-03-11
![Zachary Loeber avatar](https://avatars.slack-edge.com/2020-05-13/1115475485942_e68ae4d6556df390de70_72.jpg)
https://github.com/bottlerocket-os/bottlerocket -> AWS’s container OS looks interesting. Has an update strategy that reminds me of upgrading F5 LTMs….
An operating system designed for hosting containers - bottlerocket-os/bottlerocket
2020-03-12
![joshmyers avatar](https://avatars.slack-edge.com/2018-11-20/483958217281_8117d6f6c62807ce9912_72.jpg)
![Riak avatar](https://secure.gravatar.com/avatar/25c4dba3b22975994919ed2b82af4184.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0015-72.png)
Hello, Im looking for . modules that deploy eks with s3 backend + dynamodb for lock state
2020-03-13
![Maciek Strömich avatar](https://secure.gravatar.com/avatar/98de12365b633b063e208220100d4594.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0002-72.png)
![attachment image](https://d2908q01vomqb2.cloudfront.net/1b6453892473a467d07372d45eb05abc2031647a/2020/03/12/Picture3.png)
In July 2015, AWS announced Amazon API Gateway. This enabled developers to build secure, scalable APIs quickly in front of a variety of different types of architectures. Since then, the API Gateway team continues to build new features and services for customers. Figure 1: API Gateway feature highlights timeline In early 2019, the team evaluated […]
2020-03-16
![Karthik Sadhasivam avatar](https://secure.gravatar.com/avatar/8ae9d717afe8173c894993ccf9ca4390.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0001-72.png)
Hi Guys, I am new to this channel and trying to get some advice on the rolling update EC2 on the ASGs. I am trying to use this module https://registry.terraform.io/modules/cloudposse/ec2-autoscale-group/aws/0.4.0 and seeing that everytime I update userdata, instance type it just creates a new version of launch template but doesnt do any rolling update on the ASG. Is there is any sort of workarounds available as discussed in https://github.com/hashicorp/terraform/issues/1552
Once #1109 is fixed, I'd like to be able to use Terraform to actually roll out the updated launch configuration and do it carefully. Whoever decides to not roll out the update and change the LC…
![Erik Osterman (Cloud Posse) avatar](https://secure.gravatar.com/avatar/88c480d4f73b813904e00a5695a454cb.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0023-72.png)
Please avoid cross posting, but if you do, link to your original post. #terraform is the correct channel
Once #1109 is fixed, I'd like to be able to use Terraform to actually roll out the updated launch configuration and do it carefully. Whoever decides to not roll out the update and change the LC…
![Karthik Sadhasivam avatar](https://secure.gravatar.com/avatar/8ae9d717afe8173c894993ccf9ca4390.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0001-72.png)
got it
![Karthik Sadhasivam avatar](https://secure.gravatar.com/avatar/8ae9d717afe8173c894993ccf9ca4390.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0001-72.png)
thanks
2020-03-18
![Santiago Campuzano avatar](https://secure.gravatar.com/avatar/f8f05f122df51440e3bd79dd0feb089b.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0000-72.png)
Morning everyone ! Quick VPC/EC2 question. When you change the Route Table for a subnet, is it required to restart/launch new EC2 instances on that subnet to reflect the change ? Is it immediate ?
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
I guess there are some exceptions, but still not on the EC2 side. Like if you’re using BGP on a VPN and the routes are being propagated from your edge-device, then that device may need to push updates?
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
To be honest I’m not super clear on those VPN / Gateway scenarios. But if you’re just talking about entirely within the context of a single VPC, you should be good
![Santiago Campuzano avatar](https://secure.gravatar.com/avatar/f8f05f122df51440e3bd79dd0feb089b.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0000-72.png)
2020-03-20
![David avatar](https://secure.gravatar.com/avatar/4f47da5c338b83938ce2229dbbd5460f.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0003-72.png)
I’m trying to sign a request on an ECS task to sts using my Instance profile IAM Role. When I check the IAM Request headers on my requests, it looks like the session token for the instance role is not being included.
Is there a way to get the current session token on the task?
![Marcin Brański avatar](https://secure.gravatar.com/avatar/7f3c56304d6e3adb7658889af56cd171.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0001-72.png)
Wouldn’t you prefer to use task role instead of instance role?
2020-03-21
2020-03-23
![RB avatar](https://avatars.slack-edge.com/2020-02-26/958727689603_86844033e59114029b3c_72.png)
How do people here lower replication lag in aurora replicas ? I’m looking at write operations and see a direct correlation with spikes in replication lag which is expected but the spikes are well over 100 ms reaching around 1 sec
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
Its been a while since I’ve messed with perf tuning Aurora but I would look at:
• instance classes you’re using since that directly impacts network throughput
• provisioned iops
• memory buffers (in cases of mysql / pgsql)
• any development usage of the replica (data-warehouse / etl / jobs)
![maarten avatar](https://avatars.slack-edge.com/2020-09-28/1393040065826_b0d13cfde15deff02026_72.png)
Isn’t aurora replication done at I/O layer, have you contacted support ?
![RB avatar](https://avatars.slack-edge.com/2020-02-26/958727689603_86844033e59114029b3c_72.png)
we haven’t contacted support because no degradation has been seen. this was just unnerving to see from our developers and i was wondering if any other people have noticed issues and successfully lowered replication lag
![RB avatar](https://avatars.slack-edge.com/2020-02-26/958727689603_86844033e59114029b3c_72.png)
according to aws, it looks like replication lag is correlated well with write operations and we can see that side by side in the metrics
![RB avatar](https://avatars.slack-edge.com/2020-02-26/958727689603_86844033e59114029b3c_72.png)
for now, we’re raising our alert from 0.2 sec to 1.5 sec since with 1 sec lag we’re not seeing issues
![Meb avatar](https://secure.gravatar.com/avatar/22f2dd879a5accf3929330d977b39106.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0004-72.png)
We removed READ replicas due to the lag on my project. The lag can mount to few seconds on heavy write operations.
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
![Meb avatar](https://secure.gravatar.com/avatar/22f2dd879a5accf3929330d977b39106.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0004-72.png)
We were having data state inconsistencies that blew up our workflow, so we end up removing the replica and up sizing the main cluster
![Meb avatar](https://secure.gravatar.com/avatar/22f2dd879a5accf3929330d977b39106.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0004-72.png)
I mean for READ replica’s
![Meb avatar](https://secure.gravatar.com/avatar/22f2dd879a5accf3929330d977b39106.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0004-72.png)
It was a pain the read replica’s in aurora… too much lag
2020-03-24
![Santiago Campuzano avatar](https://secure.gravatar.com/avatar/f8f05f122df51440e3bd79dd0feb089b.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0000-72.png)
Hello People !! I have a simple problem and I would like to know if anyone here have worked around it. I need to know , from within an EC2 instance, what is the Target Group that the instance is attached to, and then be able to de-register from it…
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
I could be wrong but I don’t think you’ll find anything “standard” in this use-case for the following reasons:
• it predicates the instance is using a role (which isn’t required in general for launching an instance)
• the roles policy permissions would be pretty use-case specific
• sadly there’s no meta-data available for assigned tg’s (https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-categories.html)
The following table lists the categories of instance metadata.
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
It might help if you provided more context around the use-case. I’d wonder why not conditionally assign the instance in whatever process is doing that
![Santiago Campuzano avatar](https://secure.gravatar.com/avatar/f8f05f122df51440e3bd79dd0feb089b.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0000-72.png)
Thanks for the reply @androogle ! . We have an instance role with the proper permissions for querying the Load Balancers and Target Groups… and registering and de registering from them
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
what is the event that you’d like to trigger the removing of the instance from the target group?
![Santiago Campuzano avatar](https://secure.gravatar.com/avatar/f8f05f122df51440e3bd79dd0feb089b.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0000-72.png)
We have a super big application with around 20 different Target Groups on Different AWS zones… so I want to create a small Python/bash script that would be executed by the Java JVM whenever there is a critical condition, so the instance can de register itself from the TG
![Santiago Campuzano avatar](https://secure.gravatar.com/avatar/f8f05f122df51440e3bd79dd0feb089b.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0000-72.png)
So that script needs to know what is the TG the instance is attached to
![Santiago Campuzano avatar](https://secure.gravatar.com/avatar/f8f05f122df51440e3bd79dd0feb089b.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0000-72.png)
Using the AWS CLI
![Santiago Campuzano avatar](https://secure.gravatar.com/avatar/f8f05f122df51440e3bd79dd0feb089b.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0000-72.png)
![Santiago Campuzano avatar](https://secure.gravatar.com/avatar/f8f05f122df51440e3bd79dd0feb089b.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0000-72.png)
aws elbv2 deregister-targets \
--target-group-arn arn:aws:elasticloadbalancing:us-west-2:123456789012:targetgroup/my-targets/73e2d6bc24d8a067 \
--targets Id=i-1234567890abcdef0
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
got it. I’m a python person so I’d probably do it with a mixture of metadata to get the instance-id and boto3 to get the target-group
![Santiago Campuzano avatar](https://secure.gravatar.com/avatar/f8f05f122df51440e3bd79dd0feb089b.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0000-72.png)
Yep… the Instance ID is a piece of cake … getting the Target Group is more challenging … ’case we have 20+ TargetGroups
![Santiago Campuzano avatar](https://secure.gravatar.com/avatar/f8f05f122df51440e3bd79dd0feb089b.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0000-72.png)
I mean.. it’s doable on either Python or Bash… I just wanted to avoid rewriting the wheel.. in case someone have already created it
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
yeah I follow. I was just saying because of those initial bullet-points it’d be hard for anyone to make a standardized approach to that. Its a pretty unique use-case
![Santiago Campuzano avatar](https://secure.gravatar.com/avatar/f8f05f122df51440e3bd79dd0feb089b.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0000-72.png)
Yep… seems like
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
Out of curiosity, what happens with the instances that are failing?
![Santiago Campuzano avatar](https://secure.gravatar.com/avatar/f8f05f122df51440e3bd79dd0feb089b.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0000-72.png)
Java internal issues …. OutOfMemoryExceptions
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
like is it possible to configure the java app to return an http code that would cause the health-check to fail and remove it from the TG ?
![Santiago Campuzano avatar](https://secure.gravatar.com/avatar/f8f05f122df51440e3bd79dd0feb089b.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0000-72.png)
and other conditions …. so we want to take the instance of out the load balancer so we can perform post-mortem
![Santiago Campuzano avatar](https://secure.gravatar.com/avatar/f8f05f122df51440e3bd79dd0feb089b.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0000-72.png)
Otherwise the instance is terminated by the ASG
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
ahhh I gotcha
![Santiago Campuzano avatar](https://secure.gravatar.com/avatar/f8f05f122df51440e3bd79dd0feb089b.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0000-72.png)
Anyways … @androogle thanks for much for your interest and for the tips !
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
Good luck
![Santiago Campuzano avatar](https://secure.gravatar.com/avatar/f8f05f122df51440e3bd79dd0feb089b.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0000-72.png)
TY … I will post the resulting script..
![Santiago Campuzano avatar](https://secure.gravatar.com/avatar/f8f05f122df51440e3bd79dd0feb089b.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0000-72.png)
I just don’t want to reinvent the wheel in case someone has a script for this
![Santiago Campuzano avatar](https://secure.gravatar.com/avatar/f8f05f122df51440e3bd79dd0feb089b.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0000-72.png)
Thanks a lot !
2020-03-25
![Pierre Humberdroz avatar](https://avatars.slack-edge.com/2019-12-10/856434906819_d99dd3e0bce66357e0ce_72.png)
Hey,
can I somehow see which instances are out of stock for a specific aws region? we have been unable to launch c5.xlarge in eu-central-1a a couple of times this week and I am trying to find a source that will tell me when something is unavailable
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
do you mean spot or just in-general you couldn’t launch a c5.xlarge ?
![Pierre Humberdroz avatar](https://avatars.slack-edge.com/2019-12-10/856434906819_d99dd3e0bce66357e0ce_72.png)
in general
![Pierre Humberdroz avatar](https://avatars.slack-edge.com/2019-12-10/856434906819_d99dd3e0bce66357e0ce_72.png)
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
I’ve never run into that. hm I wonder if there’s anything in the API
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
Returns a list of all instance types offered. The results can be filtered by location (Region or Availability Zone). If no location is specified, the instance types offered in the current Region are returned.
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
maybe that?
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
hmm. is it presently happening? I don’t know if that metric is life or “general” availability
aws ec2 describe-instance-type-offerings --region=eu-central-1 | grep -B1 -A3 c5.xlarge
{
"InstanceType": "c5.xlarge",
"LocationType": "region",
"Location": "eu-central-1"
},
![Pierre Humberdroz avatar](https://avatars.slack-edge.com/2019-12-10/856434906819_d99dd3e0bce66357e0ce_72.png)
no happened mostly last two days in the mornings
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
but that still doesn’t narrow it down to zone
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
I’ve had issues where I can get certain instance types in us-east-1a but not like us-east-1e
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
next time it happens you could try:
aws ec2 describe-instance-type-offerings --region=eu-central-1 --location-type availability-zone | grep -B1 -A3 c5.xlarge
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
if it doesn’t return anything for the zone its erroring on then you know you can use that call
![Pierre Humberdroz avatar](https://avatars.slack-edge.com/2019-12-10/856434906819_d99dd3e0bce66357e0ce_72.png)
![Tyrone Meijn avatar](https://secure.gravatar.com/avatar/60d52311d6f51d9f6beb1173c5b8e735.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
Hey guys! I have an ECS service running my api in containers and today I noticed that the running containers stopped after 24 hours. It happened on staging as well on production today after exactly 24 hours, with the events separated excatly by time the deployment from staging to production took. Is this expected behavior?
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
are these running as a service or a 1 off task?
![Tyrone Meijn avatar](https://secure.gravatar.com/avatar/60d52311d6f51d9f6beb1173c5b8e735.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
This means they are running as a service right?
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
yup!
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
have you looked at the service / task logs when they stop?
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
also when you look at the stopped tasks, does it give a reason?
![Tyrone Meijn avatar](https://secure.gravatar.com/avatar/60d52311d6f51d9f6beb1173c5b8e735.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
This is the event log in the service itself. To me it reads like ECS itself decided to drain connections and stop the containers, rather than an unexpected error occurring in the containers.
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
is the service mapped to a target-group?
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
if the target in the target-group fails health-checks, it will de-register the task
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
Also I’d check if the service is logging to cloudwatch (or anywhere) and check the logs of the api itself
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
but to your question, no thats not normal / expected
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
happy to help debug it with you
![Tyrone Meijn avatar](https://secure.gravatar.com/avatar/60d52311d6f51d9f6beb1173c5b8e735.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
is the service mapped to a target-group?
Yes, indeed!
if the target in the target-group fails health-checks, it will de-register the task
I see in the ALB logs that there were HTTP 503 codes during the same time!
![Tyrone Meijn avatar](https://secure.gravatar.com/avatar/60d52311d6f51d9f6beb1173c5b8e735.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
happy to help debug it with you
Thank you, really appreciate that! For your information, I’m quite junior in Ops, but still the most experienced in our company.
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
cool np at all What I would try and do if I were in that position is find the api logs and corollate them to the 503 response codes if you can
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
see if its an app issue, environment issue or combo of both
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
if the services have been setup to use the aws logger, you may find logs in cloudwatch that are for that api service
![Tyrone Meijn avatar](https://secure.gravatar.com/avatar/60d52311d6f51d9f6beb1173c5b8e735.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
Thanks I found the problem: from the api we ping the postgres database to see if it’s still available. The client from node-postgres
had an unhandled error, leading to a crash. Seems like the database drops connections active longer than 24 hours, but I’ll have to investigate more.
![Tyrone Meijn avatar](https://secure.gravatar.com/avatar/60d52311d6f51d9f6beb1173c5b8e735.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
yeah postgres is probably doing a schedule backup or maintenance
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
if you’re using a HA setup of postgres (RDS with replicas) I’d ping the cluster url, not the individual instance
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
hopefully soon AWS RDS Proxy (assuming you’re using RDS) will become GA and make issues like this less prevalent
![Tyrone Meijn avatar](https://secure.gravatar.com/avatar/60d52311d6f51d9f6beb1173c5b8e735.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
We are using Aurora Serverless and I think that one is using a proxy too already. However, in de api, I make a connection with the database and perform a simple query (SELECT true
).
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
do you know if you’re using the cluster-url vs the ro endpoint?
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
that may make a difference
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
when maintenance period happens or a scale-up, the resource for the replicas may change, but to be honest I don’t have a lot of experience with Aurora Serverless
![Tyrone Meijn avatar](https://secure.gravatar.com/avatar/60d52311d6f51d9f6beb1173c5b8e735.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
this is the url: bouw-staging-db.cluster-xxxxxxxxxxxx.eu-west-1.rds.amazonaws.com
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
hm ok yeah that looks like a cluster-url format
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
since this is a staging environment, do you leverage Aurora for a cost savings measure and have it turn-off after a certain period?
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
though if the app is constantly doing a select 1 it should reset the countdown timer to shutdown, so thats probably not it. hmm
![Tyrone Meijn avatar](https://secure.gravatar.com/avatar/60d52311d6f51d9f6beb1173c5b8e735.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
Hmmm, it happened happened exactly 24 hours after container startup, in different envs (staging and prod).
![Tyrone Meijn avatar](https://secure.gravatar.com/avatar/60d52311d6f51d9f6beb1173c5b8e735.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
hehe, yeah indeed.
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
I guess I’d poke around in Cloudwatch Metrics for that Aurora instance. Look at connection count and other things to see if it drops all connections. See if the Aurora instance has events in its event log about restarting or applying queued maintenance
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
has this been recurring or just a one time thing per environment?
![Tyrone Meijn avatar](https://secure.gravatar.com/avatar/60d52311d6f51d9f6beb1173c5b8e735.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
has this been recurring or just a one time thing per environment?
We deployed yesterday, looking in the logs now to see if it happened somewhere before the deployment.
I’d like to thank you for your patience and suggestions in the meantime, appreciate it a lot
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
np One thing I’d say to keep in mind, recently AWS updated their RDS CA certificate for secure connections to RDS
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
and gave time for everyone to switch over, its possible that time has passed and anyone who hadn’t switched to using the new CA had it “forced” on them
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
causing a restart
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
hm that was supposed to be March 5th, so maybe it wasn’t that
![Tyrone Meijn avatar](https://secure.gravatar.com/avatar/60d52311d6f51d9f6beb1173c5b8e735.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
I found one entry on staging staging having the unhandled error, but it didn’t down all the tasks it seems.
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
for the target-group this service is registered in. Does the health-check allow for multiple failures before considered unhealthy?
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
or just a single one? Technically the cluster-url should be 100% available but its hard to calculate what possible edge-cases there could be
![Tyrone Meijn avatar](https://secure.gravatar.com/avatar/60d52311d6f51d9f6beb1173c5b8e735.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
Not much, but it should be enough for a recovery.
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
hmm ok so if this were on the Aurora side, it’d have to be unavailable for 30+ seconds to trigger this
![Tyrone Meijn avatar](https://secure.gravatar.com/avatar/60d52311d6f51d9f6beb1173c5b8e735.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
Uhhmm, our app is not robust in that regard: one of those unhandled errors causes the container to crash, so when the connection gets terminated, it’s an unhandled, causing a container crash.
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
ah ok. so if the db was unavailable at the time of check the process exits.
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
I’ve had similar issues, in a lot of cases we used supervisord in our containers
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
and that would be the “watchdog” of the process. but we also had APM and other reporting so we knew when the process crashed
![Tyrone Meijn avatar](https://secure.gravatar.com/avatar/60d52311d6f51d9f6beb1173c5b8e735.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
SELECT pg_terminate_backend(pid) FROM
pg_stat_activity WHERE application_name = 'api-health-check';
Just ran this, killing all connections from the api and they all crashed.
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
ouch
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
gotcha so error handling and retries might be a good takeaway
![Tyrone Meijn avatar](https://secure.gravatar.com/avatar/60d52311d6f51d9f6beb1173c5b8e735.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
haha yep Still wondering why it happened at the 24 hour mark though, but could be connection cleanup or something like that…
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
I would look at the Logs & Events
Tab in the Aurora Serverless Instance details
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
and see if there were any maintenance or restart events
![Tyrone Meijn avatar](https://secure.gravatar.com/avatar/60d52311d6f51d9f6beb1173c5b8e735.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
Cannot see anything there happening around that time. Also nothing weird in the postgres logs
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
I find APM’s priceless in triaging most issues like this. Depending on your companies appetite there are some reasonable options out there ranging from really cheap DIY, to arm and leg hosted in cloud
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
NewRelic / DataDog are pretty common
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
I’m a fan of Elastic APM as its simple but effective
![Tyrone Meijn avatar](https://secure.gravatar.com/avatar/60d52311d6f51d9f6beb1173c5b8e735.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
Thanks for the suggestions, unfortunately i’m not given the time (yet…..) to look into those solutions..
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
np, been there. Good luck Feel free to reach out
![Tyrone Meijn avatar](https://secure.gravatar.com/avatar/60d52311d6f51d9f6beb1173c5b8e735.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0007-72.png)
Thanks again! Will keep that in mind
![randomy avatar](https://secure.gravatar.com/avatar/fa1b9e14d93f78f4c21a1238ae7984cb.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0001-72.png)
No, that doesn’t sound right. I’ve just checked and have Fargate tasks running since January.
2020-03-27
![Igor avatar](https://avatars.slack-edge.com/2022-03-17/3244104166391_48a8db73944f03735a65_72.jpg)
Is chamber
able to read secrets that it didn’t write? I have set the KMS alias, and I can list the keys, but they are showing up with Version 0, and I am unable to read them.
![Igor avatar](https://avatars.slack-edge.com/2022-03-17/3244104166391_48a8db73944f03735a65_72.jpg)
Looks like it’s related to this issue: https://github.com/segmentio/chamber/issues/251
![Erik Osterman (Cloud Posse) avatar](https://secure.gravatar.com/avatar/88c480d4f73b813904e00a5695a454cb.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0023-72.png)
Adding @discourse_forum bot
![discourse_forum avatar](https://avatars.slack-edge.com/2020-03-26/1029663249525_451a74d3463357c40dbf_72.png)
@discourse_forum has joined the channel
2020-03-28
![Zach avatar](https://avatars.slack-edge.com/2020-07-21/1278358623280_e99d673db1471fc93095_72.jpg)
With RDS IAM authorizations for database users, is it possible to have ‘real users’ (ie, Tom Jones) who log into AWS via Okta SAML and assume a common role (“Developer”) still have unique RDS IAM authorizations? ie TomJones has okta username tJones@myco and assumes the Developer role, I still want him to use a database IAM user tJones, and not have access to the ‘bSmith’ database IAM user that maps to user Bob Smith.
![Zach avatar](https://avatars.slack-edge.com/2020-07-21/1278358623280_e99d673db1471fc93095_72.jpg)
Answering my own question, we found an Okta blog post that showed us how to pass additional Principal Tags through the SAML response that AWS can use, which allows us to get the exact username from Okta.
![kenny avatar](https://secure.gravatar.com/avatar/feec04cca3dc652148eda28a4c1ef93a.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0005-72.png)
, means that you don’t have to create a user for database, instead use the real user who signed using Okta SAML. Using a tag are you able to control which user should have an access to which database?
![Zach avatar](https://avatars.slack-edge.com/2020-07-21/1278358623280_e99d673db1471fc93095_72.jpg)
No you still have to create the user on the DB, and grant it the rds_iam flag so that it can use the IAM auth. What I was then seeing was that our ‘userid’ comes in as <roleid>:<oktaname> which was super user unfriendly as a login name. Eventually we found a blog from okta that said we could enable a beta feature on our account that would let us pass extra SAML tags into AWS. Note though - this required a change to ALL trust policies for the SAML, it actually broke some of our logins for awhile
![kenny avatar](https://secure.gravatar.com/avatar/feec04cca3dc652148eda28a4c1ef93a.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0005-72.png)
thanks for sharing it!
2020-03-30
![SlackBot avatar](https://slack.global.ssl.fastly.net/66f9/img/slackbot_32.png)
This message was deleted.
2020-03-31
![setheryops avatar](https://avatars.slack-edge.com/2020-03-24/1012344269937_600e8094ca1121ddff3a_72.jpg)
Anyone have thoughts on only using Workmail to forward mail to another address? Im moving my domain away from GoDaddy to R53. I currently have my email setup to forward to my gmail address and then when I send from there gmail is set to make it look like im sending from my domain address. Can I use Workmail as a “proxy” the same way and have it dump all my email to my gmail address? My other option is to just go with Gsuite basic plan and basically pay $6 a month to do the same thing.
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
FWIW, I had a similar scenario and I accomplished it with SES forwarder
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
it required a lambda and s3 and some policies but it works (for the most part)
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
there’s the caveat that gmail won’t show attached .eml’s inline
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
but thats more of a gmail issue
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
so you have to download it and open it
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
https://aws.amazon.com/blogs/messaging-and-targeting/forward-incoming-email-to-an-external-destination/ if you’re interested
![attachment image](https://d2908q01vomqb2.cloudfront.net/356a192b7913b04c54574d18c28d46e6395428ab/2017/06/23/6288c174-a286-4b65-9b3b-6199bfdaa1e0.png)
Note: This post was written by Vesselin Tzvetkov, an AWS Senior Security Architect, and by Rostislav Markov, an AWS Senior Engagement Manager. Amazon SES has included support for incoming email for several years now. However, some customers have told us that they need a solution for forwarding inbound emails to domains that aren’t managed by […]
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
I went that route since workmail is a lot more expensive
![setheryops avatar](https://avatars.slack-edge.com/2020-03-24/1012344269937_600e8094ca1121ddff3a_72.jpg)
![setheryops avatar](https://avatars.slack-edge.com/2020-03-24/1012344269937_600e8094ca1121ddff3a_72.jpg)
ill check that out
![randomy avatar](https://secure.gravatar.com/avatar/fa1b9e14d93f78f4c21a1238ae7984cb.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0001-72.png)
That blog post has comments where people complain about it, and link to a github project with a lot of issues. I ended up making an SES Lambda forwarder in Python that has seen heavy use and works for the most part. I can try to open source it if there’s interest. (it’s a Terraform module, it also uses Step Functions for retries)
![randomy avatar](https://secure.gravatar.com/avatar/fa1b9e14d93f78f4c21a1238ae7984cb.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0001-72.png)
“heavy use” = forwarding 10-15k emails per month
![joshmyers avatar](https://avatars.slack-edge.com/2018-11-20/483958217281_8117d6f6c62807ce9912_72.jpg)
I do this, I think ?
![Mike Martin avatar](https://avatars.slack-edge.com/2020-02-05/940755534935_2259c2aed6bcbc52e117_72.jpg)
Anyone have any ideas on how to setup any type of Directory Service (ie. AWS Managed Microsoft AD) in the Ningxia (China) region WHILE YOUR SOURCE DIRECTORY LIVES IN VIRGINIA. The documentation I’m reading is pretty much saying you need your source directory in China as well. I haven’t been able to Google this…
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
does the Ningxia region have cognito? Maybe setup an identity pool and point to the AD in VA as the User source?
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
more of an SSO than directory replication
![Mike Martin avatar](https://avatars.slack-edge.com/2020-02-05/940755534935_2259c2aed6bcbc52e117_72.jpg)
Hmmm let me dig into that - I appreciate the quick response!
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
np, I was thinking something like this - https://medium.com/@zippicoder/setup-aws-cognito-user-pool-with-an-azure-ad-identity-provider-to-perform-single-sign-on-sso-7ff5aa36fc2a
![attachment image](https://miro.medium.com/max/1160/0*uS8lhCFSJWn20E7q.png)
This a step-by-step tutorial of how to set up an AWS Cognito User Pool with an Azure AD identity provider and perform single sign-on (SSO)…
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
with exception of AD being in AWS instead of Azure
![Mike Martin avatar](https://avatars.slack-edge.com/2020-02-05/940755534935_2259c2aed6bcbc52e117_72.jpg)
I do NOT believe that AWS Workspaces supports Cognito as an IdP. You need to back the “workspaces” instances with a “Directory Service”.
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
ahh ok workspaces. gotcha
![Mike Martin avatar](https://avatars.slack-edge.com/2020-02-05/940755534935_2259c2aed6bcbc52e117_72.jpg)
Yeah - here are my options. -AWS Managed Microsoft AD -Simple AD -AD Connector
I’m looking to back one of these directories with an LDAP server in Virginia if possible. I think I need to open a public IP of sorts…or create a VPN between the two, but that sounds illegal…
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
I’m not familiar with the restrictions of that region, does it not allow things like VPN / VPC peering etc?
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
I don’t even see a Ningxia region in my console
![Mike Martin avatar](https://avatars.slack-edge.com/2020-02-05/940755534935_2259c2aed6bcbc52e117_72.jpg)
Oh, the AWS China partition (Ningxia and Beijing) is completely separate from the AWS that you’re (probably) currently using. It’s a completely different login and console with limited services.
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
ah ok. Well routing issues aside, I suppose you could setup something like FreeIPA in China region
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
and have that be a peer / slave to your VA region, assuming you can solve the connectivity issue
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
I think you could use AD Connector w/that? I’m not sure
![Mike Martin avatar](https://avatars.slack-edge.com/2020-02-05/940755534935_2259c2aed6bcbc52e117_72.jpg)
I’ve never used FreeIPA - I am trying to determine if that would be better than just spinning up another Windows based domain controller in the China AWS region, instead of learning and testing this new tech (FreeIPA).
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
yeah just regular windows might be better if you have the experience there. FreeIPA is great but there’s a curve. I wasn’t sure what the availability was for MS licensing on that region
![Mike Martin avatar](https://avatars.slack-edge.com/2020-02-05/940755534935_2259c2aed6bcbc52e117_72.jpg)
Oh goodness…if they don’t have MS licensing then we’ve got bigger problems lol
![Matt Gowie avatar](https://avatars.slack-edge.com/2023-02-06/4762019351860_44dadfaff89f62cba646_72.jpg)
Hey folks — I’ve got a fun client problem I’m trying to solve. Looking for some input if anybody has a good idea…
I built the environment like so:
3 domains =
CloudFront distribution =
ALB =
ECS Service
This works great, the ECS Service webserver handles providing different views of the application depending on the domain name via the Host
header. That in combination with a couple ALB Listener Rules does the trick for what each of those domains are expected to show.
One of the domains however is the “Admin CMS” domain. I just found out that the client would like to IP Whitelist this domain so only a couple VPNs can access it to help further lock down access to the Admin CMS. This is tricky of course since I cannot use standard security control mechanisms (the ALB security group or VPC Network ACLs) since ALL domains are routed through the same ALB, which means I cannot apply a security group rule to only one domain.
So I’m looking for a workaround… Maybe through an additional ALB Listener Rule? Has anyone used Listener Rules to block / allow CIDRs? Is that a thing? Looked into this shortly yesterday and I’m about to start researching more now, but figured I’d ask here in case somebody goes: “Oh yeah of course, do X”.
I could obviously spin up another ALB, point the domain at that, and then use that ALB’s security group to control access. But unfortunately that increases cost and complexity of the system which I’m trying to avoid.
Any suggestions?
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
When you say IP Whitelist, does that mean outbound from the client -> ALB ?
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
or inversely whitelist the clients coming over a VPN to the ALB
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
I think you might be able to do this with a WAF header based rule, using the host header?
![Matt Gowie avatar](https://avatars.slack-edge.com/2023-02-06/4762019351860_44dadfaff89f62cba646_72.jpg)
I’m saying whitelisting client IPs so they’re the only ones who can access that domain.
![Matt Gowie avatar](https://avatars.slack-edge.com/2023-02-06/4762019351860_44dadfaff89f62cba646_72.jpg)
WAF header rule is interesting. I already have a WAF ACL setup against the CF Distribution for the top 10… Will add that into my research
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
Yeah you could do a webacl at the ALB level, set it as a regular rule with two statements. The first to match the host-header, and the second to match the IP
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
whitelist by default and then blacklist if not matching that pattern maybe
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
or vice versa
![Matt Gowie avatar](https://avatars.slack-edge.com/2023-02-06/4762019351860_44dadfaff89f62cba646_72.jpg)
I’ll check that out. Haha it’s funny that using the WAF for typical firewall usage didn’t come to me.
![Matt Gowie avatar](https://avatars.slack-edge.com/2023-02-06/4762019351860_44dadfaff89f62cba646_72.jpg)
@androogle Was able to get that working via WAF rules. Thanks for the idea man!
![androogle avatar](https://avatars.slack-edge.com/2020-03-11/997034054118_af230be762f8d396365e_72.jpg)
Np!
![niek avatar](https://secure.gravatar.com/avatar/055049cebc56061309609f48090615fe.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0023-72.png)
Hi just wondering, anyone having some terraform modules to setup an EKS cluster with ondemand (only when needed) gpu scaling for KubeFlow?
![kskewes avatar](https://avatars.slack-edge.com/2019-04-05/592663364881_098e29e5a0fe63cc7c82_72.jpg)
Cluster auto scaler works great. Could set it up in a GPU node pool