Discussion related to Amazon Web Services (AWS) Archive: https://archive.sweetops.com/aws/
Hey everyone, I’m excited to be part of this slack channel! Are there any recommended terraform/ansible repos with AWS that have prometheus/grafana visualization incorporated that you all pull from easily? I’d rather not work through and write up the IaC for it from scratch
Best CI CD tool? Looking for the absolute best. PHP shop. Replacing all of our tooling so I can really start from scratch. Currently vetting CICD vendors
Github actions, Github packages with self hosted Github runner
there is no best CICD, test a group of them and use the one that matches your use case.
Have to agree with Mohammed. There are literally hundreds of CICD tools out there and the best one is the one that fits your needs and use cases.
• If you are a GitHub shop, you might consider GitHub actions
• If you are a GitLab shop, you might consider GitLab CI
• If you use Docker with your PHP, you might rely on Docker Hub to build and publish your images
I’m sure there are other combinations I can come up with, but you get the idea. Start by taking a look at a few tools that are already close or tangential to the way you are already working. Don’t belabor your analysis but quickly pick one and go with it. If it works, you just found the best tool.
At the same time, don’t feel stuck. If you realize what you thought was the best tool is no longer working, start the process again with the next tool until you find the right fit.
for flexibility and ease of use i’d recommend checking out Buildkite.
@ @ I agree. Asked because I’m looking for a few to vet
@ can you give a high level description of your application and/or runtime environment? that might help a bit with recommendations. also if you are using repo deploys, building containers, kubernetes, etc..
For resource heavy CI/CD pipelines, which include dependencies and project specific build/deploy workflows, perhaps Jenkins or TeamCity.
For CI/CD on single repos or with fewer dependencies and less complicated workflows, I’d go for GitLab CI, GitHub Actions, Travis, or CircleCI, mainly because of the reduced setup overhead and because they are simpler to plug and play.
There’s also the world of GitOps/k8s-oriented tools like ArgoCD. I’m unfamiliar with it, but it’s interesting anyway
Buildkite is really nice. It’s very simple, flexible and easy to use. It doesn’t really do anything that GitHub Actions doesn’t, though.
Does it have a native roll your own solution? Actions doesn’t have anything official for that although there are 2 really good open source solutions for that
“It” being BuildKite?
yarp, just curious
They supply the SaaS control plane and a golang runner/agent. That’s really it. Think of it like GitLab CI unbundled. Or GitHub Actions unbundled.
If I understand right, you can use GitHub Actions with your own runners (for example, in an ECS service for the AWS use case) for critical workloads, same as Buildkite
I see a lot of hype around the hybrid approach to CICD, even for Terraform
yeah, but there aren’t any native scaling options; they provide a self-hosted capability which is mainly designed for static machines. If you want it to be elastic you need to do quite a lot of work yourself
https://github.com/philips-labs/terraform-aws-github-runner this project lets you provide self-hosted machines using the serverless tooling on AWS and https://github.com/actions-runner-controller/actions-runner-controller this project does it for the k8s platform
Thanks for sharing, good to know
I personally feel that a hosted control plane and self hosted agents (with pull connectivity) is the sweet spot of SaaS for CICD. It gives a good amount of flexibility without the overhead of managing control planes.
Yes, @Chris Fowles. The first time I tried Buildkite (2017 or so) my reaction was “OMG, this is how CI was supposed to be. Why did I suffer for so long?” I’m sure there were others that got it right before them. But it’s still a great standalone experience. Now lots of choices with this model.
Hi all, is there any way in AWS to notify us via SNS with CloudWatch if an S3 bucket or Lambda function is down? Thank you
Why would an S3 bucket be down? What do you mean there?
Okay, the reason I asked is that S3 is object storage, and behind the scenes it’s just storage devices. What if those go down and my S3 buckets are down? Is there any way to notify users at that point?
I think S3 is designed for eleven nines of durability
And I only remember one time in the past 10 years of it going down
okay thank you @
For lambda you can use a cloudwatch metric alarm to alert on failed invocations
You can use the health dashboard to alert on actual aws service loss
Yea you can use sns to send an email notification
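To make the Lambda suggestion above concrete, here is a minimal sketch in Python. It only builds the arguments for CloudWatch's `put_metric_alarm` call on the `AWS/Lambda` `Errors` metric, with an SNS topic as the alarm action. The function name, topic ARN, five-minute period, and zero threshold are assumptions for illustration and are not from the thread.

```python
def build_lambda_error_alarm(function_name, sns_topic_arn):
    """Return keyword arguments for CloudWatch's put_metric_alarm call."""
    return {
        "AlarmName": f"{function_name}-errors",
        "Namespace": "AWS/Lambda",
        "MetricName": "Errors",
        "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
        "Statistic": "Sum",
        "Period": 300,                # assumed: evaluate in 5-minute windows
        "EvaluationPeriods": 1,
        "Threshold": 0,               # assumed: alert on any failed invocation
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",  # no invocations is not a failure
        "AlarmActions": [sns_topic_arn],     # SNS topic that emails subscribers
    }


def create_alarm(function_name, sns_topic_arn):
    """Create the alarm in the current AWS account and region."""
    import boto3  # imported lazily so the builder above works without AWS access

    cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_alarm(
        **build_lambda_error_alarm(function_name, sns_topic_arn)
    )
```

Calling `create_alarm("my-function", "arn:aws:sns:us-east-1:123456789012:alerts")` (both values hypothetical) would register the alarm; an email subscription on the topic then delivers the notification.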
I think if S3 goes down, all of us will have bigger problems.
But if you truly want to monitor your content and access to S3, I would suggest setting up a resource outside of AWS (perhaps in GCP or Azure) that has permission to access your bucket in the way you want to monitor: HTTPS or direct API access. And then have that service report into your monitoring system that the access is still good. So if indeed S3 goes offline for some reason (which would likely be a larger problem in AWS which might also affect their ability to notify you) you would still be able to get your monitor and alert that your bucket is inaccessible.
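A rough sketch of the outside-of-AWS probe described above, using only the Python standard library. It assumes the bucket exposes some object over HTTPS that the probe is allowed to read; the URL, the timeout, and the `opener` parameter (there only to make the check testable) are hypothetical.

```python
import urllib.error
import urllib.request


def bucket_reachable(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return True if an HTTPS GET of `url` answers with a 2xx status."""
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # DNS failures, timeouts, refused connections, and HTTP 4xx/5xx
        # responses all count as "not reachable" for this probe.
        return False
```

A scheduler running outside AWS could call `bucket_reachable("https://example-bucket.s3.amazonaws.com/healthcheck.txt")` every minute and push the result into whatever monitoring system is already in place.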
Any reason to prefer account-scoped CloudTrails over organization-scoped CloudTrails? I noticed that `terraform-aws-components` seems to prefer account-scoped trails.
multi-payer/multi-org environments is the main reason for me
we have some accounts that need to strictly separate billing, but everything else (cloudtrail, securityhub, guardduty, etc) can be linked to a single management account
Hey guys, one of the APIs we’re about to use requires IP whitelisting. Is there a way to configure an HTTP proxy using AWS without having to set up e.g. tinyproxy or nginx?
we’re seeing the following Trace breakdown for a Java runtime based AWS Lambda. I know Java has a hefty cold-start time and I would expect the JVM starting would fall in the “Initialization” phase of execution. What is the 10 second gap before the Initialization phase that is happening in this trace?
does anyone have any strong opinions on an AWS infra tagging taxonomy?
I use the cloudposse module for tagging and leave it at that. Makes it easy!
Only advice is start simple
have a specific goal, then determine if tagging is what you need to achieve that goal. then tag accordingly.
If there isn’t an absolute “must be exactly this way” requirement, then I just changed my approach to fit the module Cloud Posse wrote, as it’s so good.
Then all my other tools just get `tags = module.label.tags` or `module.label.id`. Or I paste in the `context.tf` file and add support for the naming tools they built.
As a result naming is consistent, but takes no more mental effort.
We mostly want it for costing/billing purposes
We rely mainly on Application and Environment tags. If you’re trying to allocate to a team that could be another one or you could configure cost categories to map apps to teams.
(This really depends on your company and how you want to allocate costs though)
The terraform provider has the ability to set default tags for almost all resources which is nice.
Anyone try using Lando? https://docs.lando.dev/basics/
I’m interested in anything that simplifies local dev tooling without a ton of extra complexity, and this seemed interesting. Seems similar to Cloud Posse modules in that they are trying to set “sane defaults” on the apps to reduce effort.
This also sent me towards looking at: https://platform.sh/
Always interested in tooling that reduces complexity when possible and allows me to focus on core business needs. Love to hear any thoughts (and I plan on looking at archives too)
Sigh. I’m so tired of ‘contact us’ pricing
Just … tell me what your numbers are. If we have to negotiate from there fine
Just curious, are you objecting to platform.sh pricing transparency?
I’m not sure what @ meant on your pricing page, but I’ll say asking for a call to just get info on product is frustrating with all the tooling devops work requires, it’s just a major annoyance to me. I’ll most likely move on.
On your pricing page I see some transparency on pricing, and the contact us seems to be for highest tier which probably makes sense. I’d like to know what @ noticed specifically as well.
Just what you said
I ask because I’m an engineer and CEO. Certainly agree with you. But I’m curious because platform.sh does have quite a bit of pricing, albeit complicated and hard to extrapolate.
Nothing specific to you guys, I just clicked to check out the link the other day and I always look at the pricing. Sort of an off-the-cuff thought about these things in general
To be clear, I have nothing to do with platform.sh.
Ah! misunderstood your stmt above
I’m just interested in pricing plans that don’t suck.
The problem generally is they’ll put a plan out at like $5/month for a single dev working on a personal project and then next tier is like “3 people for $50/month” and then “all others contact sales”
Yeah, exactly. You want more visibility to the point that the product is an entrenched success.
and then I’ll have to sit through a bunch of back and forth email, schedule a meeting, have an hour call to find out that the enterprise starts at like $50/person plus some other scaling factors.
Even if it’s complicated pricing, I’d like to see something like “starts at $x based on A, B, and C factors”
Aside from AWS, do you have any really awesome examples of those that don’t suck? Off the top of your head. CI is usually pretty clear, of course.
haha no, I think just about everyone hides their ‘enterprise’ prices
GitHub, GitLab, atlassian.
It’s sort of like if you walk by a fancy clothing store in the mall, when they don’t have prices out you figure “out of my league” and just walk past
Yeah, I guess the counter to this is that pricing that seems inconsequential at small scale can appear asinine at large scale.
plus the usual “please don’t put SSO on your enterprise only plan”
I think that enterprise level at some of these places is such a small number that it’s probable that they want hands-on time with sales/engineering to ensure it’s successful. I get that. I guess I’d still like an approx range from someone offering enterprise support, like “starting at” vs. “contact us”.
I’m having a problem getting nginx as a reverse proxy to work in docker compose. I’ve tried to use the mkcert + docker-gen + nginx combo (want to stick with docker compose).
My goal was to allow local development easily against what I’d be deploying to ECS fargate. Almost all the projects I’m working with need this pattern of reverse proxy to support ssl termination as an option.
Anyone have a docker compose project that spins up a reverse nginx proxy and uses docker-gen? I’m ok with using the tmpl file too, but so far no luck in getting anything other than direct access to container, no redirect from root oauth to /appname. Probably removing docker-gen soon but would be nice if I could leverage the automatic config it offers.
I have a root “dockerize” process sending logs to STDOUT from files on the container, which are showing up when running docker locally. However, these logs don’t appear when running in AWS ECS using the awslogs driver. Any thoughts on what the issue might be?
Looking for advice on Transit Gateway - should you create and maintain separate Dev and Prod TGWs, or use one TGW with Prod and Dev spokes attached and manage the routing with the TGW route tables to ensure Dev spokes can’t reach Prod spokes?
Personally I’d say always have at least two of anything - if you have a dev-tier and prod-tier (TGWs in this case) then you have somewhere lower risk to apply changes before you start affecting production, but you still have the headache of routes/config being specific to combinations of environments
Having multiple of anything also forces everyone to cater for a different reality, which is useful for DR, other regions, additional environments etc.
I do like separation, as @bazbremner already mentioned. It depends on what you want to achieve and what your architecture looks like, while also staying cost effective. For example, if you want to connect your own network with AWS over a site-to-site VPN tunnel, you would need to do it for both TGWs, and that adds additional costs…
i want to be able to change instance size using airflow dags, any ideas?
ALB should be automatically adding an X-Forwarded-Proto header to incoming requests, is that correct?
We are investigating increased API error rates and increased provisioning/registration latencies for ELBs in the US-EAST-1 Region. Connectivity to existing load balancers is not affected.
In case interested I threw together a custom AWS weekly update digest if you want a way to keep up and don’t have a method already.
There are better ways I’m sure but this is custom and has social media top posts too. I didn’t use Cloudpegboard because pretty sure I can’t include redistributed updates from them on my digest if I share it.
YMMV. Reply to email and it will email me if you have any customizations.
Hi all, looking for some networking advice. I’m looking to deploy an EKS cluster with managed nodes. I’m trying to figure out how to best size the VPC. All the nodes will be in private subnets, for a nonproduction account there will be 2AZ and for prod there will be 3AZ. For nonprod, I’ve got the following
• /24 VPC: 256 hosts
• /26 private subnets: 64, 64
• /28 public subnets: 14, 14
Any suggestions on what a setup for prod would look like? I’m thinking a /25 CIDR, but I’m not really sure
/25 VPC CIDR is way too small
I recommend a big CIDR
/24 suffice in prod too?
or something bigger?
/21 Subnets for EKS
Remember that every single Pod gets an IP address
/21 for CIDR, what do you suggest for the subnets? Private needs to be bigger
Well, it depends on how many unique IPs you think you will need. Try to estimate it, then add a big safety margin
In our case, we will be able to spin up around 5K pods per EKS cluster
Which is way more than required
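For a feel of the numbers in this sizing discussion, here is a small sketch using Python's `ipaddress` module. It relies on the fact that AWS reserves five addresses in every subnet and allows subnet sizes from /28 to /16. The `10.0.0.0` base address and the helper names are assumptions for illustration.

```python
import ipaddress

AWS_RESERVED_PER_SUBNET = 5  # network, VPC router, DNS, future use, broadcast


def usable_ips(cidr):
    """Addresses left for nodes and pods in one AWS subnet of the given CIDR."""
    network = ipaddress.ip_network(cidr)
    return max(network.num_addresses - AWS_RESERVED_PER_SUBNET, 0)


def smallest_prefix_for(ip_count):
    """Longest prefix (smallest subnet) that still offers `ip_count` usable IPs."""
    for prefix in range(28, 15, -1):  # AWS subnets range from /28 to /16
        if 2 ** (32 - prefix) - AWS_RESERVED_PER_SUBNET >= ip_count:
            return prefix
    raise ValueError("more addresses requested than a /16 subnet can hold")


for cidr in ("10.0.0.0/28", "10.0.0.0/26", "10.0.0.0/21"):
    print(cidr, usable_ips(cidr))
```

So a /28 leaves 11 usable addresses rather than 14, a /26 leaves 59, and a /21 leaves 2043, which is why three /21 private subnets comfortably cover a cluster in the 5K-pod range when every pod takes a VPC address.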
I have some logic for this that I should open source around splitting a CIDR block into regions, then VPCs in that region and then finally subnets in that VPC
You need it ASAP? I’m away from my laptop right now, how soon you need it?
I could wait till you’re back, no problem
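Until that logic is shared, the general idea of splitting one block into regions, then VPCs, then subnets can be sketched with Python's `ipaddress` module. This is not the poster's code: the root block, region names, and the /12, /16, and /21 prefix lengths are all assumptions chosen for illustration.

```python
import ipaddress
from itertools import islice


def carve(block, new_prefix, count):
    """Return the first `count` child networks of size /new_prefix."""
    children = ipaddress.ip_network(block).subnets(new_prefix=new_prefix)
    return list(islice(children, count))


def plan(root="10.0.0.0/8", regions=("us-east-1", "us-west-2"),
         vpcs_per_region=2, subnets_per_vpc=3):
    """Map region -> VPC CIDR -> list of subnet CIDRs."""
    layout = {}
    # One /12 per region, one /16 per VPC, one /21 per subnet (assumed sizes).
    for region, region_block in zip(regions, carve(root, 12, len(regions))):
        layout[region] = {
            str(vpc): [str(s) for s in carve(str(vpc), 21, subnets_per_vpc)]
            for vpc in carve(str(region_block), 16, vpcs_per_region)
        }
    return layout
```

With the defaults, `plan()["us-east-1"]["10.1.0.0/16"]` yields three non-overlapping /21 subnets, one per AZ. Because every level is carved from its parent, ranges never collide across regions or VPCs.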
Hi, when I use multiple packages like `terraform-aws-documentdb-cluster` I get the error that the security group for the VPC already exists.
First terraform creates the documentDB and then the elastic beanstalk stack.
I would now expect that if I use `allowed_security_groups = [module.vpc.vpc_default_security_group_id]` for both, both are in the same security group. Is this not correct?
this is a “feature” that we need to actually fix
in cases like that, just add some attributes to one of the modules
attributes = ["2"]
it will add `-2` to all names in that module
not pretty, but will work
and you can select whatever attribute you want, not necessarily the “2” in this example
thank you, I will try this
this worked, but how do I add the security group from the database to the elastic beanstalk, so the application can connect to the database?
Elastic beanstalk has the SG ID output https://github.com/cloudposse/terraform-aws-elastic-beanstalk-environment/blob/master/outputs.tf#L16
allowed_security_groups input https://github.com/cloudposse/terraform-aws-documentdb-cluster/blob/master/variables.tf#L7
allowed_security_groups = [module.elastic_beanstalk.security_group_id]
I have hit a wall when trying to remove an EKS cluster. I have added `null_data_source.wait_for_cluster_and_kubernetes_configmap` as per the README, but now when I’m trying to delete the cluster I get
Error: Get "http://localhost/api/v1/namespaces/kube-system/configmaps/aws-auth": dial tcp [::1]:80: connect: connection refused
We are wrapping the cloudposse module with a thin layer, so I’m looking into that as well, but is this something that others have seen?
It’s an issue that stems from the Terraform Kubernetes provider. See https://github.com/cloudposse/terraform-aws-eks-cluster/issues/104 and https://github.com/terraform-aws-modules/terraform-aws-eks/issues/911
Thanks! It seems to be affecting a lot of people…
I’m seeing that problem when I set `enabled=false` in order to destroy the cluster.
-parallelism=1 doesn’t help
null_resource.wait_for_cluster is also causing the issue
I tried removing `wait_for_cluster_and_kubernetes_configmap` from the terraform state, but it still causes the `aws-auth` configmap to be read, leading to the error:
Error: Get "http://localhost/api/v1/namespaces/kube-system/configmaps/aws-auth": dial tcp [::1]:80: connect: connection refused
I think that resource should be tied to
OK, a workaround turned out to be removing the `aws-auth` configmap from the
audit2rbac scans kubernetes audit logs and automatically generates a rbac policy with least-privilege for a user. Anyone know of anything similar for cloudtrail/IAM?
anyone got a good reason to use savings plans over RIs?
You can change instance types, families, regions!
You don’t actually get reserved capacity tho. Buuut you can do capacity reservations so
Savings Plans apply to EC2, Lambda and Fargate
They’re way more flexible
The better question: is there any good reason to use RIs over savings plans?
If you do the EC2 Instance Savings Plan, where you lock into a region and a family of instances, it should be 1:1 with the RIs (while still allowing slightly more flexibility)
the Compute Savings Plan is the broader one - any region, any compute type. You get less overall savings, but more freedom to change your operation on the fly
Before savings plans, we were locked into instance types which resulted in putting off upgrading to modern instance types for years… Glad that’s over since the ability to use the newer instance types sooner (via savings plans) has saved us more money than upgrading after our RIs expire
I’m with @RB (Ronak) (Cloud Posse). The money saved from RIs is offset by the loss in flexibility and added management time. Savings Plan is a better tradeoff for us.
RDS is only compatible with RIs, but I’m pretty sure you implied that by specifically mentioning Fargate/Lambda/EC2. =] Seems like savings plans are the recommended way to go, from the ‘common wisdom’ I’m reading
I imagine they’ll add RDS and elasticache savings plans at some point
Only a matter of time
The RDS RIs make me extremely sad
At this point, RDS being included in Savings Plans is the only announcement I am hanging on the edge of a chair for.
interesting, do you guys believe RIs take more time to manage? News to me that people are so much more excited by Savings Plans than RIs. Personally, we haven’t had that much of an issue with locking in instance types, but I was def curious if it was easier to manage. (We also don’t use Fargate and our Lambda cost is minuscule)
Starts getting interesting when you’re managing RIs across regions and sub-accounts. However, a Savings Plan can be applied to shared accounts and across regions, so it’s much simpler at scale.
right, we’re managing our RIs in a single region and in a single account (although we buy RIs for all accounts there) I can see where savings plans can be easier to manage at scale when it comes to multi-region etc.
we have nonprod in 1 region and prod in 2 regions, all with a jumble of different RDS db classes, so we have to pre-buy
• us-west-1 db.m5.*
• us-west-1 db.t3.*
• us-west-1 db…
• us-east-2 db.m5.*
• us-east-2 db.t3.*
• us-east-2 db…
and so forth. With savings plans we could have gone for a single block and not worried about splitting between regions.
Managing RIs was hours of work each week for one person at our company
and then someone wants to resize an RDS instance the day after you purchased