#aws (2021-01)
Discussion related to Amazon Web Services (AWS)
Archive: https://archive.sweetops.com/aws/
2021-01-04
there's no db Slack channel, so I'm asking here since I'm using RDS (and they're deprecating support for my Postgres version). Anyone that's done the Postgres 9 -> Postgres 10/11 migration have any gotchas we should be concerned about when doing it?
Ah geez — When are they deprecating 9 exactly?
we did a migration from 10 to 11 and it was fine, but that all depends on whether you are using version-specific features. From what I have seen this happens more on the MySQL side than Postgres
they usually do not have breaking changes
not helpful as to the “how”, but as to the “why”…when deciding to upgrade PostgreSQL this is a great resource: https://why-upgrade.depesz.com/show?from=9.6.20&to=12.4&keywords=
Have you ever wondered what it would feel like to change planes while it is in flight? I for sure do not want to be on that flight, but…
this is the guts of it. we ended up enhancing this process using scripts
but it’s all predicated on “can you allow downtime.. or not?”
9.5 is being deprecated right now. There are a couple of weeks remaining before AWS marks it EOL
We’re on 9.5 — Feb 16 @Matt Gowie
Upgrade your RDS for PostgreSQL 9.5 databases before Feb 16, 2021
The RDS for PostgreSQL 9.5 end-of-life date is approaching. Your database is using a version that must be upgraded to 12 or higher as soon as possible. We plan to automatically upgrade RDS for PostgreSQL 9.5 databases to 12 starting February 16, 2021 00:00:01 AM UTC.
Ah 9.5 — Thanks man
@rms1000watt how much resource pressure happened (read iops, throughput, etc) on the source DB during the 4 days of synchronization?
it wasn't bad, only because we overprovisioned to be careful of this situation
like we increased IOPS, CPU, etc
(to be fair, the most kudos need to be given to @Ronak for the first migration, and these subsequent two migrations. He’s the real mastermind.)
but there were a few tests beforehand to measure the effects of a migration on a DB of our size
(like stand up a second DB from a snapshot with the same resources and try the migration from that, just to test how it’d behave)
do you mind giving the details of your db size?
24x.large .. waaaaaay over provisioned
is that a typo?
jeez
didn't even know 24xlarges exist
so our 2xlarge will migrate in a much shorter amount of time
yea, we had like.. 3TB?
We can afford downtime on the weekends as long as we put together some language for our customers. I’m tempted to spin up a second RDS instance w/ snapshot, upgrade the db to pg11 and do the cutover at the route53 level.
Did you guys run into any backwards incompatible changes at the application level?
but if you can afford downtime, don’t even mess with pglogical
for the DBs we could afford downtime we just did:
• RDS upgrade from 9.6 -> 11.8
• Create Aurora Read Replicas from the 11.8
• Promote the Aurora Read Replica
• Cutover the application configs to Aurora endpoint
actually we went to latest 11.9, but it’s the same process
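For reference, a rough Terraform sketch of the read-replica leg of that process (all names are made up, and I haven't verified this exact flow end-to-end; the in-place 9.6 -> 11.x upgrade, the promotion, and the cutover are manual either way):

```hcl
# Hypothetical sketch: Aurora PostgreSQL cluster created as a read replica
# of an existing, already-upgraded RDS PostgreSQL instance.
resource "aws_rds_cluster" "replica" {
  cluster_identifier            = "app-aurora-replica"         # hypothetical name
  engine                        = "aurora-postgresql"
  engine_version                = "11.9"
  replication_source_identifier = aws_db_instance.upgraded.arn # the upgraded 11.x RDS instance (hypothetical)
  db_subnet_group_name          = aws_db_subnet_group.this.name
  vpc_security_group_ids        = [aws_security_group.db.id]
  skip_final_snapshot           = true
}

resource "aws_rds_cluster_instance" "replica" {
  identifier         = "app-aurora-replica-1"
  cluster_identifier = aws_rds_cluster.replica.id
  engine             = aws_rds_cluster.replica.engine
  instance_class     = "db.r5.2xlarge" # placeholder sizing
}
```

The promotion to a standalone cluster and the application config cutover still happen outside Terraform.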
awesome. your two-way replication solution is definitely a thing of beauty. But I do think I can afford to spin up a second RDS instance over the weekend, upgraded to pg11, and do the cutover at Route53. my solution probably won't warrant a cool Medium blog about it though.
2021-01-05
In https://github.com/cloudposse/reference-architectures#3-delegate-dns can someone explain "An available domain we can use for DNS-based service discovery (e.g. [ourcompany.co](http://ourcompany.co)). This domain must not be in use elsewhere, as the master account will need to be the authoritative name server (SOA)."?
[WIP] Get up and running quickly with one of our reference architecture using our fully automated cold-start process. - cloudposse/reference-architectures
does anyone have a "best practice" for where to keep the S3 bucket for an account's access logs: in the account itself, or in the security account (and replicate)?
I pretty much asked about this a few weeks ago…
https://sweetops.slack.com/archives/CCT1E7JJY/p1608486517167200
I’m thinking about creating an account dedicated to storage within my AWS Org –
My initial plan is to consolidate the S3 bucket sprawl I have between accounts into a single bucket within that account, giving my Org access to write and each account access to read its files using a prefix condition on the bucket policy.
Bucket sprawl for Service logs: VPC Flow, S3 Access, CloudFront logs, etc – application buckets would remain in their account….not looking to fix that sprawl.
Any words of wisdom or horror stories that I should be aware of before I embark on my quest?
the responses weren’t threaded though
I used to love the term "best practice"; now I hate it. There is no single best practice - no one way to do a thing - there are patterns, which means multiple valid ways to do it. Every use case is unique, so the answer depends on your use case:
• single AWS account - use the same account.
• multiple AWS accounts - create an access-log S3 bucket in the logging account / security account / audit account and send data there. In both cases set up retention settings and archiving.
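A minimal sketch of the single-log-bucket pattern being discussed (bucket name and account ID are made up): one bucket in the logging account, with a policy that lets a member account read only under its own prefix.

```hcl
resource "aws_s3_bucket" "org_logs" {
  bucket = "acme-org-access-logs" # hypothetical name, lives in the logging account
}

# Let a member account (111111111111, made up) read only objects under its own prefix.
resource "aws_s3_bucket_policy" "org_logs" {
  bucket = aws_s3_bucket.org_logs.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid       = "MemberAccountReadOwnPrefix"
        Effect    = "Allow"
        Principal = { AWS = "arn:aws:iam::111111111111:root" }
        Action    = ["s3:GetObject"]
        Resource  = "${aws_s3_bucket.org_logs.arn}/111111111111/*"
      },
      {
        Sid       = "MemberAccountListOwnPrefix"
        Effect    = "Allow"
        Principal = { AWS = "arn:aws:iam::111111111111:root" }
        Action    = ["s3:ListBucket"]
        Resource  = aws_s3_bucket.org_logs.arn
        Condition = { StringLike = { "s3:prefix" = ["111111111111/*"] } }
      }
    ]
  })
}
```

Each log-producing service (VPC Flow Logs, CloudFront, S3 access logging) additionally needs its own delivery permissions on the bucket; those are separate from this read policy.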
@Mohammed Yahya I completely agree with you, I hate the term as well. I just wondered what people actually do and why
Anyone use AWS Distributor for packages? I prefer Chocolatey and DSC, but I need to create a Datadog package and want to cover Linux + Windows. I'd like to know if various distros can be handled easily, configured, etc. Overall, did you hit any problems using it, or was it smooth sailing?
Otherwise I have to do a mix of Ansible + DSC and more, and it's unlikely others will be comfortable with that.
In addition, while I'm a fan of Ansible, I primarily use AWS SSM to manage a mix of Windows and Linux instances. At this time AWS SSM will only run playbooks on Linux.
We use cloudsmith
For the packages themselves
2021-01-06
bump - going to tackle this in the next couple of days, so I would love to know your experience
https://docs.aws.amazon.com/systems-manager/latest/userguide/distributor.html
this? never heard of it but sounds interesting
Create, manage, and deploy software packages on Systems Manager managed instances.
I have a system in place with choco + auto builds, but I’m trying to eliminate using multiple toolchains for each. If I can wrap up all into a single package I’ll probably have better luck maintaining it
Having an internal debate and I’m curious what your guys’ thoughts are: How many people do you think there are (worldwide) who are using Terraform with AWS today? Please include your rationale for your answer.
Some stats:
• The AWS Provider github repo has 5k stars and 2k contributors.
• The AWS Provider has been downloaded 216.2M times.
• This channel has 2,221 members.
engineers who know and actively write HCL or total number of developers that inherently are “using” Terraform?
because I'd wager that those two numbers are dramatically different
Oh great question! Let’s focus on the first one - those actually writing HCL. It can then be a proxy to the second one.
I’d guesstimate that it’s somewhere around 250:1 in regards to users:contributors – so half-million? (my initial 500:1 sounded silly after I thought about it)
how many people is also different from how many infrastructures are managed. We're ~8 in my company but we manage dozens of different stacks
8 what - HCL developers, total developers, total people?
8 people who write HCL every days
for 150 developers
So I would count 8 users in this case. This is a theoretical question, but you can then derive a bunch of different things from it - like how many companies use AWS+TF, how many developers in the world indirectly use AWS+TF (like the 150 developers you mentioned), etc.
Once we can answer these questions, we can compare AWS+TF’s usage with that of other open source technologies and see what level of adoption it has.
I think I've already answered a survey with that kind of question
Where?
I’m guessing the number is at least 100,000 but no more than a 1,000,000 just for everyday folks that use Terraform w/ AWS. I think you could easily cross the 2-3 mil threshold if you include folks who do simple POCs and make minor contributions to custom modules. I don’t know how to make an accurate guesstimate further than this. Is this how many people have used it over the lifetime of the products in question, or people that have used it within the last week, month, year?
according to the Google Machine there were 21m software developers in the world in 2016 – if you estimated 24m today…I’d be shocked to learn that more than 2% engineers write HCL actively – so I’m sticking with my 500k number
@Yoni Leitersdorf (Indeni Cloudrail) can’t remember sorry
At @Issif’s company, 5% of developers write HCL, but maybe they’re a more advanced company. Larger companies, with thousands of developers, probably have an even lower percentage of developers who write HCL. I wonder if 2% is too high even.
@MattyB good point, the focus is on “active” users, so let’s say they use it on a regular basis (weekly).
Another vote for ~100k Terraform developers in the whole world. So perhaps 70k Terraform AWS devs
@Alex Jurkiewicz what’s your rationale for that number?
I estimate 1% of software developers write Terraform, plus Darren’s number of 20m software devs globally
So I tried validating that 1% or 2%. My closest approach is to look at GitHub repos stats: https://github.com/oprogramador/github-languages#all-active
If you focus only on repos that have been active for the year before the report ran (given HCL is new and growing quickly), you’ll see that far far less than 1% of all repos have HCL.
Can this be a reliable proxy to determining the percentage of developers who work with HCL?
I don’t know. Multiple repos can be owned by a single developer. Maybe if you could extrapolate people who have contributed to all repos you could get a minimum number?
@Alex Jurkiewicz are you saying 100k terraform AWS developers or just terraform? If that’s just terraform then there should be far fewer terraform AWS since it’s used for a ton of providers
@Yoni Leitersdorf (Indeni Cloudrail) that’s clever and on reflection yes I think 1% estimate was way too high.
Continuing along the GitHub line of thinking, GitHub reported 40M users a year ago. Let’s say it’s closer to 45M now. 1% would be 450,000. As I said, I think it’s far less than 1%. At 0.25% for example, it would support the 100,000 number Alex used.
I took the table from the link above and summed the total number of repos, it’s 18.5m in total (yes, I know many repos can have multiple languages, but trying to get a broad stroke). Of those, 0.21% have HCL.
So 0.21% of 45M is 94,500, close to the 100,000 number.
I think a lot of github metrics can underweight big “old” languages like Java and C++, and overweight modern github-first tools like Terraform. So IMO that 0.21% might still be an overestimate
I also think attributing 100% of GitHub users as people who are actively writing code is a large assumption
Good point @Darren Cunningham. I wonder if there's a way to answer what % of GitHub users are coders, and what % of coders have GitHub accounts.
% are abandoned accounts
GitHub claiming 40m users is a publicity stunt IMO
How do you define abandoned accounts?
My Digital Ocean account that I haven’t logged into since 2010
technically I’m “a user” of their platform
LOL
how many people have just created a new account rather than updated their email in GitHub?
how many have 2 accounts…I know I do
So do I actually. I also work with a guy that has 3 or 4.
using number of accounts seems like you’re creating a new complexity factor that’s way out of control
Yeah, I agree. So if we go back to the 20m or so developers, and 0.21%, we’re looking at 42k HCL devs
Hi All - any help greatly appreciated (wasn’t sure which channel was best for this one)
https://sweetops.slack.com/archives/CB6GHNLG0/p1609975378080100
Hi All - first post so excuse if silly :)
Wondering what’s best practice for creating kafka topics post cluster creation using the CloudPosse MSK Module?
AWS doesn't appear to support anything directly on MSK for this and even points you at the Apache shell scripts in its docs.
If it's really CLI-only, is it possible to run a template file after the MSK cluster is created to run the shell scripts? e.g.
$ bin/kafka-topics.sh --bootstrap-server localhost:9092 --create --topic my-topic --partitions 1 \
--replication-factor 1 --config max.message.bytes=64000 --config flush.messages=1
Thanks for any help
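For the "run the shell scripts after the cluster exists" idea, a rough sketch with a null_resource and local-exec (assumes kafka-topics.sh is available on the machine running Terraform and that it can reach the brokers; the MSK resource name, topic, and settings are made up):

```hcl
resource "null_resource" "create_topics" {
  # Re-run if the cluster or topic definition changes
  triggers = {
    cluster_arn = aws_msk_cluster.this.arn # hypothetical MSK cluster resource
    topic       = "my-topic"
  }

  provisioner "local-exec" {
    command = <<-EOT
      kafka-topics.sh --create \
        --bootstrap-server ${aws_msk_cluster.this.bootstrap_brokers_tls} \
        --topic my-topic --partitions 1 --replication-factor 1 \
        --config max.message.bytes=64000 --config flush.messages=1
    EOT
  }
}
```

There are also community Terraform providers for Kafka that manage topics as first-class resources, which may be cleaner than shelling out if topic churn is frequent.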
2021-01-07
Hi, I am trying to find a tool/vault to manage my passwords (a tool like Roboform or LastPass). I am looking for an open-source tool like this which I can configure on my Linux EC2 instance and access via a UI. Is there any tool like this?
Yes I explored
- BitWarden
- Passbolt
- HyperVault
What do you think, is it a good option to self-host the service? I mean we'll probably need to provision an RDS database along with the EC2 instance. I have a feeling that the cost may increase. Password managers like Roboform cost around 3 USD per user per month. So I am a little confused about what to use: self-hosted, or buy a plan for my organization?
it depends on your use case. how many people are going to use it? I have only used BW for personal usage so I cannot really speak enterprise-wise
Around 20 to 50 users
Hashicorp Vault = opensource + UI + production grade solution
That’s also one option. I’ll explore more @Mohammed Yahya Thanks.
2021-01-08
Hey I want to work with a subdomain on CloudFlare but unfortunately it does not support working with subdomains. What are my options? Will Route53/Cloudfront be useful?
Guys, I've been using AWS SSM Parameter Store for storing my credentials, like RDS database credentials, so that I can access them via API or CLI in my pipelines or infrastructure code. I am thinking of putting my AWS credentials in SSM Parameter Store because my Rails application (which is deployed in ECS via Terraform) demands AWS keys for accessing an S3 bucket. Should I put AWS credentials in SSM? I just feel that it is not the right way to deal with this problem.
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-iam-roles.html
The key points are:
With IAM roles for Amazon ECS tasks, you can specify an IAM role that can be used by the containers in a task.
Instead of creating and distributing your AWS credentials to the containers or using the EC2 instance's role, you can associate an IAM role with an ECS task definition or RunTask API operation.
So you need to:
- Create an IAM role for your ECS task and grant it the required S3 permissions
- Update your task definition to use this IAM role
- Make sure that your Ruby app uses the AWS SDK's automatic credential resolution (rough sketch below)
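A minimal sketch of those steps (role name, bucket, and container definitions file are placeholders):

```hcl
# Role the containers in the task assume
data "aws_iam_policy_document" "ecs_tasks_assume" {
  statement {
    actions = ["sts:AssumeRole"]
    principals {
      type        = "Service"
      identifiers = ["ecs-tasks.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "task" {
  name               = "rails-app-task-role" # hypothetical name
  assume_role_policy = data.aws_iam_policy_document.ecs_tasks_assume.json
}

resource "aws_iam_role_policy" "task_s3" {
  name = "s3-read"
  role = aws_iam_role.task.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["s3:GetObject", "s3:ListBucket"]
      Resource = ["arn:aws:s3:::my-app-bucket", "arn:aws:s3:::my-app-bucket/*"] # hypothetical bucket
    }]
  })
}

resource "aws_ecs_task_definition" "app" {
  family                = "rails-app"
  container_definitions = file("containers.json") # hypothetical container definitions
  task_role_arn         = aws_iam_role.task.arn   # app credentials come from this role
  # execution_role_arn is the separate role the ECS agent uses (image pull, logs, secrets)
}
```

The SDK then picks up the task-role credentials automatically; no keys in SSM or the task environment.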
Nice. I did not know this option is available. Thanks.
When I am using this task execution role, will this allow me to run AWS CLI commands? For example, will I be able to run aws s3 ls inside the container running my Ruby app?
- You're mixing up the Task Execution Role (https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task_execution_IAM_role.html) and the Task Role. The Task Execution Role is used by the container agent. The Task Role is used by the containers in your task. You grant the Task Execution Role permissions to access your secrets, pull images from ECR, write logs, etc. And you grant the Task Role the permissions your applications need (e.g. s3:ListBucket etc.).
- The AWS CLI uses the AWS SDK under the hood and can also auto-discover the credentials (https://docs.aws.amazon.com/cli/latest/topic/config-vars.html#using-aws-iam-roles). No need to hard-code them.
The task execution role grants the Amazon ECS container and Fargate agents permission to make AWS API calls on your behalf. The task execution IAM role is required depending on the requirements of your task. You can have multiple task execution roles for different purposes and services associated with your account.
In my code I created one role and passed the same role ARN for both the Task Role and the Task Execution Role. Thanks for clarifying, I did not know the difference.
I logged into the ECS instance and then into my task (docker exec -it <containerid> bash) and tried curl 169.254.170.2$AWS_CONTAINER_CREDENTIALS_RELATIVE_URI. It returned the same role ARN that is passed in the task-role parameter in my infra. So I thought aws s3 ls should work, but it returned an "aws: command not found" error.
Well obviously your container image should have aws cli installed if you want to use it inside the container.
2021-01-11
i'm seeing some funky issues with gRPC and NLBs when rolling pods in EKS. anyone got experience in this area?
We get 500s when rolling ingress-nginx, as the pod process terminates before the NLB health check times out. There's some work upstream around 1.19, but we are still on 1.15… :(
2021-01-12
Anyone using on AWS App Mesh? Thoughts?
I think that it’s a limited implementation of envoy but it’s serving my needs which are also limited currently
I did a POC
only works on awsvpc mode with ECS
so that was a show stopper for me
but I moved it to awsvpc and then I had issues with service discovery; the docs are far from great if you do not run on Fargate
2021-01-13
I'm having a bit of a wood-for-trees moment with ACM Private CA - it's reasonably straightforward to set up a root and subordinate CA in a single AWS account and RAM-share that out to other accounts that need certificates (although it seems necessary to share both the root and the subordinate for the subordinate to appear in the ACM PCA console in the target account). However, the best practice (https://docs.aws.amazon.com/acm-pca/latest/userguide/ca-best-practices.html) recommends having the root alone in its own account, and subordinate(s) in another account. My problem is that the process of signing the subordinate CA manually when the root CA is in another account is really not clear. The docs cover the case of both in the same account, or using an external CA. Anyone done this before?
Learn the best ways to use ACM Private CA.
Has anyone ever seen the AWS ElasticSearch Service take over an hour to create a domain (i.e. Domain is in “Loading” state and nothing is accessible)? I’ve seen long update / creation times from this service before… but this seems absurd.
Never timed it, but the long creation/update times are something I ran into in the past too.
I remember ES domain maintenance activities taking a very long time, esp. if I turned on automated backups
Does anyone need to periodically restart the aws ssm agent ?
no sir
Nope.
nvm, found out it was an old ami from 2018. upgrading the ami seemed to fix the issue.
¯\_(ツ)_/¯
[thread] Troubleshooting — EC2 Instance for Windows isn’t working with ECS tasks, ssm session manager, and RDP reporting User Profile service Can’t start
The source for this AMI = Packer-built. It's an EBS-based image, encrypted by default, 50 GB.
I notice the launch configuration for this has no detail on the volume; I believe it just uses the defaults from the image.
In the console I see this means the EBS volume is not optimized and not encrypted.
Is this a false track to go down? pretty sure the instance has permissions for kms + full ssm and more. I want to know if any other ideas before i call it a night
2021-01-14
If anyone’s using AWS AppMesh here - do you know if there’s a way for the virtual gateway to NOT overwrite the host? we’re using the host in order to do some analytics in the container.. might also be similar for Istio users.
am I blind, or is there really no way to get container details for a running task in the new ECS portal?
Going to check it out right now. What details are missing?
all the details that you could access from the collapsible menus in the current version – not the best example since it's an x-ray container, but you'll get the point
Hah! That’s pretty bad. I’m seeing the same issue in my UI
new or old UI?
new
I left feedback, first time I’ve had to do that because of a UI change
I wonder if it’s because this information is a part of the task definition and they’re trying to not have it in multiple places?
but that’s how my team knows how to find the CloudWatch Log Stream associated with that task..
I do not like the new one
I can appreciate that they’re trying to streamline information, but they dropped the important bits
ok, glad I'm not the only one. Maybe there will be enough reactions to get them to fix it before it becomes the default view
the new UI also removed the links to the monitoring/logs
2021-01-15
FYI for the containers people: the last run of COM-401 “Scaling containers on AWS” is in 15-ish minutes at https://virtual.awsevents.com/media/0_6ekffvm8 There are some massive Fargate improvements that got no other announcements as far as I know
Seems quite interesting - wondering if this relates to what hashicorp has done with their nomad testing on top of AWS
morning everyone ! Quick question, is there any performance overhead/impact when live migrating an EBS volume from gp2 to gp3 ?
Hi. I have a question about CloudFront. My application is deployed on Heroku and I am using the Heroku endpoint in CloudFront. It works fine but when I try to open a page by specifying path in URL then the CloudFront URL is routed to use the Heroku endpoint. For example http://myherokuendpoint is the application link in Heroku and d829203example.cloudfront.net is my cloudfront address to access my app. When I try to access d829203example.cloudfront.net/admin it changes the address to http://myherokuendpoint/admin I tried adding origins but it did not work.
If I attach ALB link in CloudFront distribution it works fine. Is there a way I can make it work with Heroku link?
Question about egress filtering -
Ideally you don't want processes to have the ability to reach out to the internet, with the exception of specific cases like calling other AWS services, downloading yum updates, contacting other services like Trend Micro SaaS, etc. Some AWS services support in-VPC endpoints, but last I checked this only worked for some services and generally only within the same region. IP filtering seems solid but would be a huge pain to set up and maintain. DNS blocking would seem to be easier to maintain but would not prevent connections that don't require DNS.
Anyway, are there ~best practices~ recommendations for setting up egress filtering? Are there other options?
Thanks!
Tim
What’s your budget? We see well funded enterprises use Next Generation Firewalls for this (CHKP, PANW, FTNT, etc)
Another option is the new AWS Network Firewall
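For the "use VPC endpoints where they exist" piece of this, a minimal sketch (S3 gateway endpoint; VPC, route table, and bucket names are assumptions) that keeps S3 traffic off the internet path and lets you tighten what the endpoint may be used for:

```hcl
resource "aws_vpc_endpoint" "s3" {
  vpc_id            = aws_vpc.main.id                 # hypothetical VPC
  service_name      = "com.amazonaws.us-east-1.s3"    # adjust region
  vpc_endpoint_type = "Gateway"
  route_table_ids   = [aws_route_table.private.id]    # hypothetical private route table

  # Optional: restrict what the endpoint can be used for
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = "*"
      Action    = ["s3:GetObject", "s3:PutObject"]
      Resource  = ["arn:aws:s3:::my-allowed-bucket/*"] # hypothetical bucket
    }]
  })
}
```

For allow-listing arbitrary external domains, the NGFW appliances or AWS Network Firewall mentioned above (or an egress proxy) are the usual routes.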
Error: Error creating aggregator: OrganizationAccessDeniedException: This action can only be performed if you are a registered delegated administrator for AWS Config with permissions to call ListDelegatedAdministrators API.
Anyone had this before? The account I'm executing from is actually delegated as such and can call ListDelegatedAdministrators successfully.
Hey All! I'm working on editing a Cloud Posse module for our use, but I'm having a weird issue. I think I added in all the things I need and got all the variables in correctly, but now I'm getting a super generic error and I'm unclear on how to troubleshoot it. It's unfortunately not telling me at all what's wrong with the construction, and I'd love to know how to troubleshoot from here:
2021/01/15 1505 [TRACE] vertex "module.cdn.aws_cloudfront_origin_access_identity.default": visit complete
2021/01/15 1505 [TRACE] Re-validating config for "module.cdn.aws_cloudfront_distribution.default[0]"
2021/01/15 1505 [TRACE] GRPCProvider: ValidateResourceTypeConfig
2021-01-15T1505.440-0800 [INFO] plugin.terraform-provider-aws_v3.24.1_x5: [WARN] Truncating attribute path of 0 diagnostics for TypeSet (this WARN line repeats many times)
2021/01/15 1505 [TRACE] vertex "module.cdn.aws_cloudfront_distribution.default (expand)": dynamic subgraph encountered errors
2021/01/15 1505 [TRACE] dag/walk: upstream of "root" errored, so skipping
2021/01/15 1505 [INFO] backend/local: plan operation completed

Error: Required attribute is not set

on ../fork/terraform-aws-cloudfront-cdn/main.tf line 30, in resource "aws_cloudfront_distribution" "default":
30: resource "aws_cloudfront_distribution" "default" {

(the same "Required attribute is not set" error is repeated another ten or so times against the same resource block)

2021-01-15T1505.461-0800 [WARN] plugin.stdio: received EOF, stopping recv loop: err="rpc error: code = Unavailable desc = transport is closing"
2021-01-15T1505.465-0800 [DEBUG] plugin: plugin process exited: path=.terraform/providers/registry.terraform.io/hashicorp/aws/3.24.1/darwin_amd64/terraform-provider-aws_v3.24.1_x5 pid=99698
2021-01-15T1505.465-0800 [DEBUG] plugin: plugin exited
Do you have a link to the fork and the parameters of the call to it?
No, I hadn’t cleaned it up adequately yet
Was still just trying to get it to work
Is there any way to have it actually tell me what attribute isn’t set?
Not that I’m aware of
Hrm, thanks @pjaudiomv
@Liam Helmer I’ve never seen error messages like that ^
are you using terraform?
in general, all Cloud Posse modules require at least one of the following inputs: namespace, environment, stage, name
we use https://github.com/cloudposse/terraform-null-label to uniquely and consistently name all the resources
Terraform Module to define a consistent naming convention by (namespace, stage, name, [attributes]) - cloudposse/terraform-null-label
so the ID of a resource will have the format like: namespace-environment-stage-name
you can skip any of the parameters, but at least one is required
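So a minimal call to the label module itself looks something like this (values are placeholders):

```hcl
module "label" {
  source = "cloudposse/label/null" # terraform-null-label from the registry; pin a version in real use

  # at least one of namespace / environment / stage / name is required
  namespace = "eg"
  stage     = "prod"
  name      = "app"
}

output "id" {
  value = module.label.id # e.g. "eg-prod-app"
}
```

The same namespace/stage/name inputs get passed to the CDN module (or a fork of it) so every resource it creates is named consistently.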
Thanks @Andriy Knysh (Cloud Posse)! I figured it out…. It was a Canadianism: I used the word "behaviour" instead of "behavior" in a couple of places deep in the module, and that caused it to error out like that. LOL
I now have a working prototype at least, I’ll see how it goes from here!
ok, I figured it out
is a VPC name unique across regions in the same account?
e.g. can I have a dev VPC in Ireland and another dev VPC in Singapore within the same account?
The VPC name is just a tag; it's not unique in any way. It can even be the same name in the same account and the same region. There's no restriction on the name
2021-01-16
hey all, are there modules (or some best-practices docs) for creating CloudWatch alarms (use case: alarming on CPU / disk space for Apache Kafka (MSK))?
Hey, you can use https://github.com/terraform-aws-modules/terraform-aws-cloudwatch/blob/v1.3.0/examples/multiple-lambda-metric-alarm/main.tf - update the namespace and dimensions to match MSK, and read more here: https://docs.aws.amazon.com/msk/latest/developerguide/monitoring.html
Terraform module which creates Cloudwatch resources on AWS - terraform-aws-modules/terraform-aws-cloudwatch
Learn how to monitor your Amazon MSK cluster.
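A rough sketch of a per-broker disk alarm for MSK (cluster name, broker ID, threshold, and SNS topic are made up; the namespace/metric/dimension names are from the MSK monitoring docs linked above):

```hcl
resource "aws_cloudwatch_metric_alarm" "msk_disk" {
  alarm_name          = "msk-broker1-data-disk-used" # hypothetical
  namespace           = "AWS/Kafka"
  metric_name         = "KafkaDataLogsDiskUsed"
  statistic           = "Average"
  period              = 300
  evaluation_periods  = 3
  threshold           = 80 # percent, placeholder
  comparison_operator = "GreaterThanThreshold"

  dimensions = {
    "Cluster Name" = "my-msk-cluster" # hypothetical
    "Broker ID"    = "1"
  }

  alarm_actions = [aws_sns_topic.alerts.arn] # hypothetical SNS topic
}
```

A CPU alarm is the same shape with metric_name = "CpuUser" (or CpuSystem); in practice you would generate one alarm per broker with for_each rather than hand-writing them.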
2021-01-17
2021-01-18
Can I run a command on AWS Fargate when a container goes into DEACTIVATING? https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-lifecycle.html
When a task is started, either manually or as part of a service, it can pass through several states before it finishes on its own or is stopped manually. Some tasks are meant to run as batch jobs that naturally progress through from PENDING to RUNNING
Is it possible to add this to the task definition?
The only thing I've seen (not an exhaustive search) is task state change events
Do you need to run your command in the currently running container?
Container state change and task state change events contain two version fields: one in the main body of the event, and one in the detail object of the event. The following describes the differences between these two fields:
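If task state change events are enough, a minimal sketch of catching them with EventBridge (the cluster reference and Lambda target are made-up examples; the event pattern fields follow the ECS events docs):

```hcl
resource "aws_cloudwatch_event_rule" "ecs_task_stopped" {
  name = "ecs-task-stopped" # hypothetical
  event_pattern = jsonencode({
    source      = ["aws.ecs"]
    detail-type = ["ECS Task State Change"]
    detail = {
      lastStatus = ["STOPPED"]
      clusterArn = [aws_ecs_cluster.this.arn] # hypothetical cluster
    }
  })
}

resource "aws_cloudwatch_event_target" "on_stop" {
  rule = aws_cloudwatch_event_rule.ecs_task_stopped.name
  arn  = aws_lambda_function.cleanup.arn # hypothetical cleanup function
  # plus an aws_lambda_permission allowing events.amazonaws.com, omitted here
}
```

Note this runs cleanup outside the container; it won't execute a command inside the container that is shutting down. For that, the usual option is handling SIGTERM in the container's entrypoint during the stop timeout window.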
Hi, I have a query about rsync. I have an EC2 server with 3 different app folders, 2 GB each (master server). I want to clone and maintain these folders on multiple EC2 instances, let's say 5. If I set up a cron job to rsync every hour, does it bottleneck my local bandwidth? And does it make my CPU utilisation high?
can't, it has Python code so the whole process becomes slow
well, Python loads the .pyc files into memory once, so it should not be slow
the data for whatever you are running can be on a local disk, but the code can be in EFS
so my framework is a bit old; when I update something it checks all the files and requirements, which makes the whole process slow
use CodeDeploy in that case if you are deploying code and such
but rsync is network/IO dependent, not CPU dependent
i am using EFS for now; my site goes down for 20 min when I update. i want to explore other options
CodeDeploy is an AWS product that does code deploys
you could use EFS or a ramdisk to load your code if it is that IO dependent
yes, but my framework does all the checks when we update the code
then check CodeDeploy, it is like a fancy rsync
where EFS struggles is the latency on reading lots of small files. any forking processes which have to reload those files will suffer. anything on the front-end will likely see long request delays when loading those files for a new process (recycling a process handling requests).
as an alternative, you should also investigate the use of containers. the layers in the container are stored as larger files, so deployments from a container would be less of a concern if pulling the images locally to the ec2 instance and running them.
if you don’t need to manage that process yourself, look at ECS, EKS, or Fargate for a managed container cluster framework.
you would integrate the pushing and pulling of images with either your CI (push early) or CD pipeline (either early or on-demand).
if you don’t want to go the route of containers, you might also consider distributing files via tarball instead. in the ruby world, one might use something like capistrano to orchestrate the process.
some ideas to get you thinking about it…
https://blog.nuventure.in/2020/02/19/auto-deploying-a-django-app-using-capistrano-with-gitlab-ci-cd/
Yes, you read that right. We are using Capistrano - a very popular application deployment tool that is written in Ruby to deploy a Python Django app. In
Capistrano style deployments with fabric. Contribute to dlapiduz/fabistrano development by creating an account on GitHub.
i would seriously consider containers though. while a little extra overhead in getting up to speed, deployments and testing across your dev >> test >> staging >> production pipeline is a lot easier when they’re all guaranteed to run the same container image / code.
yes, small files in EFS are not the best (NFSv4), but how big is your deployment? How big could the code base be? At any rate, the shared volume can be used to share the code and then copy it locally to a disk or ramdisk, etc.
I used Capistrano for years; CodeDeploy is basically Capistrano
@jose.amengual: EFS also has additional write latency (multiple-AZ synchronous write), so things like rsync can take a lot longer than self-managed NFS.
I understand, but these techniques have been used for many years, 20+ years… this is nothing new; Capistrano must be 10+ years old. Don't get me wrong, I think containers are the way to go, but if you really need to share code as "here is the artifact, now deploy it", there is nothing too bad about NFS
you can download the artifact from S3 too (which will be faster)
2021-01-19
Morning everyone ! Any experience migrating EBS volumes from gp2 to gp3 ? I have a large Kafka cluster with huge (1.5 TB) EBS volumes attached to every single broker
Be sure to read up on gp3 before you migrate.
While they offer higher potential IOPS for lower capacity volumes, at a slightly lower cost, the tradeoff is higher latency.
Yes @Ives Stoddard I’ve been reading a lot about gp3
How much is the difference in the latency ?
Just recently, I found out that AWS has introduced a new type of Elastic Block Storage called gp3 in addition to the popular gp2 volume…
Hmmmm… thanks for the info !!!
for most workloads, this likely isn’t an issue.
not all that glitters is gold
It’s Kafka…. I need to read
but if you’re pushing around millions / billions of files, that latency can add up.
Kafka is I/O sensitive
as with all things at this layer, it’s usually best to do some load testing with your applications and workflows.
different patterns and workloads can have very different requirements.
Absolutely right… well… what I like about AWS is that I can go back to gp2, if needed
are you using self-managed kafka, or one of the managed services?
self-managed Kafka
with the newer volume types, you may be able to switch on the fly. if for some reason that ends up being a problem, you can always swap them via LVM volume replacement.
if you have multiple nodes, you can try running on gp3 for a while and see if there is a measurable impact to performance.
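If the broker volumes are managed in Terraform, the in-place switch is just a type change (gp3 also lets you set IOPS and throughput explicitly); the values below are placeholders for one broker volume:

```hcl
resource "aws_ebs_volume" "kafka_data" {
  availability_zone = "eu-west-1a" # hypothetical AZ
  size              = 1536         # GiB, roughly matching the ~1.5 TB broker volumes

  type       = "gp3"
  iops       = 6000 # gp3 baseline is 3000; tune per broker workload
  throughput = 500  # MiB/s; gp3 baseline is 125
}
```

The same change can be made outside Terraform with `aws ec2 modify-volume` while the volume stays attached, then the state updated to match.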
Right… we have like 50 brokers … so I will try on some of them ….
depending on your write / read throughput, and application sensitivity to latency, gp3 might work just fine.
latency of 1-2 ms on the tail end of a request is negligible, whereas stacking latency for rsync of 10 million files would be a different story.
for example, an additional 2 ms delay on 10 million file rsync could add up to 5.5 hours in latency (single process).
whereas a 2ms delay on asynchronous event fired from a front-end user request wouldn’t be perceptible to a user.
(or closer to 4ms for write + read delay, in the event of semi-synchronous events, like thumbnail generation on upload, or comment publishing, etc.)
Wow… I really appreciate all your recommendations on this matter ….
I will proceed with caution
one other area to be mindful of is local EC2 root volumes. if you leverage swap at all, that latency might slow down memory. likely not an issue for Kafka, as your JVM is likely configured with Xmx.
Right…. with Kafka we avoid using SWAP at all ….
We have instances with lots of RAM so, that should not be an issue
in those cases, consider ephemeral instance storage for swap.
good luck.
Thanks @Ives Stoddard !!!! Have a great day !
2021-01-20
Hello everyone, I am trying to save the output of the Terraformed EKS cluster into a JSON file
do you mean to save the terraform outputs in a JSON format from a root module that provisions an eks cluster?
asking this here 'cause we didn't get to it in office hours: any tips for improving global S3 upload speed? (think India, Hong Kong, etc.) What other optimizations could I possibly make after turning on S3 transfer acceleration and using multipart uploads?
is this from an app or scripting? are you parallelizing the upload processes? is that possible in your app/script? not sure if you are talking about 1 large file or many smaller ones. multipart + multiple upload workers might help in some scenarios
https://pypi.org/project/s3-parallel-put/ or https://netdevops.me/2018/uploading-multiple-files-to-aws-s3-in-parallel/
Have you ever tried to upload thousands of small/medium files to the AWS S3? If you had, you might also noticed ridiculously slow upload speeds when the upload was triggered through the AWS Management Console. Recently I tried to upload 4k html files and was immediately discouraged by the progress reported by the AWS Console upload manager. It was something close to the 0.5% per 10s. Clearly, the choke point was the network (as usual, brothers!). Comer here, Google, we need to find a better way to handle this kind of an upload.
from a web app
not exactly large files. a 20MB file can take minutes to upload for users in Bangalore.
from my experience, using custom Python boto-based code for downloading was never even close to the speed of the CLI's aws s3 cp ... - so maybe try uploading using the AWS CLI as a test first?
multipart uploads and transfer acceleration are the easy ones, which you already have turned on. the only other things I can think of: (1) uploading to a regional s3 bucket and replicating the files back to the central/main bucket or (2) having another intermediary service that would first consume the upload and then put to where you wanted it e.g. write your own service that runs on some regional infra that closer to the enduser (ec2/fargate) or like https://www.dataexpedition.com/clouddat/aws/. i think in my head that the upload times would appear faster to the end user, but over all processing time of getting the file to where it needs to go might be longer if there’s more steps in the process to get the files to the workers that will actually do something with the file.
Also, did you play around with the chunksize? https://boto3.amazonaws.com/v1/documentation/api/latest/reference/customizations/s3.html - multipart_chunksize – The partition size of each part for a multipart transfer.
if you have a 20MB file, you’ll only get 3 chunks with the default of 8mb. if you go to the 5MB min, maybe you’ll get another chunk or two uploading in parallel.
looks like you can play around with the s3 API with the AWS cli config, if you want to try it out before modifying the app:
[profile development]
aws_access_key_id=foo
aws_secret_access_key=bar
s3 =
max_concurrent_requests = 20
max_queue_size = 10000
multipart_threshold = 64MB
multipart_chunksize = 16MB
max_bandwidth = 50MB/s
use_accelerate_endpoint = true
addressing_style = path
yeah thanks @Jonathan Le, I'll play around w/ the chunk size default first. Are you also solving this problem? How are you actually testing that it's improving speed? To test, we RDP into an Azure VM (in Asia or Europe) and try uploading from the VM. Unfortunately even testing that route hasn't been giving us "realistic" timing (it's much faster) compared to our actual clients in those areas. My guess is because Azure servers might still have preferential routing to AWS servers compared to someone w/ their ISP at home.
Most of the time, upload speed is more to do with the client networks than yours. If you have access at that end, then tcp window scaling and QOS can help, depending on the specifics of their setups.
@btai i had to deal with an analogous issue a couple of years ago (before transfer acceleration even), but it wasn't the same issue. my problem was needing to optimize a data pipeline replicating terabytes of data across regions for processes on a routine basis.
testing and getting realistic results will be hard, esp. with what @Liam Helmer brought up. running an end-user test on optimized cloud networks won't be 100% realistic… my guess is that you'll need to get friendly with a couple of end users that can do a small number of localized before-and-after tests and take some measurements. probably not the best, but sometimes you work with the hand you're dealt.
Yeah, I’ve reached out to some of our global customers that we have a good relationship with and so I’ll do that and try to get numbers as a pulse check. I didn’t think so, but I was curious if anyone else had other clever ways of testing.
Oh. Hop on fiverr.com or something like that. create some disposable 1 day test accounts. Maybe there are some affordable software testers in the regions you need on home based internet and devices.
@Jonathan Le thats a great idea! im gonna run it by our engineering leadership
You can thank a tiktoker and home gym deadlifts and TGIF for that idea!
2021-01-21
2021-01-22
2021-01-26
Hi everyone! I am deploying a Rails application in ECS. The application only allows access from a specified hostname. I am stuck on getting the health check to pass. Health checks keep failing, while the application works fine when accessed via the load balancer (because I passed the LB hostname as an environment variable). I am assuming that, for the health check, the target group hits the instance IP; since that IP is not an allowed host, the health check fails. I cannot specify the instance IP the way I specified the load balancer, because we cannot get the instance IP from the launch configuration. Is there any way to tackle this?
What’s the advantage of using S3 SSE? is there an attack vector that it prevents?
This was just discussed in another forum actually: (different Slack workspace)
with terraform, is it possible to create an aws_secretsmanager_secret_version resource that will merge its values with the current aws_secretsmanager_secret_version (only if one exists)?
data "aws_secretsmanager_secret_version" "current_secret" {
secret_id = module.module_that_creates_a_secret.secret_id
}
resource "aws_secretsmanager_secret_version" "merging_secret_version" {
secret_id = module.module_that_creates_a_secret.secret_name
secret_string = jsonencode(
merge(
jsondecode(data.aws_secretsmanager_secret_version.current_secret.secret_string),
{
SECRET_TO_ADD = "A new secret"
}
)
)
}
Gets me to a point where it will merge when a current version exists, but it errors if one doesn't, with Error: Secrets Manager Secret "the_secret_name" Version "AWSCURRENT" not found
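One hedged workaround, assuming you can tell Terraform whether a version exists yet (via a variable, or an output from the module that creates the secret): gate the data source with count and fall back to an empty map when there is nothing to merge.

```hcl
variable "secret_has_current_version" {
  type    = bool
  default = false # flip to true once an AWSCURRENT version exists
}

data "aws_secretsmanager_secret_version" "current_secret" {
  count     = var.secret_has_current_version ? 1 : 0
  secret_id = module.module_that_creates_a_secret.secret_id
}

resource "aws_secretsmanager_secret_version" "merging_secret_version" {
  secret_id = module.module_that_creates_a_secret.secret_name
  secret_string = jsonencode(
    merge(
      # try() returns {} when the data source was not created
      try(jsondecode(data.aws_secretsmanager_secret_version.current_secret[0].secret_string), {}),
      {
        SECRET_TO_ADD = "A new secret"
      }
    )
  )
}
```

Caveat: the flag has to be managed by hand, and because the data source is read at plan time, anything else writing versions to the same secret can still fight with this resource.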
Anyone know if cloudwatch captures metrics around “Total # of EC2 instances running over time (per region)”?
aws cloudwatch list-metrics --namespace AWS/Usage --metric-name ResourceCount --dimensions Name=Service,Value=EC2
– I don’t believe that can be filtered to Region but I’m probably wrong as my CloudWatch metrics knowledge is weak sauce
There's also a service-limits metric extracted that gives you percent utilization for instances in a region, FYI. I thought that was cool, as I got the historical instance count backfilled 15 months by looking at that number.
People running SQL Server on RDS - how do you handle backups? We need longer retention than automated snapshots allow, but want to retain point-in-time recovery, so we can't rely only on native backup to S3. Was looking at AWS Backup, but that seems expensive compared to S3 storage. Also, the database size is over 4 TB, so it's a lot of bits to be pumping around making mistakes.
I'd recommend keeping it simple with AWS Backup. If that doesn't provide what you need, then you'll probably want to look at a third-party vendor service that does RDS snapshot management, but it's probably going to be replicating something very similar to what AWS Backup already offers you.
What are your retention requirements from the business, and if longer than 35 days, have they considered GDPR? Having a 35-day cleanup sorta makes adherence almost automatic with minimal fuss.
no need for GDPR - .au fintech company servicing only local
so long term retention is important
What's long term? Also, is pricing for AWS Backup the problem? I thought AWS Backup was free except for storage, which is all just S3 anyway. You could have it lifecycle to IA or Glacier for long-term if you needed to as well.
it's snapshot storage, which is ~3x the price of S3 storage
$0.095/Gb/month
after crunching the numbers a bit more we’re going to go with AWS backup because I think we’ll end up saving money due to the way snapshots work with just taking diffs
it’s a really difficult pricing model to estimate when you’re dealing with TB scale dbs tho
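For reference, a rough sketch of an AWS Backup plan with longer retention for an RDS instance (vault name, schedule, retention, IAM role, and instance reference are all placeholders):

```hcl
resource "aws_backup_vault" "rds" {
  name = "rds-long-term" # hypothetical
}

resource "aws_backup_plan" "rds" {
  name = "rds-long-term"

  rule {
    rule_name         = "daily"
    target_vault_name = aws_backup_vault.rds.name
    schedule          = "cron(0 17 * * ? *)" # daily, placeholder time

    lifecycle {
      delete_after = 365 # days, placeholder retention
    }
  }
}

resource "aws_backup_selection" "rds" {
  name         = "sql-server"
  plan_id      = aws_backup_plan.rds.id
  iam_role_arn = aws_iam_role.backup.arn      # hypothetical role AWS Backup can assume
  resources    = [aws_db_instance.mssql.arn]  # hypothetical RDS instance
}
```

Point-in-time recovery still comes from the native automated backups (max 35 days); the plan above only covers the longer-retention snapshots.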
TIL that supported DBs can export the data out as well.
Export an Amazon RDS database snapshot.
yeh, not for SQL server unfortunately tho
I'm dealing more with EC2 self-managed instances, so it's easier and harder depending on the point of view. I've left default backups going as-is for now, but would have to revisit AWS Backup later if we need longer retention too. Good to know! Keep us updated if any interesting facts come up
2021-01-27
2021-01-28
Hi All, glad to have been invited to the channel. I think the work you guys do is great and I have used some of the modules in the past - keep up the great work. I wonder if anyone can help me with a problem I have: I am trying to serve multiple React apps from the same S3 bucket, and we have a single domain, i.e. my.example.com.
I have configured a CDN with an S3 origin, with a policy that allows the CDN to access S3 objects, and the S3 bucket is configured for static website hosting. The default page to serve is set to index.html, as is the default error page. The bucket then hosts 2 React applications under 2 different paths, say /App1 and /App2.
The CDN has been set up with origins and path patterns of /App1/* and /App2/*, plus /api/* which points to an API Gateway. The Default Root Object is not set, which should defer serving the root document to S3 under the origin that has been set with the path, i.e. mybucket.s3.eu-west-2.amazonaws.com/App1, mybucket.s3.eu-west-2.amazonaws.com/App2 or mybucket.s3.eu-west-2.amazonaws.com/api. The behaviours are: path pattern /App1/* => origin mybucket.s3.eu-west-2.amazonaws.com/App1; path_pattern /App2/* => origin mybucket.s3.eu-west-2.amazonaws.com/App2; path_pattern /api/* => origin [mybucket.s3.eu-west-2.amazonaws.com/api/](http://mybucket.s3.eu-west-2.amazonaws.com/api/); with the default path_pattern pointing to the origin mybucket.s3.eu-west-2.amazonaws.com/App1, in that order in the behaviours.
Whenever I request anything from my.example.com/App2/, the default behaviour is applied, and the same goes for /App1/. In fact every page request returns the same index.html, no matter what I ask for, whether it be an image.png or blah.js. The files exist under /App1 and /App2 in the S3 bucket. But mybucket.s3.eu-west-2.amazonaws.com/api/someapi/ seems to work fine. There is no index.html at the root of the S3 bucket, and all files are in either /App1/ or /App2/ in the bucket.
Has anyone done this before, or know of a way I can get this to work? Note: everything has to be served under a single domain. Any help would be welcome.
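Not a fix for the behaviour described, but for comparison, a stripped-down sketch of that multi-origin / path-pattern layout in Terraform (bucket, OAI, and cache settings are placeholders). One thing to watch, hedged: CloudFront prepends origin_path to the full request path, so with origin_path = "/App1" a request for /App1/index.html is fetched from the bucket key App1/App1/index.html, which could be why everything falls back to the error document.

```hcl
resource "aws_cloudfront_distribution" "apps" {
  enabled = true

  origin {
    origin_id   = "app1"
    domain_name = "mybucket.s3.eu-west-2.amazonaws.com"
    origin_path = "/App1"
    s3_origin_config {
      origin_access_identity = aws_cloudfront_origin_access_identity.this.cloudfront_access_identity_path # hypothetical OAI
    }
  }

  origin {
    origin_id   = "app2"
    domain_name = "mybucket.s3.eu-west-2.amazonaws.com"
    origin_path = "/App2"
    s3_origin_config {
      origin_access_identity = aws_cloudfront_origin_access_identity.this.cloudfront_access_identity_path
    }
  }

  ordered_cache_behavior {
    path_pattern           = "/App2/*"
    target_origin_id       = "app2"
    allowed_methods        = ["GET", "HEAD"]
    cached_methods         = ["GET", "HEAD"]
    viewer_protocol_policy = "redirect-to-https"
    forwarded_values {
      query_string = false
      cookies { forward = "none" }
    }
  }

  default_cache_behavior {
    target_origin_id       = "app1"
    allowed_methods        = ["GET", "HEAD"]
    cached_methods         = ["GET", "HEAD"]
    viewer_protocol_policy = "redirect-to-https"
    forwarded_values {
      query_string = false
      cookies { forward = "none" }
    }
  }

  restrictions {
    geo_restriction { restriction_type = "none" }
  }

  viewer_certificate {
    cloudfront_default_certificate = true # placeholder; an ACM cert is needed for my.example.com
  }
}
```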
attention datadog pros
I need to distribute Datadog across a fleet of machines. I can do this with SSM. I use Chocolatey for Windows.
However, Chef/Puppet have modules for this that help configure the various YAML files on demand. I'm wondering if Chef, Puppet, or Salt have any simple-to-implement "masterless" approach like Ansible, so I can implement this with minimal fuss for a single package like this and leverage their config options.
The problem isn't the install… it's the config. Do I just have one giant config directory that everything uses, including all possible log paths, or will this be throwing errors and causing needless failed checks constantly?
If I have to customize the collector per type of instance, then I'm back to scripting a config and flipping from JSON to YAML. I don't mind, but if I can avoid this extra programmatic overhead and maintenance I want to figure that out.
i use powershell and https://github.com/cloudbase/powershell-yaml
PowerShell CmdLets for YAML format manipulation. Contribute to cloudbase/powershell-yaml development by creating an account on GitHub.
I have a finished blog post on an approach like this. I’m hoping though to NOT do this as I really want to avoid programmatic creation so others can just manage a yaml file if possible. Great minds think alike
Anyone use aws-vault + Docker? I can normally mount my Codespaces container with AWS creds if I use the plain-text file. I prefer aws-vault but it becomes problematic with this Docker-based work environment.
thinking maybe I could use a file backend, mount this, and then install aws-vault in the container to accomplish this?
I prefer aws-vault but since I’m remote, no one has access to my machine it might be just easier to stick with plain text cred file
DevSecOps tools container, for use in local development and as a builder/runner in CI/CD pipelines. Not to be used to run production workloads. - saic-oss/anvil
We also recently came up with a pattern that uses docker-compose that is working very well. Not open sourced yet but it should be soon
the trick is to make sure the 4 AWS_* environment variables get passed through
Then you’re good
Does anyone know how to programmatically get the permissions needed for a given action? E.g. here is a bunch of actions (which the AWS docs also refer to as operations) and the corresponding permissions:
s3:HeadBucket -> s3:ListBucket
s3:HeadObject -> s3:GetObject, s3:ListBucket
s3:GetBucketEncryption -> s3:GetEncryptionConfiguration
s3:GetBucketLifecycleConfiguration -> s3:GetLifecycleConfiguration
s3:GetObjectLockConfiguration -> docs don't say
Surely there is a table or an AWS CLI command to get this mapping? E.g. something like aws iam get-permissions --action s3:HeadBucket and the output would be s3:ListBucket.
This is the worst. Following. Just been through down-scoping Packer. Both the encoded error and CloudTrail, complete with its 30m+ lag, were hopeless.
#vent
The eternal struggle. I think Duo Labs has some tooling like pmapper/cloudmapper, but nothing is perfect :-)
s3 is the only service i’ve seen that doesn’t line up 1:1, which i assume is because it’s probably their oldest api
I saw many others not line up. I stumbled into this issue because I did a grep of a terraform plan for the actions it would take, and after adding them to the policy found that many (about half) did not match a permission. However, this was a quick experiment; maybe I got unlucky on the first try
Did you give CloudMapper a shot? https://github.com/duo-labs/cloudmapper It does a pretty decent job of helping parse those out. Basically start with an elevated account, perform the action and the teardown, and see what it requested. Then you can deploy a more narrowly privileged service user.
At least I found it reasonable. It uses Athena and builds this history from CloudTrail.
CloudMapper helps you analyze your Amazon Web Services (AWS) environments. - duo-labs/cloudmapper
@sheldonh Interesting! So I would use the collect command of cloudmapper and then process the JSON? What would I look for in there, and is there a way of only collecting IAM info? Otherwise the file will be humongous.
Sorry mixing up repos.
https://engineering.salesforce.com/salesforce-cloud-security-automating-least-privilege-in-aws-iam-with-policy-sentry-b04fe457b8dc And cloudtracker is what I used Maybe explore policy sentry too.
Use this open source tool to achieve least privilege at scale.
@sheldonh cloudtracker sounds awesome I’ll give it a shot, thanks so much!