#aws (2023-05)

aws Discussion related to Amazon Web Services (AWS)

Archive: https://archive.sweetops.com/aws/

2023-05-01

SaZaf avatar

Hi guys, I just learned about CloudWatch. I found that it uses metrics like CPU usage, disk usage, etc., but I also noticed that EC2 displays many metrics in its Monitoring section.

Question: Do we use CloudWatch with EC2 despite EC2 already providing useful analytics/monitoring? If yes, please share the use cases.

loren avatar

the ec2 metrics are really cloudwatch metrics under the covers

Mark Owusu Ayim avatar
Mark Owusu Ayim

Very true. That is, unless you want to create your own insights (dashboards) from all your instances, accessed from one place; then it makes more sense to get the extra features from CloudWatch, such as events for actionable purposes.
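
For illustration, a common next step beyond the built-in EC2 graphs is wiring a CloudWatch alarm to those same metrics. A minimal AWS CLI sketch (the instance ID and SNS topic ARN are hypothetical):

aws cloudwatch put-metric-alarm \
  --alarm-name ec2-high-cpu \
  --namespace AWS/EC2 \
  --metric-name CPUUtilization \
  --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
  --statistic Average \
  --period 300 \
  --evaluation-periods 2 \
  --threshold 80 \
  --comparison-operator GreaterThanThreshold \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:ops-alerts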

2023-05-02

Balazs Varga avatar
Balazs Varga

hello all, in Aurora Serverless I see my CPUCreditBalance dropped to 0 after a recovery triggered by AWS. Does it count the same way as for EC2 T instances? https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/unlimited-mode-examples.html

Unlimited mode examples - Amazon Elastic Compute Cloud

The following examples explain credit use for instances that are configured as unlimited.

Paula avatar

Hi! I'm using this module https://registry.terraform.io/modules/cloudposse/ecs-alb-service-task/aws/0.68.0 at version 0.66.2, and I'm trying to update to the latest version, because every time I change an environment variable I have to delete the service and recreate it, since it doesn't take the latest task definition (generated by CodePipeline) to create the new one. I tried using redeploy_on_apply but I couldn't find any configuration which correctly takes the latest. My configuration looks like this:

module "ecs_alb_service_task" {
  source  = "cloudposse/ecs-alb-service-task/aws"
  version = "0.66.2"
  namespace                          = var.cluster_name
  stage                              = module.global_settings.environment
  name                               = local.project_name
  attributes                         = []
  container_definition_json          = module.container_definition.sensitive_json_map_encoded_list

  #Load Balancer
  alb_security_group                 = var.security_group_id
  ecs_load_balancers                 = local.ecs_load_balancer_internal_config

  #Capacity Provider Strategy 
  capacity_provider_strategies       = var.capacity_provider_strategies
  desired_count                      = 1
  ignore_changes_desired_count       = true
  launch_type                        = module.global_settings.default_ecs_launch_type

  #VPC
  vpc_id                             = var.vpc_id
  subnet_ids                         = var.subnet_ids
  assign_public_ip                   = module.global_settings.default_assign_public_ip
  network_mode                       = "awsvpc"

  ecs_cluster_arn                    = var.cluster_arn
  security_group_ids                 = [var.security_group_id]
  ignore_changes_task_definition     = true
  force_new_deployment               = true
  health_check_grace_period_seconds  = 200
  deployment_minimum_healthy_percent = module.global_settings.default_deployment_minimum_healthy_percent
  deployment_maximum_percent         = module.global_settings.default_deployment_maximum_percent
  deployment_controller_type         = module.global_settings.default_deployment_controller_type
  task_memory                        = local.task_memory
  task_cpu                           = local.task_cpu
  ordered_placement_strategy         = local.ordered_placement_strategy

  label_order                        = local.label_order
  labels_as_tags                     = local.labels_as_tags
  propagate_tags                     = local.propagate_tags
  tags                               = merge(var.tags, local.tags)

  #ECS Service
  task_exec_role_arn                 = [module.task_excecution_role.task_excecution_role_arn]
  task_role_arn                      = [module.task_excecution_role.task_excecution_role_arn]

  depends_on = [
    module.alb_ingress
  ]
}

any suggestions?

JoseF avatar

You are basically saying ignore_changes_task_definition = true, meaning: don't pick up future updates. It should be false.

Paula avatar

When I activate that option it tries to delete the service and recreate it with an older version of the task definition

JoseF avatar

Then your problem is not the task definition, since it's supposed to use the latest version. It's somewhere else. I don't see redeploy_on_apply in your config, which fulfills that purpose.

Paula avatar

I'm currently modifying my original module (at version 0.66.2) as I showed before. I guessed that redeploy_on_apply in the latest version of this module would take the latest version of the task definition and update it in the state file. I'm not sure what's wrong.

JoseF avatar

One thing is different from the other. redeploy_on_apply does not update the module version itself; it deploys a new task definition revision if one is detected in the cluster. Two different things.

Paula avatar

No no, I know, but even if I upgrade the version and activate that option, the task definition is not the latest. The picture I sent before is from the 0.66.2 version with ignore_changes_task_definition = false. The next pictures are from v0.68 with redeploy_on_apply = true and ignore_changes_task_definition = false.

Paula avatar

I tried different configurations of ignore_changes_task_definition, force_new_deployment and redeploy_on_apply, and none of them works.

Fizz avatar

If you change ‘ignore_changes_task_definition’ from true to false you should expect the service to be destroyed the first time due to the way the service is coded in the module. A second run should not require a destroy.

Fizz avatar

The reason it picks up an older version of your task definition is because that is all Terraform knows about. You have updated the task definition outside of Terraform, in CodePipeline.

Fizz avatar

If you are going to manage revisions in CodePipeline, you could pass in the correct task definition family and revision via the variable var.task_definition.
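
For example, a data source can resolve the latest ACTIVE revision at plan time. A rough sketch, assuming a hypothetical family name and that your module version accepts a family:revision string via var.task_definition (check the module docs for the exact type):

# resolves to the latest ACTIVE revision, including ones registered by CodePipeline
data "aws_ecs_task_definition" "latest" {
  task_definition = "myproject-web" # hypothetical family name
}

module "ecs_alb_service_task" {
  # ... existing configuration ...
  task_definition = "${data.aws_ecs_task_definition.latest.family}:${data.aws_ecs_task_definition.latest.revision}"
}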

2023-05-03

Bart Coddens avatar
Bart Coddens

I have a customer with a huge Oracle database: 120 TB. The limit on RDS is 64 TB. Any suggestions?

Fizz avatar

Sharding or self hosting on ec2

Hugo Samayoa avatar
Hugo Samayoa

Also talk to an AWS rep. You might get some free credits for moving such a large dataset. They would also give you some advice on your current issue

1
jsreed avatar

AWS will give your customer free credits and help cover the costs of converting off Oracle… talk to the TAM

1
jsreed avatar

Otherwise ec2 self host or sharding

Balazs Varga avatar
Balazs Varga

Is Aurora Serverless v1 HA-compatible?

ccastrapel avatar
ccastrapel

Hi there, I wrote a blog post that y’all may be interested in. It discusses how to manage cross-account AWS IAM permissions for different teams with an open-source Python tool called IAMbic. Would love feedback!

https://www.noq.dev/blog/aws-permission-bouncers-letting-loose-in-dev-keeping-it-tight-in-prod

Noq: AWS Permission Bouncers: Letting Loose in Dev, Keeping it Tight in Prod

Ever had a slight configuration change take down production services? Wish you could give teams more AWS permissions in dev/test accounts, but less in production? Right sizing IAM policies for each team and account can be a tedious task, especially as your environment grows. In this post, we’ll explore how IAMbic brings order to multi-account AWS IAM chaos.

2023-05-04

Alex Atkinson avatar
Alex Atkinson

For AWS Identity center, is there a way to see which accounts a group has access to via the cli? There’s no way in the console afaict.

Soren Jensen avatar
Soren Jensen

Not as far as I know. It's such a missing feature.

Alex Atkinson avatar
Alex Atkinson

OK. Was just making sure I didn’t just miss it somehow.
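
For what it's worth, you can approximate it by walking the APIs yourself. A rough bash sketch (pagination is ignored; the instance ARN and group ID placeholders are hypothetical):

# print every account / permission set assigned to a given Identity Center group
INSTANCE_ARN="arn:aws:sso:::instance/ssoins-EXAMPLE"  # from: aws sso-admin list-instances
GROUP_ID="<group-guid>"                               # from: aws identitystore list-groups

for ACCOUNT in $(aws organizations list-accounts --query 'Accounts[].Id' --output text); do
  for PS in $(aws sso-admin list-permission-sets --instance-arn "$INSTANCE_ARN" \
      --query 'PermissionSets[]' --output text); do
    aws sso-admin list-account-assignments \
      --instance-arn "$INSTANCE_ARN" --account-id "$ACCOUNT" --permission-set-arn "$PS" \
      --query "AccountAssignments[?PrincipalType=='GROUP' && PrincipalId=='$GROUP_ID'].[AccountId,PermissionSetArn]" \
      --output text
  done
done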

2023-05-05

Matt Gowie avatar
Matt Gowie

Does anyone know of any tools that will scan a set of AWS accounts for best practices? Any that are recommended? My company has a list of 40+ best practices that we’ve identified and I’m looking for solutions to quickly check these best practices against a set of accounts or AWS organization.

bradym avatar

I haven’t used it myself yet, but I think https://github.com/cloud-custodian/cloud-custodian sounds like what you’re looking for.

cloud-custodian/cloud-custodian

Rules engine for cloud security, cost optimization, and governance, DSL in yaml for policies to query, filter, and take actions on resources

loren avatar

this is a nice project maintaining a list of various tools… https://github.com/toniblyx/my-arsenal-of-aws-security-tools

toniblyx/my-arsenal-of-aws-security-tools

List of open source tools for AWS security: defensive, offensive, auditing, DFIR, etc.

1
loren avatar

if i were to start with just one tool for checking against “best practices”, it would probably be prowler https://github.com/prowler-cloud/prowler

prowler-cloud/prowler

Prowler is an Open Source Security tool for AWS, Azure and GCP to perform Cloud Security best practices assessments, audits, incident response, compliance, continuous monitoring, hardening and forensics readiness. It contains hundreds of controls covering CIS, PCI-DSS, ISO27001, GDPR, HIPAA, FFIEC, SOC2, AWS FTR, ENS and custom security frameworks.

loren avatar

ElectricEye is another great one… https://github.com/jonrau1/ElectricEye

jonrau1/ElectricEye

ElectricEye is a multi-cloud, multi-SaaS Python CLI tool for Cloud Asset Management (CAM), Cloud Security Posture Management (CSPM), SaaS Security Posture Management (SSPM), and External Attack Surface Management (EASM) supporting 100s of services and evaluations to harden your public cloud & SaaS environments.

Hao Wang avatar
Hao Wang

yeah custodian is a good one

Hao Wang avatar
Hao Wang

the others are also interesting projects, thanks

Matt Gowie avatar
Matt Gowie

Good stuff – Thank you folks.

Sudhish KR avatar
Sudhish KR

If you are looking for a SaaS solution, I would go with Aqua Security… they bought a company called CloudSploit a few years ago, and they have a good level of reporting/remediation steps for issues that are detected.

2023-05-08

venkata.mutyala avatar
venkata.mutyala

Just an FYI - if you plan to upgrade to the latest EBS CSI add-on for EKS (v1.18.0-eksbuild.1) you may want to wait: https://github.com/kubernetes-sigs/aws-ebs-csi-driver/issues/1591

We use kube-prometheus-stack and have a CPUThrottling alert going off since our upgrade.

#1591 CPU Spike

/kind bug

What happened?

Upgraded EBS Addon in EKS and CPU usage of the node daemonsets spiked

What you expected to happen?

literally no change to happen

How to reproduce it (as minimally and precisely as possible)?

eksctl update addon --name aws-ebs-csi-driver --version latest \
  --cluster ${CLUSTER_NAME} \
  --service-account-role-arn arn:aws:iam::${IHSM_ARN}:role/AmazonEKS_EBS_CSI_DriverRole_${CLUSTER_NAME} \
  --force

To roll back to a non spiking version:

eksctl update addon --name aws-ebs-csi-driver --version v1.17.0-eksbuild.1 \
  --cluster ${CLUSTER_NAME} \
  --service-account-role-arn arn:aws:iam::${IHSM_ARN}:role/AmazonEKS_EBS_CSI_DriverRole_${CLUSTER_NAME} \
  --force
1

2023-05-10

Balazs Varga avatar
Balazs Varga

we have clusters using spot instances, and we use the cluster autoscaler. Sometimes we see 504s. I found a few issues on the autoscaler GitHub page. How can I avoid 504s when the autoscaler scales down instances?

Fizz avatar

Try karpenter? It drains first before removing nodes.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Yea, might help with Karpenter, but unless the services are deliberately removed from the ALB, you’ll still get 504s.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

I am not sure if karpenter does that.

Balazs Varga avatar
Balazs Varga

We solved it with annotations. Once a node is cordoned, it is removed from the LB, and if the cordon reason is rebalancing, then after the node is deleted from the cluster it waits another 90 sec before termination. After this we don't see 504 errors in the LB logs

1
Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

@Balazs Varga just to clarify, you cordoned the nodes with an annotation similar to this? or some other annotation?

kubectl annotate node <node-name> eks.amazonaws.com/cordon=true
Balazs Varga avatar
Balazs Varga

The AWS node termination handler cordoned the node automatically. We enabled the rebalance watch option in the past.

Balazs Varga avatar
Balazs Varga

I am with my laptop, but later I will check the exact annotation

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

That would be great, as I think I am not aware of this option.

Balazs Varga avatar
Balazs Varga

so on the svc we use the following: with this we can close open connections before termination.

Balazs Varga avatar
Balazs Varga

and on node termination handler we use the following options:

• enableRebalanceDraining: true

• enableSpotInterruptionDraining: true

• enableScheduledEventDraining: true
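
For reference, these correspond to values of the aws-node-termination-handler Helm chart; a sketch of setting them, assuming the eks-charts repo (https://aws.github.io/eks-charts) is already added:

helm upgrade --install aws-node-termination-handler \
  eks/aws-node-termination-handler \
  --namespace kube-system \
  --set enableRebalanceDraining=true \
  --set enableSpotInterruptionDraining=true \
  --set enableScheduledEventDraining=true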

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Aha, though please note the node termination handler is now redundant with functionality in Karpenter, and is no longer recommended to be deployed alongside it. That said, I don't know if those features are available in Karpenter. Thanks for sharing!

https://aws.github.io/aws-eks-best-practices/karpenter/#enable-interruption-handling-when-using-spot

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)
FAQs

Review Karpenter Frequently Asked Questions

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

@Dan Miller (Cloud Posse) heads up

Balazs Varga avatar
Balazs Varga

I think we cannot use karpenter because of limitations with kops. we create our clusters using kops

2
Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Aha!

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Yes, not compatible

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Probably

Balazs Varga avatar
Balazs Varga

maybe later .

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

@Jonathan Eunice

2023-05-11

vicentemanzano6 avatar
vicentemanzano6

Hi all! Do MemoryDB for Redis patch updates cause any downtime?

Alex Jurkiewicz avatar
Alex Jurkiewicz

minimal but yes

2023-05-12

Aadhesh avatar
Aadhesh

Hey everyone. Curious to know if anyone is using Turbonomic as your cloud financial/cost management tool, and how your experience compares to CloudHealth or Cloudability?

Aadhesh avatar
Aadhesh

Turbonomic says it has automated execution actions for rightsizing instances. But how does it manage or sync the state files if the instances are managed through Terraform?

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

I would be extremely skeptical of any tool that does right sizing the “right way” in an IAC environment

1
Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

There’s no one way to do it, and companies have all kinds of strategies.

1
Renesh reddy avatar
Renesh reddy

Hi all

Is there a way to add files (2 files) to ECS Fargate containers? (We are using GitHub for the source code; we're not able to add them to GitHub for security reasons.)

Darren Cunningham avatar
Darren Cunningham

I'd recommend having the service pull what it needs from either Secrets Manager or S3, depending on the size

Renesh reddy avatar
Renesh reddy

How do we pull them via the task definition? Should we update the service via the task definition?

Darren Cunningham avatar
Darren Cunningham

you mean, you want to add two files to a container without making a change to the image you’re pulling?

Renesh reddy avatar
Renesh reddy

yup

Darren Cunningham avatar
Darren Cunningham

might be able to do that with volume mounts then

Renesh reddy avatar
Renesh reddy

Those 2 files are auth files (one is private, one is public)

Renesh reddy avatar
Renesh reddy

Just for adding 2 files, I'm not sure EFS mount points are warranted

Darren Cunningham avatar
Darren Cunningham

you’re fairly constrained as to what options you have if you’re unwilling to update the image

Renesh reddy avatar
Renesh reddy

If I upload these 2 files to S3, how do I tell the task definition that these 2 files should be on the xx/xxx path?

Darren Cunningham avatar
Darren Cunningham

you would update the image that you’re using to have an entrypoint script that pulls the s3 files and then starts the application as it has before

Darren Cunningham avatar
Darren Cunningham

not just update the task def
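
A minimal entrypoint sketch along those lines (the bucket variable and file paths are hypothetical; the task role needs s3:GetObject on those keys):

#!/bin/sh
# fetch the two auth files from S3 at container start, then hand off to the app
set -e
aws s3 cp "s3://${CONFIG_BUCKET}/auth/private.key" /app/auth/private.key
aws s3 cp "s3://${CONFIG_BUCKET}/auth/public.key" /app/auth/public.key
exec "$@"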

Darren Cunningham avatar
Darren Cunningham

that works if you can use an environment variable rather than a file

Matt Gowie avatar
Matt Gowie

Does anyone have strong opinions on how to do AWS Lambda while also managing the infrastructure via Terraform? There are a bunch of options out there, but I've never personally seen an implementation that I liked. My team and I are working on how to do this better and are evaluating the Serverless Framework (CloudFormation), AWS SAM (has TF support, but doesn't look great), and classic "build our own".

Would love to hear someone who has implemented a solution that doesn’t feel disjointed and has strong opinions from real experience!

1
loren avatar

What are the pain points you’ve had? I’ve enjoyed using Anton’s terraform-aws-lambda module

jose.amengual avatar
jose.amengual

we recently had to do a similar assessment and we ended up with SAM

this1
jose.amengual avatar
jose.amengual

the reason behind it was the easy integration and deployment with github actions and easier to understand template options

jose.amengual avatar
jose.amengual

this was only done for VERY basic lambda deploys, which connect to infra created by TF (SQS, SNS, RDS, etc.); if someone was, for example, going to deploy a lambda that required a ton of infra, then that was a good candidate to move to TF-only deploys, hybrid with GitHub Actions

jose.amengual avatar
jose.amengual

that is what we ended up with, and it works so far

jose.amengual avatar
jose.amengual

keep in mind that EVERYTHING else was created with TF ( vpc, subnet, transit gateways, rds, sqs, sns etc)

1
Vlad Ionescu (he/him) avatar
Vlad Ionescu (he/him)

Yeah, I go either for vanilla SAM or for Anton’s Serverless.tf modules (with or without SAM).

Vanilla CloudFormation is a bit too raw, but works if you’re a large company that wants to build something custom. Serverless Framework I would avoid because of their chaotic history and dubious ecosystem.

Everybody else is too new, too risky, or too limited.

Vlad Ionescu (he/him) avatar
Vlad Ionescu (he/him)

Neither option is perfect, so expect some mild annoyances!

Vlad Ionescu (he/him) avatar
Vlad Ionescu (he/him)

Usually:

• a few Lambdas => serverless.tf

• a bunch of Lambdas and complex serverless architectures => SAM

1
Alanis Swanepoel avatar
Alanis Swanepoel

Personally, I prefer building out my stack in CDK, which bootstraps CloudFormation

Alex Jurkiewicz avatar
Alex Jurkiewicz

+1 to SAM. Stick with the first party tooling here

managedkaos avatar
managedkaos

The last team I was on that used IaC and Lambda did a mix of Terraform and SAM.

That is, all the underlying plumbing was deployed using TF.

The devs would use the TF resources as inputs to their SAM deployments.

it worked pretty well.

1
1
Matt Gowie avatar
Matt Gowie

AWESOME stuff – thanks folks! Really appreciating this community right now as I’ve talked or worked with a bunch of you already, so I trust your opinions. We’ll likely try out the TF + SAM route

Joe Perez avatar
Joe Perez

I've used Terraform + GitHub Actions + ECR. I'll be the first to say that the local dev experience isn't great (aka slow), but it's a simple setup: a dev pushes to their feature branch, an artifact is built by GHA and pushed to ECR, then a simple AWS CLI call updates the function code to the dev's container tag. I'm looking to explore SAM + TF; when I first checked it out, I believe SAM could only use resources created by SAM, and the available resource list was a bit small
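
The update step described here can be a single call; a sketch with a hypothetical function name and image tag:

# point the function at the image GHA just pushed for this branch
aws lambda update-function-code \
  --function-name my-service-dev \
  --image-uri 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-service:feature-branch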

2023-05-13

2023-05-14

2023-05-15

aj_baller23 avatar
aj_baller23

Hi all, I was wondering if anyone has experience with setting up Control Tower on an existing AWS account that's part of AWS Organizations. I want to separate our current environments into their own accounts and implement Terraform moving forward. I want to make sure I don't affect our current environment during this process. Any advice would be awesome. Thanks!

Hao Wang avatar
Hao Wang

Wouldn't recommend using Control Tower; CT is like a black box, hard to troubleshoot

Hao Wang avatar
Hao Wang

Maybe it is just myself

tommy avatar

@Hao Wang what solution do you recommend for multi-account organizations? I am using Control Tower to provision new accounts under an AWS organization; it really is not convenient, but I don't know of other solutions.

tommy avatar

Thanks

tommy avatar

@aj_baller23 if your envs are under the root account, enrolling in Control Tower will not affect them. Then you can create sub-accounts to host separate envs.

tommy avatar

Cloud Posse doesn't use Control Tower either; they provision new accounts with their AWS components.

2023-05-16

managedkaos avatar
managedkaos

The Control Tower question got me thinking about another thing I’ve been wondering about for some time:

For SMBs, how do you manage the root account credentials for a multi-account organization?

That is, given a single AWS account that will be used to spawn off sub accounts, how do you govern access to the root email address and the 2FA keys associated with the account?

I’m specifically looking at this from the perspective of a small business or sole proprietorship that needs to keep things secure but also ensure business continuity.

Darren Cunningham avatar
Darren Cunningham

this becomes even trickier for remote teams. but here are the two practices that I’ve seen:

most secure practice - hardware tokens issued to the people who need access and delivered via signature confirmation snail mail. core member provisions the hardware token and dispatches them, token is removed from the root account if they leave the org.

a more manageable (IMO) practice - software 2FA (e.g. Vault or 1Password) in a shared vault, password is rotated if a core member leaves the org

1
managedkaos avatar
managedkaos
Best practices for the management account - AWS Organizations

Describes best practices related to the management account in AWS Organizations.

Darren Cunningham avatar
Darren Cunningham

ah yeah the group email address is another really good point

1
Kyle Johnson avatar
Kyle Johnson

Email group specifically for the root user. Password in 1Password or similar. I set up yubikeys and mail them; only share the password on confirmation of receipt. (What Darren said.)

When someone leaves, deactivation is as simple as removing the yubikey MFA from AWS (now that they support multiple!). Avoids the need to rotate passwords. Keeps MFA as a piece of hardware which I like:

• I don’t have to think about some case where your password vault somehow gets compromised. (Hard, but not impossible.)

• Having to go find your physical root MFA key should hopefully make folks think a lot harder about what they need it for, vs 1P or similar mindlessly autofilling a TOTP.

Darren Cunningham avatar
Darren Cunningham

the multiple hardware keys is essential to this. that was the biggest blocker and why we used the 1P option for so long.

2
1
Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Second everyone else. We recommend a google group. We used 1P and shared TOTP, but now multiple hardware keys are available. And with 1P we use geofencing.

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)

Also, we recommend resetting the master credentials on all member accounts and enabling MFA on all root credentials for member accounts.

Darren Cunningham avatar
Darren Cunningham

can you share more information about the 1P geofencing? I wasn’t aware of that being an option and would like to explore that a bit more

Erik Osterman (Cloud Posse) avatar
Erik Osterman (Cloud Posse)
Create firewall rules in 1Password Business

Learn how to use Advanced Protection to restrict where your team can access 1Password.

2
Darren Cunningham avatar
Darren Cunningham

thanks

2023-05-17

Brent G avatar
Brent G

Anybody know how to purge-with-fire unattended-upgrades in cloud-init? Because that satan of a package is blocking cloud-init itself from installing packages. I tried throwing a systemctl stop/mask into a cloud-boothook section, but that just threw errors.

Hao Wang avatar
Hao Wang

unattended-upgrades will be set up beforehand, and cloud-init should just run once?

Hao Wang avatar
Hao Wang

not sure if I understand

Brent G avatar
Brent G

So, deploying a new Ubuntu box and specifying packages to install with cloud-init: cloud-init gets started, but then Ubuntu's unattended-upgrades script kicks off, which blocks cloud-init because it grabs the apt lock.

Brent G avatar
Brent G

I think I got it set up with bootcmd to kill off the service, so we'll see

1
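
Something like this in the user data, for instance; a sketch (bootcmd runs early on every boot, before cloud-init's package installation):

#cloud-config
bootcmd:
  # stop and mask unattended-upgrades before it can grab the apt lock
  - systemctl stop unattended-upgrades || true
  - systemctl mask unattended-upgrades
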
managedkaos avatar
managedkaos

using cookup.ai to generate IAM policies….

Input: give permission to read from the ecr repo named container123456, and permission to start and stop ec2 instances in the us-east-1 region.

Result: Here is the AWS IAM Policy that corresponds to the plaintext description you provided:

{
  "Version":"2012-10-17",
  "Statement":[
    {
      "Effect":"Allow",
      "Action":[
        "ecr:GetAuthorizationToken",
        "ecr:BatchCheckLayerAvailability",
        "ecr:GetDownloadUrlForLayer",
        "ecr:GetRepositoryPolicy",
        "ecr:DescribeRepositories",
        "ecr:ListImages",
        "ecr:DescribeImages",
        "ecr:BatchGetImage"
      ],
      "Resource":[
        "arn:aws:ecr:us-east-1:123456789012:repository/container123456"
      ]
    },
    {
      "Effect":"Allow",
      "Action":[
        "ec2:StartInstances",
        "ec2:StopInstances"
      ],
      "Resource":"arn:aws:ec2:us-east-1:123456789012:instance/*"
    }
  ]
}

good response but I had to format the code myself…

Vlad Ciobancai avatar
Vlad Ciobancai

I’m using https://flosell.github.io/iam-policy-json-to-terraform/ to format the policy to tf hcl

iam-policy-json-to-terraform - Easily convert AWS IAM policies to Terraform HCL

This tool converts standard IAM policies in JSON format (like what you’d find in the AWS docs) into more terraform native aws_iam_policy_document data source code

1
1
managedkaos avatar
managedkaos

I prefer keeping my IAM JSON as JSON and using jsonencode in TF. Makes it easier to bring existing policies in… or update my policies in TF when people tweak them outside of IaC
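
For example, a sketch of the jsonencode approach (the resource name is hypothetical; the policy body is the generated one from above):

resource "aws_iam_policy" "ec2_start_stop" {
  name = "ec2-start-stop"
  # jsonencode keeps the document shaped like plain IAM JSON, so policies can be
  # pasted in from (or diffed against) whatever exists in the console
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["ec2:StartInstances", "ec2:StopInstances"]
      Resource = "arn:aws:ec2:us-east-1:123456789012:instance/*"
    }]
  })
}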

managedkaos avatar
managedkaos

but I guess using these two tools, you can describe a policy in plaintext and end up with native TF.

sounds like the process needs a wrapper…

Vlad Ciobancai avatar
Vlad Ciobancai

makes sense

managedkaos avatar
managedkaos

but TF policies are easier to read, so… it's a toss-up

1
Alex Jurkiewicz avatar
Alex Jurkiewicz

if I was being strict, GetRepositoryPolicy is probably not required

Alex Jurkiewicz avatar
Alex Jurkiewicz

actually, there are some other errors. GetAuthorizationToken requires permissions against the * resource, right?
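
He's right: ecr:GetAuthorizationToken is only valid against the * resource, so a stricter policy would split it into its own statement, roughly:

{
  "Effect":"Allow",
  "Action":"ecr:GetAuthorizationToken",
  "Resource":"*"
}

with the remaining ECR read actions left scoped to the repository ARN as before.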

Anup Dubey avatar
Anup Dubey

Hi all, I wanted to initiate a discussion regarding our Kubernetes platform and how we handle services running on Kubernetes with 100% spot instances. Specifically, I'd like to address a scenario where a service runs with two pods in a production environment, each deployed on a different spot node. If one node experiences a spot interruption and its pod is rescheduled to another node, and that second node gets interrupted at the same time so the other pod is also rescheduled and still initializing, we encounter an outage with both pods in an "initializing" state. Is anyone aware of how to take care of this while running 100% on spot?

Alex Jurkiewicz avatar
Alex Jurkiewicz

you can’t completely mitigate this risk if you run on 100% spot.

If the risk is unacceptable, you can:

  1. Run more pods, on more instance types, across more AZs to minimise the risk (see the sketch after this list), or
  2. run some portion of your workload on on-demand/reserved instances
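
A sketch of option 1 at the pod level, spreading replicas across zones (all names are hypothetical):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-service
spec:
  replicas: 4   # more than 2, so a single interruption can't take out half the capacity
  selector:
    matchLabels:
      app: my-service
  template:
    metadata:
      labels:
        app: my-service
    spec:
      # spread pods evenly across availability zones
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: my-service
      containers:
        - name: app
          image: example/app:latest
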
tommy avatar

Or you could try Spot Ocean, which helps manage the node lifecycle. You can treat spot nodes as if they were normal ones.

Alex Jurkiewicz avatar
Alex Jurkiewicz

the vmware product? It can’t prevent spot interruptions. Sometimes many instances will be interrupted at once

tommy avatar

It is a node-pool scheduler, backed by spot, on-demand and reserved instances, but for k8s it is exposed as a scaling-group-like resource.

tommy avatar

It will mix different kinds of instances to prevent the worst situation.

Hao Wang avatar
Hao Wang

neat

1

2023-05-18

2023-05-19

2023-05-22

underplank avatar
underplank

Hi all. I'm just starting to use the Cloud Posse module for EKS clusters. Really liking it so far. Currently I have a Bitbucket pipeline that uses OIDC to assume a role in AWS to run the Terraform. That role has the administrator policy. I've enabled the aws-auth ConfigMap, put that role inside the ConfigMap, and attached it to the "cluster-admin" group, which I assume has full powers to update anything cluster-wide. So my Terraform looks like this:

  map_additional_iam_roles = [
    {
      rolearn  = "arn:aws:iam::***:role/workers"
      username = "system:node:{{EC2PrivateDNSName}}"
      groups   = ["system:bootstrappers", "system:nodes"]
    },
    {
      rolearn  = data.aws_iam_role.AdminRole.arn
      username = "admin-user"
      groups   = ["cluster-admin"]
    },
    {
      rolearn  = aws_iam_role.infrastructure-management.arn
      username = "pipeline"
      groups   = ["cluster-admin"]
    }
  ] 
underplank avatar
underplank

This works well when I use data.aws_iam_role.AdminRole.arn to log in from my command line, with temporary creds generated through AWS SSO. However, when I use aws_iam_role.infrastructure-management.arn it fails:

underplank avatar
underplank
Planning failed. Terraform encountered an error while generating this plan.
╷
│ Error: configmaps "aws-auth" is forbidden: User "pipeline" cannot get resource "configmaps" in API group "" in the namespace "kube-system"
│ 
│   with module.workload_1.kubernetes_config_map.aws_auth[0],
│   on .terraform/modules/workload_1/auth.tf line 138, in resource "kubernetes_config_map" "aws_auth":
│  138: resource "kubernetes_config_map" "aws_auth" {
│ 
╵
underplank avatar
underplank

Ahh, I think I worked it out. I need to use "system:masters" as the group, because cluster-admin is a ClusterRoleBinding, not a group. The issue is I don't know how to work out what groups there actually are.
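
One way to see which groups are actually in use is to scan the ClusterRoleBindings for Group subjects; a sketch with kubectl and jq:

# list every group referenced by a ClusterRoleBinding
kubectl get clusterrolebindings -o json \
  | jq -r '.items[].subjects[]? | select(.kind == "Group") | .name' \
  | sort -u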

Gabriela Campana (Cloud Posse) avatar
Gabriela Campana (Cloud Posse)

@Dan Miller (Cloud Posse)

Dan Miller (Cloud Posse) avatar
Dan Miller (Cloud Posse)

this thread is 7 months old. Do you still have a question?

1

2023-05-23

vicentemanzano6 avatar
vicentemanzano6

Hi! I am having issues enrolling an AWS Account under control tower. I am receiving an error saying

AWS Control Tower cannot enroll the account. There's an error in the provisioned product in AWS Service Catalog: ProvisionedProduct with Name: null and Id: *********** doesn't exist

Is there any way to reconcile control tower and service catalog to create a new product when trying to enroll the account?

Hao Wang avatar
Hao Wang

CT is hard to use because it is like a black box. It feels like you may not have enough permissions, or the product doesn't exist

Alex Jurkiewicz avatar
Alex Jurkiewicz

this sounds like a great question for aws support

kallan.gerard avatar
kallan.gerard

You’re in for a fun time

kallan.gerard avatar
kallan.gerard

To be honest I don’t recommend it. If you’re not too deep already I’d back out.

kallan.gerard avatar
kallan.gerard

The trouble you're experiencing now is not an isolated incident. This class of problems will be a continuous occurrence with CT

jonjitsu avatar
jonjitsu

@kallan.gerard Do you have any recommendations for an alternative to CT?

kallan.gerard avatar
kallan.gerard

Hi @vicentemanzano6 I would probably just use terraform with the aws provider resources and the aws organization primitives

kallan.gerard avatar
kallan.gerard

Then expand from there

kallan.gerard avatar
kallan.gerard

I typically use an admin/workload tf pattern for this sort of thing. I’m not sure what to call it.

kallan.gerard avatar
kallan.gerard

One terraform config would be for the organization root account, and would contain things like the aws_organizations_account resources for your member accounts, and the initial seed config

kallan.gerard avatar
kallan.gerard

Then each member account would have its own Terraform governance config directory, which you configure with a governance account-admin role inside each account, and you import a common stack module (or modules) inside each config for any resources you want each account to have

kallan.gerard avatar
kallan.gerard

So like:

/
  /admin              # the config directory that runs in the org root account
                      # provisions org member accounts, trust access for those accounts, whatever else you need etc
    main.tf
    …
  /business-units
    …
    /sales
    /engineering
      /aws-account-analytics-dev
        main.tf       # call stack module
      /aws-account-analytics-dev
        main.tf
  /modules
    /stack
      main.tf         # module you want in every account
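
A minimal sketch of the root-account side of that pattern (the account name, email, and role name are hypothetical):

# in the /admin config, applied to the organization root account
resource "aws_organizations_account" "engineering_dev" {
  name      = "engineering-dev"
  email     = "aws+engineering-dev@example.com"
  role_name = "OrganizationAccountAccessRole" # role the per-account governance config assumes
}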

2023-05-24

2023-05-29
