#aws (2023-09)
Discussion related to Amazon Web Services (AWS)
Archive: https://archive.sweetops.com/aws/
2023-09-01

Hi, what do I need to read up on to use autodiscovery (or constant names) in an ECS cluster? I need services to talk to each other, so service A has IP 10.0.1.2 and service B has 10.0.1.4; using those IP addresses to refer to each other will only work until a task is restarted.


are you using host, bridge, or awsvpc networking mode?

awsvpc

Then I believe this is going to be the recommended solution: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/service-discovery.html
Your Amazon ECS service can optionally be configured to use Amazon ECS service discovery. Service discovery uses AWS Cloud Map API actions to manage HTTP and DNS namespaces for your Amazon ECS services. For more information, see What Is AWS Cloud Map?

and then you would use DNS to reference the other service

As far as I understand, ECS Service Connect is a more modern approach compared to Cloud Map. Also, with Cloud Map you can run into situations where it returns an empty list of IP addresses during an ECS service deployment. As far as I understand, ECS Service Connect addresses that.

https://docs.aws.amazon.com/AmazonECS/latest/developerguide/service-connect.html
Amazon ECS Service Connect creates AWS Cloud Map services in your account

it uses Cloud Map for private DNS, yes. But it also deploys Envoy proxies that, AFAIU, handle DNS resolution/traffic routing and do retries if necessary
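For reference, a minimal Terraform sketch of turning on Service Connect for a service (the namespace, port name, aliases, and referenced cluster/task definition/security group are placeholders, not from this thread, and the exact block layout should be checked against the AWS provider docs):

```hcl
# Hypothetical sketch: ECS Service Connect so other services can call
# "service-b.internal" instead of task IPs. Referenced resources are assumed.
resource "aws_service_discovery_http_namespace" "this" {
  name = "internal"
}

resource "aws_ecs_service" "service_b" {
  name            = "service-b"
  cluster         = aws_ecs_cluster.this.id            # assumed to exist
  task_definition = aws_ecs_task_definition.service_b.arn
  desired_count   = 2

  network_configuration {
    subnets         = var.private_subnet_ids
    security_groups = [aws_security_group.service_b.id]
  }

  service_connect_configuration {
    enabled   = true
    namespace = aws_service_discovery_http_namespace.this.arn

    service {
      # must match a port name declared in the task definition's port mappings
      port_name      = "http"
      discovery_name = "service-b"

      client_alias {
        port     = 80
        dns_name = "service-b.internal" # other services call http://service-b.internal
      }
    }
  }
}
```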

Hey guys, any comments on this one?

“The IP Address Dilemma: Could cost you an additional $180 per month on an average”

“good” there’s my comment

“The IP Address Dilemma: Could cost you an additional $180 per month on an average”

awesome

I just wish AWS ipv6 support was at a place where ipv4 wasn’t necessary. Definitely gonna be shutting down things in my own personal account that were handy, but won’t justify the cost anymore (I could run them for almost nothing in a public subnet)
See also: https://github.com/DuckbillGroup/aws-ipv6-gaps/pull/6
2023-09-02

Hi, just wondering if anyone is aware of a Terraform / CloudFormation template that has CIS 1.4 monitoring configured?

@Matt Calhoun
2023-09-03
2023-09-07
2023-09-08

Hi ALL,
We have two GitHub repos, and one repo depends on the other. The first repo is containerised, and the second repo is just a file system. For the second repo I have created an image and pushed it to ECR, so now I can use the second repo's file system in the first repo.
My question is: whenever there is a commit to the second repo, the first repo's CodePipeline should run, right?
(I have created a CloudWatch event and copied the ARN into the GitHub webhook, but it is not working.)
Any solutions?

@Matt Calhoun @Dan Miller (Cloud Posse)

@Max Lobur (Cloud Posse) @Igor Rodionov

Hi @Renesh reddy I’m not sure what the question is. Are you asking how to trigger the first repo’s pipeline from the second repo’s pipeline?


Hi all,

Guys, I have a small problem with the SES service in AWS. Previously, I created a configuration set named ‘xxx,’ and even though it has been deleted, I often receive an error when sending an email. The error message says, “Configuration set xxx doesn’t exist,” and it includes error code 554. Do any of you have ideas on how to solve such issues?

@Andriy Knysh (Cloud Posse)

we have an SES module https://github.com/cloudposse/terraform-aws-ses, and an example of how to use it https://github.com/cloudposse/terraform-aws-ses/tree/main/examples/complete, which gets deployed to AWS on every PR. I don’t know about the issue described above, but you can review the module

I resolved my issue. The problem was that even if you delete your configuration set, if it is set as the default in the global settings you will still get an error, because SES tries to find it.
2023-09-11

for providing access to S3 resources, is there a good rule of thumb for when to add a policy to the principal vs. when to add a policy to the bucket? e.g. comparing the following two:
1) attach a policy to role to allow it to access an s3 bucket 2) attach a policy to the bucket to allow the role to access it

My rule of thumb: If the bucket and the role are in the same account, only the role policy is necessary. The bucket will by default allow the role. If different accounts, then both must allow access.

makes sense, thanks!

I think it also depends on cardinality. If you have several roles that need to access the bucket, it might be easier to attach the policy to the bucket and have some rules to include them all. On the other hand, if it is only one role and several buckets, it might be easier to configure the policy on the role and find a way to include all the buckets in the rules.

smart thinking! I wish there was a wiki that had tips like this
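For reference, a minimal Terraform sketch of the two approaches discussed above (bucket name, role, and actions are made up for illustration):

```hcl
# Option 1: same-account access -- attach an identity-based policy to the role.
data "aws_iam_policy_document" "reader" {
  statement {
    actions = ["s3:GetObject", "s3:ListBucket"]
    resources = [
      "arn:aws:s3:::example-bucket",
      "arn:aws:s3:::example-bucket/*",
    ]
  }
}

resource "aws_iam_role_policy" "reader" {
  role   = aws_iam_role.app.id # assumed to exist
  policy = data.aws_iam_policy_document.reader.json
}

# Option 2: cross-account access (or many roles) -- also attach a resource
# policy to the bucket that names the principal(s).
data "aws_iam_policy_document" "bucket" {
  statement {
    principals {
      type        = "AWS"
      identifiers = [aws_iam_role.app.arn] # or a role ARN from another account
    }
    actions = ["s3:GetObject", "s3:ListBucket"]
    resources = [
      "arn:aws:s3:::example-bucket",
      "arn:aws:s3:::example-bucket/*",
    ]
  }
}

resource "aws_s3_bucket_policy" "bucket" {
  bucket = "example-bucket"
  policy = data.aws_iam_policy_document.bucket.json
}
```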

2023-09-12
2023-09-13

Hi all, need some suggestion here. I have ec2 instances in ASG and prometheus exporters configured. can I send the prometheus metrics to cloudwatch and then use Grafana for visualization?

You can use cw as a data source. https://grafana.com/docs/grafana/latest/datasources/aws-cloudwatch/ Extra: Anywhere cloudwatch is involved, consider if you need near real time metrics/logs. I find the service exceedingly latent all too often. Maybe it’s just me.

@Alex Atkinson Thank you. You mean the latency is on the higher side with CloudWatch?

My concern is actually on sending the Prometheus metrics to cloudwatch.

Yes, anything that goes into Cloudwatch has the potential to be visible either immediately or only after much time, in my experience. They can’t even identify their own outages in a timely manner. I rely on Twitter for that.

thanks for the info. Good to know that. I’m not concerned about a few minutes of latency as the dashboards are used internally.

Hi guys, maybe this is not the right channel, but I am looking for help with an Nginx problem.
I am hosting an instance of Grafana, and I have configured my Nginx to ask for an authentication request. It is working fine with desktop browsers; my problem is only with mobile browsers, as they ask for a login on almost every request, roughly every 3 seconds.
I am getting this error with different mobile browsers,
Below my Nginx configuration:
server {
    listen 3000;
    server_name localhost;
    auth_basic "Enter password!";
    auth_basic_user_file /etc/nginx/conf.d/.htpasswd;

    location / {
        proxy_pass http://grafana;
        proxy_set_header Host $http_host;
    }

    location /backup/ {
        proxy_pass http://flask/;
        proxy_read_timeout 600s;
    }
}
I have tried everything, but the issue remains

@Matt Calhoun

hello all, I have the following iam role:
{
"Statement": [
{
"Action": [
"s3:GetReplicationConfiguration",
"s3:ListBucket",
"s3:PutInventoryConfiguration"
],
"Effect": "Allow",
"Resource": [
"arn:aws:s3::: sourcebucket"
]
},
{
"Action": [
"s3:GetObjectVersion",
"s3:GetObjectVersionAcl",
"s3:GetObjectVersionForReplication",
"s3:GetObjectVersionTagging",
"s3:PutInventoryConfiguration"
],
"Effect": "Allow",
"Resource": [
"arn:aws:s3::: sourcebucket/*"
]
},
{
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:GetObjectVersion",
"s3:GetObjectTagging",
"s3:GetBucketLocation"
],
"Resource": [
"arn:aws:s3:::destinationbucket/*"
]
},
{
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:GetObjectVersion",
"s3:GetBucketLocation"
],
"Resource": [
"arn:aws:s3:::reportbucket/*"
]
},
{
"Effect": "Allow",
"Action": [
"s3:PutObject",
"s3:GetBucketLocation"
],
"Resource": [
"arn:aws:s3:::reportbucket/*"
]
},
{
"Sid": "AllowS3ReplicationSourceRoleToUseTheKey",
"Effect": "Allow",
"Action": [
"kms:GenerateDataKey",
"kms:Encrypt",
"kms:Decrypt"
],
"Resource": "*"
},
{
"Action": [
"s3:GetBucketVersioning",
"s3:PutBucketVersioning",
"s3:ReplicateObject",
"s3:ReplicateTags",
"s3:ReplicateDelete",
"s3:PutObject",
"s3:PutObjectAcl",
"s3:PutObjectTagging",
"s3:ObjectOwnerOverrideToBucketOwner"
],
"Effect": "Allow",
"Resource": [
"arn:aws:s3:::sourcebucket",
"arn:aws:s3:::destinationbucket",
"arn:aws:s3:::sourcebucket/*",
"arn:aws:s3:::destinationbucket/*"
]
}
],
"Version": "2012-10-17"
}
I have velero backups in region A encrypted with kms and I would like to replicate them to another region. thanks

and the error you receive is…?

I did not set up SNS to get the error. When I try Batch Operations, the error is:
Error occurred when preparing manifest: Access denied when accessing arn:aws:s3:::sourcebucket. s3:PutInventoryConfiguration required for the role.

in the replication status I only see that it failed

good ticket to open with AWS. It sounds entirely related to their tech

ok thanks
2023-09-14

Hey folks, anybody have an example of setting up a VPC with dynamic subnets for a lambda with access to an RDS instance in the private subnet? I have this so far: https://gist.github.com/drmikecrowe/5c8b3bead3536f77511137417f15db39 No matter what I do, I can’t seem to get the routing to allow the lambdas in the public subnets to reach internet (and AWS) services.

AFAIK Lambda functions cannot have public IPs so they cannot route to the internet without a NAT (gateway or instance). Put them in a private subnet, ensure the private subnet’s default route is a NAT in a public subnet, that the NAT has a public IP, and that the VPC has an IGW.
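A rough Terraform sketch of that wiring, using the cloudposse vpc/dynamic-subnets modules (the CIDR, AZs, runtime, and the referenced IAM role and security group are assumptions, not taken from the gist):

```hcl
# Sketch (assumptions, not the poster's gist): Lambda ENIs in private subnets,
# whose default route points at a NAT gateway living in the public subnets.
module "vpc" {
  source                  = "cloudposse/vpc/aws"
  version                 = "2.1.0"
  ipv4_primary_cidr_block = "10.0.0.0/16"
  context                 = module.this.context # assumes the usual cloudposse context.tf
}

module "subnets" {
  source              = "cloudposse/dynamic-subnets/aws"
  version             = "2.4.1"
  availability_zones  = ["us-east-1a", "us-east-1b"]
  vpc_id              = module.vpc.vpc_id
  igw_id              = [module.vpc.igw_id]
  ipv4_cidr_block     = [module.vpc.vpc_cidr_block]
  nat_gateway_enabled = true # NAT gateway lives in the public subnets
  context             = module.this.context
}

resource "aws_lambda_function" "app" {
  function_name = "example"
  role          = aws_iam_role.lambda.arn # assumed; needs AWSLambdaVPCAccessExecutionRole
  runtime       = "python3.11"
  handler       = "index.handler"
  filename      = "lambda.zip"

  vpc_config {
    # Private subnets only: the Lambda ENIs never get public IPs, so outbound
    # traffic flows private subnet -> NAT gateway -> IGW.
    subnet_ids         = module.subnets.private_subnet_ids
    security_group_ids = [aws_security_group.lambda.id] # assumed
  }
}
```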
2023-09-15

Hi. I am working on a relatively new AWS org with a few accounts. The company only has a few projects/resources running in these accounts ATM. I just joined the company and have the opportunity to build out the AWS setup with Terraform. I was planning to carve out a big chunk of the 10/8 (RFC 1918) block for AWS, but one of the networking guys is really pushing for us to use 100.64 (RFC 6598) exclusively. I see where 100.64 is allowed and implemented in AWS for non-routable resources like EKS pods. I see where Netflix has used it, but they were still using some of 10/8 too.
Do any of you have any experience running AWS enterprise-level networking with transit gateways and such using only 100.64/10? I do not want to agree to using only 100.64 and discover some caveat later where we have to hack in some additional NATting or whatever to communicate with other networks/services.

@Max Lobur (Cloud Posse) @Igor Rodionov

Good question, I think @Jeremy G (Cloud Posse) might have a suggestion if he’s following these. I just scanned through our variable history and did not find any use of 100.64, however there might be older cases I’m not aware of. It also might be worth filing a support case with AWS about this

The 100.64/10 block is set aside for ISP-level NAT, in particular NAT64. IIRC it was, at one point, used by Kubernetes and/or kops for the cluster’s internal address space. For EKS, they do not use 100.64/10.
At Cloud Posse, we always deploy VPCs into 10/8 and have not had a problem with that. I am not specifically aware of problems using 100.64/10 but I would avoid it unless you are doing carrier-grade NAT.

@Vlad Ionescu (he/him) We’ve noticed with some folks that Identity Center is starting to display the challenge code and ask users to verify it matches. Are you aware of anything that AWS might have unofficially updated? It feels like a stealth update was applied for security.

i figure they’re trying to implement suggestions for mitigating device code phishing attacks…
One suggestion is to display the code during the authorization flow and ask the user to verify that the same code is currently being displayed on the device they are setting up
https://ramimac.me/aws-device-auth

@Andrea Cavagna hearing any reports like these

I haven’t seen any official communications about this from AWS

We noticed that this week; we will look at it, and I think in Leapp we are going to adapt by showing the AWS SSO code


Is that the interaction you will expect from a workflow like that?

That should work just fine. Thanks @Andrea Cavagna

Just released the fix for that, how is it working?

@Noel Jackson @Jeremy White (Cloud Posse)
2023-09-16

Hello everyone,
We are encountering a peculiar issue with our Windows 2012 R2 EC2 instance. We have installed the AWS CLI for all users. However, when attempting to use the AWS command in one of the user accounts, it fails to open properly, prompting an application error. As a workaround, we have been using the administrator account to execute AWS commands.
Additionally, we have scheduled jobs responsible for transferring files between the local system and Amazon S3. These jobs sporadically run successfully but often fail. It’s worth noting that we are operating behind a proxy.
I would greatly appreciate your suggestions on resolving this issue

@Jeremy White (Cloud Posse)

The CLI depends on several things. You might try using AWS Systems Manager. AWS-InstallPackages can take an MSI URL, so you could use

https://s3.amazonaws.com/aws-cli/AWSCLI64.msi

Here’s the relevant documentation: https://docs.aws.amazon.com/systems-manager/latest/userguide/walkthrough-powershell.html#walkthrough-powershell-install-application
Use the Tools for Windows PowerShell to view information about commands and command parameters, run commands, and view the status of those commands.

That example is using powershell locally. You can also run these in AWS Systems Manager via console. Then you could select which ec2’s to execute on.

The run-command console is here: https://us-east-1.console.aws.amazon.com/systems-manager/run-command?region=us-east-1 (change the region according to your fleet)

If you’re interested in automating that via IaC, then terraform allows you to create an SSM association: https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/ssm_association

You would just associate your windows EC2’s with the install document and add parameters to download the msi.

So, this gets to the next question: syncing to S3. You should be able to create a document that runs aws s3 commands on a schedule, and AWS SSM will not only run the commands for you, it will report back failures so you can better observe when calls aren’t successful.

To do so using terraform, you can create a document with the aws_ssm_document resource

The document would just need to contain relevant commands. Since it can be tricky to write a full script in the SSM Document spec, make sure to consider using a separate script file if you find it too cumbersome.
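A hedged sketch of that pattern (the document content, folder path, bucket, tag target, and schedule below are placeholders for illustration, not from this thread):

```hcl
# Sketch: a Command document that syncs a local folder to S3, associated with
# tagged Windows instances on a schedule. Paths, tags, and bucket are made up.
resource "aws_ssm_document" "s3_sync" {
  name          = "sync-reports-to-s3"
  document_type = "Command"

  content = jsonencode({
    schemaVersion = "2.2"
    description   = "Sync local reports folder to S3"
    mainSteps = [{
      action = "aws:runPowerShellScript"
      name   = "syncToS3"
      inputs = {
        runCommand = [
          "aws s3 sync C:\\reports s3://example-bucket/reports/"
        ]
      }
    }]
  })
}

resource "aws_ssm_association" "s3_sync" {
  name                = aws_ssm_document.s3_sync.name
  schedule_expression = "rate(1 hour)"

  targets {
    key    = "tag:Role"
    values = ["windows-file-server"]
  }
}
```

With this, SSM Run Command records each invocation, so failed syncs show up in the association's execution history instead of silently failing in Task Scheduler.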

Hi @Jeremy White (Cloud Posse) thanks for your response. We do have this setup in our infra, but in this scenario the application team should see what is happening in their Task Scheduler. I tried installing a new CLI version, but it doesn’t work; we are still facing the issue (sometimes the job runs and sometimes it doesn’t). FYI, we have upgraded the instances to a larger size.
2023-09-17

Some older messages are unavailable. Due to the retention policies of an organization in this channel, all their messages and files from before this date have been deleted.
2023-09-18
2023-09-20

Hey folks, need some help with this issue: I have managed to get the custom metrics from Prometheus into CloudWatch and configured the CloudWatch datasource in Grafana. I can see the namespace in the panel but I don’t see any metrics in it. However, in the CloudWatch console there are 1000+ metrics available under the same namespace. Thanks

which document did you follow? seems it is a permission issue and the metric values are not delivered to AWS

metrics from the aws services namespaces are coming through but not from the custom metrics namespaces.

i used this https://grafana.com/docs/grafana/latest/datasources/aws-cloudwatch/ and https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-Agent-PrometheusEC2.html
How to use the CloudWatch agent to collect Prometheus metrics from Amazon EC2 instances.

did you set up metric_namespace?

yes.

Namespaces of Custom Metrics
Grafana can't load custom namespaces through the CloudWatch GetMetricData API.
To make custom metrics appear in the data source's query editor fields, specify the names of the namespaces containing the custom metrics in the data source configuration's Namespaces of Custom Metrics field. The field accepts multiple namespaces separated by commas.

this is the namespace.

and there are many metrics inside

you may find some errors in the shipper’s pod

metrics are synced but values are not

also given the namespace in the grafana datasource configuration


metrics are synced but values are not - how do we check this? is there any doc for this please

are you seeing any metric values in the namespace?

yes…

i sent you a screenshot above, that is from cloudwatch

i can see the metrics thre

so the only issue is the metrics go into a wrong namespace?

hmm, no. in the grafana dashboard there are no datapoints, nor are any metrics displayed

Let us take a look at the logs of pods then

I’m not using kubernetes.. the agent is configured on the ec2 instance

and the agent sends the metrics to cloudwatch logs, so I see there are metrics available in cloudwatch logs. Then from cloudwatch logs they are sent to cloudwatch metrics - that is also successful as I see the metrics in cloudwatch metrics as well.
Now the configuration is between grafana and cloudwatch metrics - the problem lies here,

Oh sorry, I meant finding logs on the system

We are missing logs of the agent

ok let me check them


2023/09/21 14:54:46 I! Detected runAsUser: root
2023/09/21 14:54:46 I! Changing ownership of [/opt/aws/amazon-cloudwatch-agent/logs /opt/aws/amazon-cloudwatch-agent/etc /opt/aws/amazon-cloudwatch-agent/var] to 0:0
2023-09-21T14:54:46Z I! Starting AmazonCloudWatchAgent CWAgent/1.300026.3b189 (go1.20.7; linux; amd64)
2023-09-21T14:54:46Z I! AWS SDK log level not set
2023-09-21T14:54:46Z I! Creating new logs agent
2023-09-21T14:54:46Z I! [logagent] starting
2023-09-21T14:54:46Z I! [logagent] found plugin cloudwatchlogs is a log backend
2023-09-21T14:54:46.879Z info service/telemetry.go:96 Skipping telemetry setup. {"address": "", "level": "None"}
2023-09-21T14:54:46.879Z info service/service.go:131 Starting ... {"Version": "", "NumCPU": 2}
2023-09-21T14:54:46.879Z info extensions/extensions.go:30 Starting extensions...
2023-09-21T14:54:46.879Z info service/service.go:148 Everything is ready. Begin running and processing data.
2023-09-21T14:55:25Z I! Drop Prometheus metrics with unsupported types. Only Gauge, Counter and Summary are supported.
2023-09-21T14:55:25Z I! Please enable CWAgent debug mode to view the first 1000 dropped metrics
2023-09-21T14:55:26.883Z info [email protected]/emf_exporter.go:101 Start processing resource metrics {"kind": "exporter", "data_type": "metrics", "name": "awsemf/prometheus", "labels": {}}
2023-09-21T14:55:27.038Z info [email protected]/pusher.go:305 logpusher: publish log events successfully. {"kind": "exporter", "data_type": "metrics", "name": "awsemf/prometheus", "NumOfLogEvents": 185, "LogEventsSize": 68.4580078125, "Time": 40}
2023-09-21T14:55:27.198Z info [email protected]/emf_exporter.go:154 Finish processing resource metrics {"kind": "exporter", "data_type": "metrics", "name": "awsemf/prometheus", "labels": {}}
2023-09-21T14:56:27.205Z info [email protected]/emf_exporter.go:101 Start processing resource metrics {"kind": "exporter", "data_type": "metrics", "name": "awsemf/prometheus", "labels": {}}
2023-09-21T14:56:27.281Z info [email protected]/pusher.go:305 logpusher: publish log events successfully. {"kind": "exporter", "data_type": "metrics", "name": "awsemf/prometheus", "NumOfLogEvents": 438, "LogEventsSize": 161.5947265625, "Time": 60}
2023-09-21T14:56:27.421Z info [email protected]/emf_exporter.go:154 Finish processing resource metrics {"kind": "exporter", "data_type": "metrics", "name": "awsemf/prometheus", "labels": {}}
2023-09-21T14:57:27.428Z info [email protected]/emf_exporter.go:101 Start processing resource metrics {"kind": "exporter", "data_type": "metrics", "name": "awsemf/prometheus", "labels": {}}
2023-09-21T14:57:27.497Z info [email protected]/pusher.go:305 logpusher: publish log events successfully. {"kind": "exporter", "data_type": "metrics", "name": "awsemf/prometheus", "NumOfLogEvents": 438, "LogEventsSize": 161.5693359375, "Time": 54}
2023-09-21T14:57:27.643Z info [email protected]/emf_exporter.go:154 Finish processing resource metrics {"kind": "exporter", "data_type": "metrics", "name": "awsemf/prometheus", "labels": {}}
2023-09-21T14:58:27.648Z info [email protected]/emf_exporter.go:101 Start processing resource metrics {"kind": "exporter", "data_type": "metrics", "name": "awsemf/prometheus", "labels": {}}
2023-09-21T14:58:27.710Z info [email protected]/pusher.go:305 logpusher: publish log events successfully. {"kind": "exporter", "data_type": "metrics", "name": "awsemf/prometheus", "NumOfLogEvents": 438, "LogEventsSize": 161.5986328125, "Time": 43}
2023-09-21T14:58:27.867Z info [email protected]/emf_exporter.go:154 Finish processing resource metrics {"kind": "exporter", "data_type": "metrics", "name": "awsemf/prometheus", "labels": {}}

I think it’s something related to the AWS permissions. I see this in the raw response console:
{"Message":"error getting accounts for current user or role: access denied. please check your IAM policy: User: arn:aws:sts::298738319810:assumed-role/mobilus-grafana-instance/1695309629321218232 is not authorized to perform: oam:ListSinks on resource: arn:aws:oam:eu-west-2:298738319810:/ListSinks","Error":"access denied. please check your IAM policy: User: arn:aws:sts::298738319810:assumed-role/mobilus-grafana-instance/1695309629321218232 is not authorized to perform: oam:ListSinks on resource: arn:aws:oam:eu-west-2:298738319810:/ListSinks","StatusCode":403}

right, it is

if admin role is attached to the instance, you should be seeing them

then you can figure out what policy is required later

attached the permission, still no luck. my grafana is in the same organisation account, not sure why the error came up.

now the role mobilus-grafana-instance is attached

let us attach admin policy to this role

ok

Hey thanks a lot. I figured out the mistake i was doing.

It’s really a silly mistake - in the datasource namespace there was a space at the start which I missed.

i saw the query and figured it out

awesome

you are welcome

HI @Hao Wang I got stuck on a new issue. Now the metrics are set, but the server keeps crashing every 5 mins. I am sending around 50+ metrics and using t2.medium, and I tried with t3.medium as well.

should be a memory issue, need a more powerful instance

yes. The same metrics and configuration work really well with the standalone Prometheus server, but when it comes to pushing to CloudWatch there are performance issues.
2023-09-21

anyone used, done something like this https://www.youtube.com/watch?v=MKc9r6xOTpk

it makes sense in a lot of ways

this is pretty great and something that’s built up over time. Just thinking about my process of learning and applying to brownfield environments, building in security where we can, but also trying to have flexibility for teams. Definitely not easy

Some weekend reading from AWS : https://docs.aws.amazon.com/wellarchitected/latest/devops-guidance/devops-guidance.html
The AWS Well-Architected Framework DevOps Guidance offers a structured approach that organizations of all sizes can follow to cultivate a high-velocity, security-focused culture capable of delivering substantial business value using modern technologies and DevOps best practices.

AWS Cloudfront question:
• LIMIT: Response headers policies per AWS account : 20
• Desire: Be able to set response headers policies for all common MIME types, allowing devs to upload whatever they want to S3 without having to set the mime-type per file, resulting in the Content-Type header being present for all files served via CloudFront. Considering npm build can spew out who knows what, and who knows how much, uploading per file seems awful, so I’m assuming folks use response headers policies as follows to inject Content-Type headers…. Unless there’s another way (that’s not not using CF).

resource "aws_cloudfront_response_headers_policy" "css" {
name = "css"
custom_headers_config {
items {
header = "Content-Type"
override = true
value = "text/css"
}
}
}

I BASH’d out the configs for the 76 common mime types detailed from Mozilla, which is where I ran into the limit.

You can use Edge Lambdas

… obviously.

I just don’t have much experience with CF. All my CDN experience is with limelite, Akamai, and Fastly.

tyvm

Most libraries that upload to S3 will set the content type correctly. I think it’s better for you to require this on the source object

Hmmm……….. I’m uploading from a GH action. I bet there’s a GH action to do s3 uploads and detect content-type

5 deep and can’t see any mention of content-type…. Maybe not.

If you’re using the AWS cli, this behaviour is built in

I wasn’t seeing that.

I would really recommend fixing the type on upload. You don’t want to be repeatedly recomputing something every download at the edge

I was using aws s3 cp... though. I ended up making a simple map of extensions to mime types and for-eaching through every file.

It’s slow though. And it’s a small project. I could multi-thread the cmds though. Just chop the file list into 4 and have 4 bg tasks going.

so grody

https://docs.aws.amazon.com/cli/latest/reference/s3/cp.html
–content-type (string) Specify an explicit content type for this operation. This value overrides any guessed mime types.

Cli will guess for you

Use xargs or parallel?

…. You know what…. This all started bc of an incorrect mime type error where the file was XML instead of css, so that would have been CF responding with a 404 error.

Thanks for that, @Alex Jurkiewicz. The mention of the cli guessing automatically made me remember why this came up.
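If the upload ever moves into Terraform, the same extension-to-MIME-type map idea looks roughly like this (the bucket name, dist path, and map entries are placeholders, not from this thread):

```hcl
# Sketch: upload a build directory with aws_s3_object, setting Content-Type
# from the file extension so CloudFront serves the right header.
locals {
  mime_types = {
    ".html" = "text/html"
    ".css"  = "text/css"
    ".js"   = "application/javascript"
    ".svg"  = "image/svg+xml"
    ".json" = "application/json"
  }
}

resource "aws_s3_object" "site" {
  for_each = fileset("${path.module}/dist", "**")

  bucket = "example-static-site-bucket" # placeholder
  key    = each.value
  source = "${path.module}/dist/${each.value}"
  etag   = filemd5("${path.module}/dist/${each.value}")

  # Fall back to a generic type when the extension isn't in the map.
  content_type = lookup(
    local.mime_types,
    try(regex("\\.[^.]+$", each.value), ""),
    "binary/octet-stream"
  )
}
```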
2023-09-22
2023-09-25

Hello, I need some help with some architecture work:
I’m trying to automate extracting data from DynamoDB into an S3 bucket. This is what I have come up with so far (see image). The data will be extracted via a Lambda function that will run every hour when there have been changes to the database. Would this be best practice, or is there another way to approach this?

couple of questions for you to think about (i don’t have answers):
• there is data already in DynamoDB. how will this initial state be extracted/processed?
• there was a bug in the lambda code and it was failing for more than 24 hours; the DynamoDB stream doesn’t contain the full log anymore. what to do?

@z0rc3r
Could I ask what you mean by “dynamodb stream doesn’t contain full log anymore”

DynamoDB Streams captures a time-ordered sequence of item-level modifications in any DynamoDB table and stores this information in a log for up to 24 hours. Applications can access this log and view the data items as they appeared before and after they were modified, in near-real time.

@z0rc3r I don’t believe this will be an issue as the Lambda function will run hourly on a daily basis, so the data will always be up to date. I’m unsure if DynamoDB Streams is a bit of an overkill for what my end goal is.

What’s your use case? Would one of the options presented here be acceptable?
I want to back up my Amazon DynamoDB table using Amazon Simple Storage Service (Amazon S3).

@Christopher Wade To extract data from the DynamoDB table into a CSV file and then push it to an S3 bucket so external users are able to grab the CSV file

I don’t believe this will be an issue as the Lambda function will run hourly on a daily basis, so the data will always be up to date.
Imagine there were some changes to the data structure, corrupted data in DynamoDB, or AWS deprecated the Lambda runtime, and the Lambda execution failed for more than 24 hours straight.

There will be some changes to the data structure if you mean new records being inserted. Unsure what you mean by “corrupted data in DynamoDB”. Can you elaborate further on what you mentioned about Lambda execution failing for more than 24 hours straight? @z0rc3r

You expect lambda to always succeed and consume dynamodb stream. What happens when lambda execution fails for some reason? What happens when dynamodb stream is truncated after 24 hours of lambda failures?

@z0rc3r It’s something I’ve not thought about, but I do believe these do need to be considered. Ideally I think it would be better to run the Lambda on a daily basis rather than hourly. This would remove the need for DynamoDB Streams.

Probably you don’t understand what a DynamoDB stream is. It’s a sequence of changes to the data, not the data itself. If you don’t have the complete history of changes (truncation after 24 hours) you cannot reconstruct the exact data state between the current and last successful run.

It’s something that I’m not used to. As you’ve mentioned, it can’t reconstruct the exact data state between the current and last successful run. I do believe it’s a bit of an overkill for what we’re needing.

If you have ever done a similar piece of architecture work that would help I would be grateful to hear @z0rc3r

I didn’t
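For reference, the scheduled (non-stream) variant discussed above is just an EventBridge rule invoking the export Lambda on a timer; a rough Terraform sketch (the function, rule name, and schedule are assumptions, not from this thread):

```hcl
# Sketch: run the export Lambda on a fixed schedule instead of consuming a
# DynamoDB stream. The Lambda itself is assumed to exist elsewhere.
resource "aws_cloudwatch_event_rule" "export" {
  name                = "dynamodb-export-hourly"
  schedule_expression = "rate(1 hour)"
}

resource "aws_cloudwatch_event_target" "export" {
  rule = aws_cloudwatch_event_rule.export.name
  arn  = aws_lambda_function.export.arn # assumed to exist
}

resource "aws_lambda_permission" "allow_eventbridge" {
  statement_id  = "AllowExecutionFromEventBridge"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.export.function_name
  principal     = "events.amazonaws.com"
  source_arn    = aws_cloudwatch_event_rule.export.arn
}
```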
2023-09-26

Are there any usage examples for the https://github.com/cloudposse/terraform-aws-eks-node-group module that expand on how to create a node group that only has nodes in a private subnet?
Terraform module to provision a fully managed AWS EKS Node Group

Looking at the example I see an output that references a private_subnet_cidrs

But I don’t see how this output is actually discovered

Ah I see it comes from https://github.com/cloudposse/terraform-aws-dynamic-subnets
Terraform module for public and private subnets provisioning in existing VPC

provider "aws" {
region = var.region
}
module "label" {
source = "cloudposse/label/null"
version = "0.25.0"
# This is the preferred way to add attributes. It will put "cluster" last
# after any attributes set in `var.attributes` or `context.attributes`.
# In this case, we do not care, because we are only using this instance
# of this module to create tags.
attributes = ["cluster"]
context = module.this.context
}
locals {
# The usage of the specific kubernetes.io/cluster/* resource tags below are required
# for EKS and Kubernetes to discover and manage networking resources
# https://aws.amazon.com/premiumsupport/knowledge-center/eks-vpc-subnet-discovery/
# https://github.com/kubernetes-sigs/aws-load-balancer-controller/blob/main/docs/deploy/subnet_discovery.md
tags = { "kubernetes.io/cluster/${module.label.id}" = "shared" }
allow_all_ingress_rule = {
key = "allow_all_ingress"
type = "ingress"
from_port = 0
to_port = 0 # [sic] from and to port ignored when protocol is "-1", warning if not zero
protocol = "-1"
description = "Allow all ingress"
cidr_blocks = ["0.0.0.0/0"]
ipv6_cidr_blocks = ["::/0"]
}
allow_http_ingress_rule = {
key = "http"
type = "ingress"
from_port = 80
to_port = 80
protocol = "tcp"
description = "Allow HTTP ingress"
cidr_blocks = ["0.0.0.0/0"]
ipv6_cidr_blocks = ["::/0"]
}
extra_policy_arn = "arn:aws:iam::aws:policy/job-function/ViewOnlyAccess"
}
module "vpc" {
source = "cloudposse/vpc/aws"
version = "2.1.0"
ipv4_primary_cidr_block = var.vpc_cidr_block
context = module.this.context
}
module "subnets" {
source = "cloudposse/dynamic-subnets/aws"
version = "2.4.1"
availability_zones = var.availability_zones
vpc_id = module.vpc.vpc_id
igw_id = [module.vpc.igw_id]
ipv4_cidr_block = [module.vpc.vpc_cidr_block]
nat_gateway_enabled = false
nat_instance_enabled = false
context = module.this.context
}
module "ssh_source_access" {
source = "cloudposse/security-group/aws"
version = "0.4.3"
attributes = ["ssh", "source"]
security_group_description = "Test source security group ssh access only"
create_before_destroy = true
allow_all_egress = true
rules = [local.allow_all_ingress_rule]
# rules_map = { ssh_source = [local.allow_all_ingress_rule] }
vpc_id = module.vpc.vpc_id
context = module.label.context
}
module "https_sg" {
source = "cloudposse/security-group/aws"
version = "0.4.3"
attributes = ["http"]
security_group_description = "Allow http access"
create_before_destroy = true
allow_all_egress = true
rules = [local.allow_http_ingress_rule]
vpc_id = module.vpc.vpc_id
context = module.label.context
}
module "eks_cluster" {
source = "cloudposse/eks-cluster/aws"
version = "2.9.0"
region = var.region
vpc_id = module.vpc.vpc_id
subnet_ids = module.subnets.public_subnet_ids
kubernetes_version = var.kubernetes_version
local_exec_interpreter = var.local_exec_interpreter
oidc_provider_enabled = var.oidc_provider_enabled
enabled_cluster_log_types = var.enabled_cluster_log_types
cluster_log_retention_period = var.cluster_log_retention_period
# data auth has problems destroying the auth-map
kube_data_auth_enabled = false
kube_exec_auth_enabled = true
context = module.this.context
}
module "eks_node_group" {
source = "../../"
subnet_ids = module.this.enabled ? module.subnets.public_subnet_ids : ["filler_string_for_enabled_is_false"]
cluster_name = module.this.enabled ? module.eks_cluster.eks_cluster_id : "disabled"
instance_types = var.instance_types
desired_size = var.desired_size
min_size = var.min_size
max_size = var.max_size
kubernetes_version = [var.kubernetes_version]
kubernetes_labels = merge(var.kubernetes_labels, { attributes = coalesce(join(module.this.delimiter, module.this.attributes), "none") })
kubernetes_taints = var.kubernetes_taints
cluster_autoscaler_enabled = true
block_device_mappings = [{
device_name = "/dev/xvda"
volume_size = 20
volume_type = "gp2"
encrypted = true
delete_on_termination = true
}]
ec2_ssh_key_name = var.ec2_ssh_key_name
ssh_access_security_group_ids = [module.ssh_source_access.id]
associated_security_group_ids = [module.ssh_source_access.id, module.https_sg.id]
node_role_policy_arns = [local.extra_policy_arn]
update_config = var.update_config
after_cluster_joining_userdata = var.after_cluster_joining_userdata
ami_type = var.ami_type
ami_release_version = var.ami_release_version
before_cluster_joining_userdata = [var.before_cluster_joining_userdata]
# Ensure ordering of resource creation to eliminate the race conditions when applying the Kubernetes Auth ConfigMap.
# Do not create Node Group before the EKS cluster is created and the `aws-auth` Kubernetes ConfigMap is applied.
depends_on = [module.eks_cluster, module.eks_cluster.kubernetes_config_map_id]
create_before_destroy = true
force_update_version = var.force_update_version
replace_node_group_on_version_update = var.replace_node_group_on_version_update
node_group_terraform_timeouts = [{
create = "40m"
update = null
delete = "20m"
}]
context = module.this.context
}

the complete example which gets deployed to AWS on every PR

uses VPC, subnets, EKS cluster and EKS Node Group modules

does it use them to create new ones?

or does it hook to an existing VPC/Subnet configuration?

and use that?

you def can use the subnets module or not use it. The subnets are inputs to the EKS modules

subnet_ids = module.subnets.public_subnet_ids

subnet_ids = module.this.enabled ? module.subnets.public_subnet_ids : ["filler_string_for_enabled_is_false"]

provide your own subnets there

There’s only a reference there to public_subnet_ids

how do I pass a private_subnet_id to spin a node_group in?

the EKS modules def don’t create or lookup any subnets since its not their responsibility

if you are using the subnets module, the private subnets are in the outputs https://github.com/cloudposse/terraform-aws-dynamic-subnets/blob/main/outputs.tf#L23
output "private_subnet_ids" {

if you create your own subnets, you have their IDs already (from the terraform-aws-subnet resource)

This is true

I’m currently passing in a list of 6 subnets, 3 of which are private, 3 of which are public to the module:
variable "aws_eks_subnets" {
type = list(string)
description = "Subnets for worker nodes"
default = [
"subnet-0961f52276f66803a",
"subnet-0628e61bc2cbf07ab",
"subnet-05a72053829efed5c"
]
}

imagine there are 6 subnets there

there’s no distinction on that list between a public subnet vs a private one

oh sec I think I know what I’m doing wrong

ah I see, or I guess (if I’m using the aws vpc module) I could also use the output from module.subnets.private_subnet_ids
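i.e. roughly something like this sketch against the example above (the instance types and sizes are placeholders; pin the module version to a real release):

```hcl
# Sketch: point the node group at the private subnets from the dynamic-subnets module.
module "eks_node_group" {
  source = "cloudposse/eks-node-group/aws"

  cluster_name   = module.eks_cluster.eks_cluster_id
  subnet_ids     = module.subnets.private_subnet_ids # private, not public
  instance_types = ["t3.medium"]
  desired_size   = 2
  min_size       = 1
  max_size       = 3

  context = module.this.context
}
```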

I’m going to just try passing the list of private subnet id’s, instead of the entire list

buuut…if I already have a VPC, with existing subnets

that I’d like to attach to and simply use to discover existing private & public subnets….can the module do this?

just taking a training on AFT, Orgs and such and I saw this:

has anyone used this network setup?

has cloudposse @Erik Osterman (Cloud Posse) used this?

I wonder how performant and pricey this could be

Which training is this ?


Seems a bit old school with a single point of failure and centralised control, so I guess if you were super paranoid like a bank it could work.
Other more distributed approaches like maybe VPC Lattice could be explored.
GCP introduced some features that make this approach easier to implement, including multi-cluster ingress, cross-region and cross-project load balancer references, distributed in-line firewall and hierarchical org-level firewall.


I set something like this up in a PCI environment to be able to audit all in/egress traffic in one place, meaning all public endpoints are then in the “public” account, including loadbalancers etc. not with AFT though

all cross account traffic is through transit gateway

single place to monitor wafs etc… but we had to use global accelerator for some of the loadbalancers, because of this design

@Eamon Keane no single point of failure here, these are not one device per component, these are autoscalable devices with speeds of up to 45 Gbps

and multi-az etc

is a very simplified diagram

probably a single point in the sense that an admin with access to the public account can ruin the day

well, anyone with access to a BGP router can ruin the day, that is a people problem

its a balance yes - if you have autonomous groups or divisions, you could argue that each account running their own in/egress is more “redundant”

in the training they mention that the ingress part of this approach in AWS is not quite flexible, so at the end you need IGWs for ingress for your LBs

egress goes through the network account

we used ingress on the public account only, then loadbalancers would use target groups in destination accounts - but this was the reason we used global accelerator, if I remember correctly - also it’s different for the ELB and the layer-4 one

IP targets?

I remember doing something like that with VPC peering, is a bit limiting

yes… not good, I know

we do it with transit gateway - not optimal, but it works

yes, I think with TGs is better now

Yes, we’ve implemented this before.

Most recently we helped @johncblandii

I understand the value from a compliance perspective, but I don’t like it from an IaC perspective.

what is the problem from IaC? Isolation of components per team?

I share this article a lot. It covers all the ways to manage centralized ingress in AWS… https://aws.amazon.com/blogs/networking-and-content-delivery/design-your-firewall-deployment-for-internet-ingress-traffic-flows/

Introduction Exposing Internet-facing applications requires careful consideration of what security controls are needed to protect against external threats and unwanted access. These security controls can vary depending on the type of application, size of the environment, operational constraints, or required inspection depth. For some scenarios, running Network Access Control Lists (NACL) and Security Groups (SG) […]

there are interesting options around centralized firewall management but distributed ingress points in there also

to be honest I’m not looking at implementing this but mostly trying to understand if this is a common pattern, if it is used and if it has any gotchas

i’ve implemented the ELB sandwich with TG a couple times (from the article i shared, last diagram on the left). the biggest gotcha is just the complexity, and understanding impacts of changes. second biggest is lower agility due to division of responsibility, as any ingress requires coordination between the app team and the network/firewall team

yes, that will be difficult and I guess you will have to get static EIPs to make sure the ingress IPs never change if your products are whitelisted from the client side etc, which adds external coordination

This setup adds like $0.02 per request on ingress so take your current costs and x2 then

egress and ingress x2?

On which metric/resource @johncblandii?

ingress which is why we only did egress

and only for non-prod

prod has too much traffic so 2x’ing our prod traffic bill is a non-starter

“Ingress” is not a billing metric though, so just trying to understand which item of the bill you are referring to

egress, not ingress

Transit Gateway data processing charge across peering attachments: 1 GB was sent from an EC2 instance #1 in a VPC attached to Transit Gateway #1 (N. Virginia region) over a peering attachment to Transit Gateway #2 (Oregon region), where it will reach EC2 instance #2 within a VPC. The total traffic related charges will result in a charge of $0.04. This charge comprises $0.02 for Transit Gateway Data Processing on Transit Gateway #1 along with $0.02 for outbound inter-Region data transfer charges. Here Transit Gateway #2 will not incur data processing charges, as they do not apply for data sent from a peering attachment to a Transit Gateway. As inbound inter-Region data transfer charges are free, no further charges apply to the Transit Gateway #2 (Oregon region) side.

but…Account A to B adds $0.02 per request so it is ingress, technically (at the start of the chain), but the charge is on the egress to the other account

so just keep that in mind

Oh yeah, TG traffic pricing is absurd

yeah, so we abandoned centralized ingress entirely

It is $0.02 per-GB, though, not per-request….

correct

so we did the math and that 2x on our GB’s was a no go

That might be a good reason to use one of the distributed ingress designs, from the article I posted (presuming you also have managed firewall requirements, anyway)

mmm that is good information
2023-09-27

Hi All. Might be a basic ask. I am trying to get a pricing comparison (On-demand) for Compute, Storage and Network between AWS, Azure and Alibaba Cloud for China Region.
Seems the pricing calculator domain for the China region is different and doesn’t give much detail easily (especially Object Storage).
If any of you have come across this requirement and have this data ready, could you please share? It would be very helpful for addressing this urgent query from a stakeholder.
2023-09-29

Hi everyone, we are trying to use this https://github.com/cloudposse/terraform-aws-cloudfront-s3-cdn/blob/0.76.0/main.tf repo. When I enable s3_origin_enabled = true, we get an issue at line number 210, data "aws_iam_policy_document" "combined": it complains about duplicate Sids and says to remove the Sid or make it unique ("please use a unique Sid"). How can we fix this while using the same repo? Can anyone help?
```
locals {
  enabled = module.this.enabled

  # Encapsulate logic here so that it is not lost/scattered among the configuration
  website_enabled           = local.enabled && var.website_enabled
  website_password_enabled  = local.website_enabled && var.s3_website_password_enabled
  s3_origin_enabled         = local.enabled && ! var.website_enabled
  create_s3_origin_bucket   = local.enabled && var.origin_bucket == null
  s3_access_logging_enabled = local.enabled && (var.s3_access_logging_enabled == null ? length(var.s3_access_log_bucket_name) > 0 : var.s3_access_logging_enabled)
  create_cf_log_bucket      = local.cloudfront_access_logging_enabled && local.cloudfront_access_log_create_bucket

  create_cloudfront_origin_access_identity = local.enabled && length(compact([var.cloudfront_origin_access_identity_iam_arn])) == 0 # "" or null

  origin_id   = module.this.id
  origin_path = coalesce(var.origin_path, "/")

  # Collect the information for whichever S3 bucket we are using as the origin
  origin_bucket_placeholder = {
    arn                         = ""
    bucket                      = ""
    website_domain              = ""
    website_endpoint            = ""
    bucket_regional_domain_name = ""
  }
  origin_bucket_options = {
    new      = local.create_s3_origin_bucket ? aws_s3_bucket.origin[0] : null
    existing = local.enabled && var.origin_bucket != null ? data.aws_s3_bucket.origin[0] : null
    disabled = local.origin_bucket_placeholder
  }
  # Workaround for requirement that tertiary expression has to have exactly matching objects in both result values
  origin_bucket = local.origin_bucket_options[local.enabled ? (local.create_s3_origin_bucket ? "new" : "existing") : "disabled"]

  # Collect the information for cloudfront_origin_access_identity_iam and shorten the variable names
  cf_access_options = {
    new = local.create_cloudfront_origin_access_identity ? {
      arn  = aws_cloudfront_origin_access_identity.default[0].iam_arn
      path = aws_cloudfront_origin_access_identity.default[0].cloudfront_access_identity_path
    } : null
    existing = {
      arn  = var.cloudfront_origin_access_identity_iam_arn
      path = var.cloudfront_origin_access_identity_path
    }
  }
  cf_access = local.cf_access_options[local.create_cloudfront_origin_access_identity ? "new" : "existing"]

  # Pick the IAM policy document based on whether the origin is an S3 origin or a Website origin
  iam_policy_document = local.enabled ? (
    local.website_enabled ? data.aws_iam_policy_document.s3_website_origin[0].json : data.aws_iam_policy_document.s3_origin[0].json
  ) : ""

  bucket             = local.origin_bucket.bucket
  bucket_domain_name = var.website_enabled ? local.origin_bucket.website_endpoint : local.origin_bucket.bucket_regional_domain_name

  override_origin_bucket_policy = local.enabled && var.override_origin_bucket_policy

  lookup_cf_log_bucket = local.cloudfront_access_logging_enabled && ! local.cloudfront_access_log_create_bucket
  cf_log_bucket_domain = local.cloudfront_access_logging_enabled ? (
    local.lookup_cf_log_bucket ? data.aws_s3_bucket.cf_logs[0].bucket_domain_name : module.logs.bucket_domain_name
  ) : ""

  use_default_acm_certificate = var.acm_certificate_arn == ""
  minimum_protocol_version    = var.minimum_protocol_version == "" ? (local.use_default_acm_certificate ? "TLSv1" : "TLSv1.2_2019") : var.minimum_protocol_version

  website_config = {
    redirect_all = [
      {
        redirect_all_requests_to = var.redirect_all_requests_to
      }
    ]
    default = [
      {
        index_document = var.index_document
        error_document = var.error_document
        routing_rules  = var.routing_rules
      }
    ]
  }
}

# Make up for deprecated template_file and lack of templatestring
# https://github.com/hashicorp/terraform-provider-template/issues/85
# https://github.com/hashicorp/terraform/issues/26838
locals {
  override_policy = replace(replace(replace(var.additional_bucket_policy,
    "$${origin_path}", local.origin_path),
    "$${bucket_name}", local.bucket),
  "$${cloudfront_origin_access_identity_iam_arn}", local.cf_access.arn)
}

module "origin_label" {
  source  = "cloudposse/label/null"
  version = "0.25.0"

  attributes = var.extra_origin_attributes

  context = module.this.context
}

resource "aws_cloudfront_origin_access_identity" "default" {
  count = local.create_cloudfront_origin_access_identity ? 1 : 0

  comment = local.origin_id
}

resource "random_password" "referer" {
  count = local.website_password_enabled ? 1 : 0

  length  = 32
  special = false
}

data "aws_iam_policy_document" "s3_origin" {
  count = local.s3_origin_enabled ? 1 : 0

  override_json = local.override_policy

  statement {
    sid = "S3GetObjectForCloudFront"

    actions   = ["s3:GetObject"]
    resources = ["arn:aws:s3:::${local.bucket}${local.origin_path}*"]

    principals {
      type        = "AWS"
      identifiers = [local.cf_access.arn]
    }
  }

  statement {
    sid = "S3ListBucketForCloudFront"

    actions   = ["s3:ListBucket"]
    resources = ["arn:aws:s3:::${local.bucket}"]

    principals {
      type        = "AWS"
      identifiers = [local.cf_access.arn]
    }
  }
}

data "aws_iam_policy_document" "s3_website_origin" {
  count = local.website_enabled ? 1 : 0

  override_json = local.override_policy

  statement {
    sid = "S3GetObjectForCloudFront"

    actions   = ["s3:GetObject"]
    resources = ["arn:aws:s3:::${local.bucket}${local.origin_path}*"]

    principals {
      type        = "AWS"
      identifiers = ["*"]
    }

    dynamic "condition" {
      for_each = local.website_password_enabled ? ["password"] : []

      content {
        test     = "StringEquals"
        variable = "aws:referer"
        values   = [random_password.referer[0].result]
      }
    }
  }
}

data "aws_iam_policy_document" "deployment" {
  for_each = local.enabled ? var.deployment_principal_arns : {}

  statement {
    actions = var.deployment_actions

    resources = distinct(flatten([
      [local.origin_bucket.arn],
      formatlist("${local.origin_bucket.arn}/%s*", each.value),
    ]))

    principals {
      type        = "AWS"
      identifiers = [each.key]
    }
  }
}

data "aws_iam_policy_document" "s3_ssl_only" {
  count = var.allow_ssl_requests_only ? 1 : 0

  statement {
    sid     = "ForceSSLOnlyAccess"
    effect  = "Deny"
    actions = ["s3:*"]
    resources = [
      local.origin_bucket.arn,
      "${local.origin_bucket.arn}/*"
    ]

    principals {
      identifiers = ["*"]
      type        = "*"
    }

    condition {
      test     = "Bool"
      values   = ["false"]
      variable = "aws:SecureTransport"
    }
  }
}

data "aws_iam_policy_document" "combined" {
  count = local.enabled ? 1 : 0

  source_policy_documents = compact(concat(
    data.aws_iam_policy_document.s3_origin.*.json,
    data.aws_iam_policy_document.s3_website_origin.*.json,
    data.aws_iam_policy_document.s3_ssl_only.*.json,
    values(data.aws_iam_policy_document.deployment)[*].json
  ))
}

resource "aws_s3_bucket_policy" "default" {
  count = local.create_s3_origin_bucket || local.override_origin_bucket_policy ? 1 : 0

  bucket = local.origin_bucket.bucket
  policy = join("", data.aws_iam_policy_document.combined.*.json)
}

resource "aws_s3_bucket" "origin" {
  #bridgecrew:skip=BC_AWS_S3_13:Skipping `Enable S3 Bucket Logging` because we cannot enable it by default because we do not have a default destination for it.
  #bridgecrew:skip=CKV_AWS_52:Skipping `Ensure S3 bucket has MFA delete enabled` due to issue in terraform (https://github.com/hashicorp/terraform-provider-aws/issues/629).
  count = local.create_s3_origin_bucket ? 1 : 0

  bucket        = module.origin_label.id
  acl           = "private"
  tags          = module.origin_label.tags
  force_destroy = var.origin_force_destroy

  dynamic "server_side_encryption_configuration" {
    for_each = var.encryption_enabled ? ["true"] : []

    content {
      rule …