#aws (2020-12)
Discussion related to Amazon Web Services (AWS)
Archive: https://archive.sweetops.com/aws/
2020-12-01
Hi, does anyone know if ACM will support a SAN with one of the alternative names as an IP?
AFAIK the CA/B Forum deprecated the use of IPs many years ago, I am not sure what the current state is though
AFAIK no but this sounds like the sort of simple yes/no question that AWS Support are actually good at answering
it looks like it is not supported
Hi, does anyone know if ACM will support a SAN with one of the alternative names as an IP?
Hey @jose.amengual can you explain what you are trying to do?
sure, we have Customers that do not allow DNS resolution outside their network
technically you can create a SAN certificate with a wildcard. I don’t know why you would put an IP in a SAN certificate when you can use DNS. If you want to create a SAN certificate for a particular IP, I would do a self-signed cert.
so we provide static ips
So, are you saying that these IPs will be the same at all times?
and when they use static IPs they need to use hosts files to be able to use the DNS name on the cert
yes they are static
so I was asked to see if we can add the IP as an alternative name to the SAN
Are these certs for public facing apps, or internally?
so they do not need to use the host files
public
Gotcha. So this is probably possible. Let me write something here.
Sorry @jose.amengual I was having a bunch of zoom calls.
So this was easy. Here is what you need to know anyway: technically you need a registered DNS domain if something is public, so your browser doesn’t pop a warning that says something like “This site is not trusted”, unless you copy the public certificate into your laptop/computer certificate chain. That’s mostly because certificates are there for HTTPS connectivity when resolving a DNS domain that is public facing.
This is how you need to use this script (it uses openssl to generate the cert):
- You need to specify the first argument for the certificate filename.
- The 2nd argument is *optional*, but this is the DNS name that you want to add to it. If you don’t specify this, it will use the internal full hostname of the host where you are running this script.
- Inside the SAN, it will use hostname -I to get the IP of the host and use it as part of the SAN. Additionally it will add a *.ec2.internal wildcard so it works in every us-<REGION>-n region (I think in Europe they use ec2.compute.internal).
So the way to use it (you need to use bash, not sh, for formatting reasons) is like this: bash create_san_certs.sh somename mydomain.com
The script will generate a private key and a public key.
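For anyone who lands here later, a minimal sketch of what a script along those lines could look like (the filename, defaults and wildcard are assumptions based on the description above, not the original script, and -addext needs OpenSSL 1.1.1+):
#!/usr/bin/env bash
# create_san_certs.sh (sketch) - self-signed cert whose SAN carries a DNS name,
# the *.ec2.internal wildcard, and the host IP
set -euo pipefail

cert_name="${1:?usage: create_san_certs.sh <cert-name> [dns-name]}"
dns_name="${2:-$(hostname -f)}"               # optional 2nd arg, else the host FQDN
host_ip="$(hostname -I | awk '{print $1}')"   # first IP reported by the host

openssl req -x509 -nodes -newkey rsa:2048 -days 365 \
  -keyout "${cert_name}.key" -out "${cert_name}.crt" \
  -subj "/CN=${dns_name}" \
  -addext "subjectAltName=DNS:${dns_name},DNS:*.ec2.internal,IP:${host_ip}"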
so then I will have to import it to ACM?
because I need the cert in ACM at some point
that’s a good point. You can import custom certs into ACM. If it lets you import the cert it will work
Hi, does anyone know if I can read a Parameter Store secret from one AWS account in another, using TF?
like to use it to add it as a secret in a taskdef
or to maybe save it somewhere else
sure. Set up two providers
provider "aws" {
  alias = "primary"
}

provider "aws" {
  alias = "secondary"
}

# read the parameter from the secondary account
data "aws_ssm_parameter" "mine" {
  provider = aws.secondary
  name     = "/foo"
}

# write a copy into the primary account
resource "aws_ssm_parameter" "mine" {
  provider = aws.primary
  name     = "/foo"
  type     = "SecureString"
  value    = data.aws_ssm_parameter.mine.value
}
it is funny but I never had to do this before
It’s rare for me too. Usually it makes sense to have two configurations and use remote state outputs to share data. We use multiple AWS providers most often when an AWS service has some components in us-east-1. Like creating a CloudFront distribution in $standard_region and then creating a CloudWatch alarm in us-east-1.
also S3 bucket replication
yes, same here, we use 3 providers in some cases for RDS multi-region setups and such
but can a taskdef (task execution role) assume the role to get the secret?
or I will have to save it somewhere and then use it?
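Not sure about the cross-account assume part, but for reference this is roughly how a same-account SSM parameter gets wired into a taskdef as a secret (family, image, ARNs and sizes are placeholders); the task execution role needs ssm:GetParameters on the parameter, plus kms:Decrypt if it’s a SecureString encrypted with a CMK:
# hypothetical container definition referencing the parameter as a secret
cat > container-definitions.json <<'EOF'
[
  {
    "name": "app",
    "image": "nginx:stable",
    "essential": true,
    "secrets": [
      { "name": "FOO", "valueFrom": "arn:aws:ssm:us-east-1:111111111111:parameter/foo" }
    ]
  }
]
EOF

aws ecs register-task-definition \
  --family example-app \
  --requires-compatibilities FARGATE \
  --network-mode awsvpc \
  --cpu 256 --memory 512 \
  --execution-role-arn arn:aws:iam::111111111111:role/example-task-execution \
  --container-definitions file://container-definitions.json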
How big do you make your VPC? Is every VPC running with 10.0.0.0/8 IP range, or do you allocate the whole thing a /24 if it will contain one RDS cluster?
2020-12-02
Error: invalid value for domain_name (must start with a lowercase alphabet and be at least 3 and no more than 28 characters long. Valid characters are a-z (lowercase letters), 0-9, and - (hyphen).)
on main.tf line 100, in resource “aws_elasticsearch_domain” “default”: 100: domain_name = module.this.id
Best to post this in #terraform
https://aws.amazon.com/blogs/aws/preview-aws-proton-automated-management-for-container-and-serverless-deployments/ looks very interesting
Today, we are excited to announce the public preview of AWS Proton, a new service that helps you automate and manage infrastructure provisioning and code deployments for serverless and container-based applications. Maintaining hundreds – or sometimes thousands – of microservices with constantly changing infrastructure resources and configurations is a challenging task for even the most […]
does anyone know how to get the specific engine version for the most recent postgres family in RDS (12.4)?
if you’re using Aurora then the latest available is 11.8, but if you’re using PostgreSQL RDS then you can select 12.4
we are not using aurora
module "harbor_postgres" {
source = "git::[email protected]:ume-platform-engineering/tf-modules.git//modules/rds?ref=b8952a5"
additional_security_groups = [data.terraform_remote_state.security_groups.outputs.ids["rds"]]
engine_type = "postgres"
engine_version = "12.4.1"
environment = "core"
instance_class = "db.t2.large"
major_engine_version = "12.4"
multi_az = true
name = "harbor"
parameters = []
private_domain = data.terraform_remote_state.base.outputs.dns_domain_private
private_hosted_zone_id = data.terraform_remote_state.base.outputs.dns_zone_private
subnets = data.terraform_remote_state.base.outputs.subnets
team_prefix = "pe"
version_family = "postgres12.4"
vpc_id = data.terraform_remote_state.base.outputs.vpc_id
}
my concern is the engine_version value
I don’t think engine_version supports patch versions, I think it should just be 12.4 – you’ll need to dig into the docs to be sure https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-rds-database-instance.html#cfn-rds-dbinstance-engineversion
Use the AWS CloudFormation AWS::DBInstance resource for RDS.
yeh you’re right
i am trying to work out the jq command when using aws rds describe-db-engine-versions --region eu-west-1
jq '.DBEngineVersions[] | select(.Engine=="postgres") | .EngineVersion'
i was close but not fast enough
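for what it’s worth, a variant that filters server-side and also shows the parameter group family (region and version are just examples):
aws rds describe-db-engine-versions \
  --engine postgres \
  --engine-version 12.4 \
  --region eu-west-1 \
  --query 'DBEngineVersions[].{Version:EngineVersion,Family:DBParameterGroupFamily}' \
  --output table
if I remember right the family comes back as postgres12, which also suggests the version_family above wants to be postgres12 rather than postgres12.4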
I was wondering if anyone is using internet facing and internal ALB at the same time?
What do you mean @uselessuseofcat ?
If your ALB is facing the internet, it should be accessible from your internal network
yes, but I want to be able to communicate between my ECS clusters via internal DNS
You need a specific internal ALB to do that
if the ALB is public and you point your internal comms at it, all that traffic goes outside your vpc to the public endpoint
I’ve already created it, but I need to set the service to point to multiple target groups, right?
Alternatively you can use 1 ALB that is internal, and use GlobalAccelerator as the public entry to it
@Zach I’ve created 2 LBs, one is private, one is public, but I was not able to forward them to the same Target Group, so I’ve created a new one. Now I have to set ECS services to point to 2 different target groups?
Yup.
TG can only be attached to 1 listener
Thanks @Zach, works like a charm
cool, and the GlobalAccel pattern of access for an internal alb works very well too. You get public access for your external clients but your internal clients can use the same ALB w/o leaving the vpc
doesn’t work with cloudfront unfortunately
interesting use of Global Accelerator, although that defeats the purpose of subnet isolation which is required in many companies
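for reference, the wiring for that pattern is roughly this (the Global Accelerator API lives in us-west-2; names and ARNs are placeholders):
# accelerator with a TCP 443 listener, fronting an internal ALB
aws globalaccelerator create-accelerator \
  --name internal-alb-entry \
  --region us-west-2

aws globalaccelerator create-listener \
  --accelerator-arn <accelerator-arn> \
  --protocol TCP \
  --port-ranges FromPort=443,ToPort=443 \
  --region us-west-2

aws globalaccelerator create-endpoint-group \
  --listener-arn <listener-arn> \
  --endpoint-group-region us-east-1 \
  --endpoint-configurations EndpointId=<internal-alb-arn>,Weight=100,ClientIPPreservationEnabled=true \
  --region us-west-2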
why does telnet to every public EC2 IP address on ports 80 and 443 connect? Those ports are closed on SGs
not sure what you mean by “closed on SGs”, do you mean that those ports just aren’t specified as open? because SGs don’t have explicit denies.
Easiest way to check: from the EC2 Instances console, look at the Security tab and Inbound Rules, you might be unintentionally allowing access
Yup, they are not on SG, but you can do telnet ip-address 80 or telnet ip-address 443. it says Connected to ip-address Escape character is ‘^]’.
this is really strange…
Are these websites that you refer to, or different applications using those ports? Technically the security group should block incoming requests if these are not allowed as part of your inbound rules. If this is already the case, I would check the Network ACLs and confirm that you are DENYing these ports and sources as well. On a side note, Telnet only helps with TCP but it’s not really the best tool for troubleshooting connectivity. I would use netcat when troubleshooting this.
nc -zv -w2 someurl.com 443
If the problem continues, I would confirm that the instance that you are trying to hit is not behind a load balancer or a proxy. You can use “MTR” for that. And lastly/or even to begin with, confirming that you are not able to hit that IP from within a trusted network
@Diego Saavedra-Kloss thanks!
I didn’t touch NACL, I have allowed everything there.
I’ve controlled access via Security Groups…
Just interesting that 22 is not open, but 80 and 443 are
@Diego Saavedra-Kloss I’ve tried nc and it says 80 and 443 are opened even though they are not in SG!
could you please try with one of your instances’ public IPs
I don’t use public instances, but I created one as a quick test and was not able to recreate your scenario.
aws ec2 describe-security-groups --group-ids $(aws ec2 describe-instances --instance-id $id --query "Reservations[].Instances[].SecurityGroups[].GroupId[]" --output text) --output text
complete $id – and if you’re willing to share then we could better speak to what’s going on
if you feel like that’s a security concern, I get it
aws ec2 describe-security-groups --group-ids xxx "Reservations[].Instances[].SecurityGroups[].GroupId[]" --output text
None
That’s on the instance without any inbound rules!
Then I go and copy Public IPv4 address
nc -zv -w2 ip-address
and it says open
@uselessuseofcat do you mind sharing one of the public IPs that you are talking about so I can test it from my side? I will use MTR to get the hops too.
I’ve open ticket towards AWS so I’ll know more. I cannot share that IP, sorry, that AWS accounts belongs to a company that I’m working at.
Gotcha, no problem. Hopefully you can get that resolved quickly.
@Diego Saavedra-Kloss I used my Linux Academy sandbox to create an instance. It’s the same. Finally something useful from them
Here’s an IP 52.90.183.94
@uselessuseofcat I cannot reach that IP from my network. The first result is from MTR. The second one is when I’m using netcat. I haven’t tested port 80 yet, only 443
[root@miniconda bin]# mtr --tcp 52.90.183.94 --port 443 --report-wide
Start: Wed Dec 2 21:41:51 2020
HOST: miniconda Loss% Snt Last Avg Best Wrst StDev
1.|-- gateway 0.0% 10 0.1 0.3 0.1 0.5 0.0
2.|-- ??? 100.0 10 0.0 0.0 0.0 0.0 0.0
[root@miniconda bin]# nc -zv -w2 52.90.183.94 443
Ncat: Version 7.50 ( <https://nmap.org/ncat> )
Ncat: Connection timed out.
Same thing with port 80. I think that you need to really confirm that you are not in a trusted network to reach those IPs.
[root@miniconda bin]# mtr --tcp 52.90.183.94 --port 80 --report-wide
Start: Wed Dec 2 21:44:24 2020
HOST: miniconda Loss% Snt Last Avg Best Wrst StDev
1.|-- gateway 0.0% 10 0.2 0.2 0.1 0.5 0.0
2.|-- ??? 100.0 10 0.0 0.0 0.0 0.0 0.0
[root@miniconda bin]# nc -zv -w2 52.90.183.94 80
Ncat: Version 7.50 ( <https://nmap.org/ncat> )
Ncat: Connection timed out.
$ nc -zv -w2 52.90.183.94 80
nc: connectx to 52.90.183.94 port 80 (tcp) failed: Operation timed out
nc -zv -w2 52.90.183.94 443
ec2-52-90-183-94.compute-1.amazonaws.com [52.90.183.94] 443 (https) open
[~]$ nc -zv -w2 52.90.183.94 80
ec2-52-90-183-94.compute-1.amazonaws.com [52.90.183.94] 80 (http) open
As a matter of fact, I’m connected to ExpressVPN. Ha, this is really strange, it works when I’m connected to ExpressVPN…
and it does not when I’m not…
[~]$ nc -zv -w2 52.90.183.94 443
ec2-52-90-183-94.compute-1.amazonaws.com [52.90.183.94] 443 (https): Connection timed out
[~]$ expressvpn connect
Connecting to USA - New York... 100.0%
Connected to USA - New York
- To check your connection status, type 'expressvpn status'.
- If your VPN connection unexpectedly drops, internet traffic will be blocked to protect your privacy.
- To disable Network Lock, disconnect ExpressVPN then type 'expressvpn preferences set network_lock off'.
[~]$ nc -zv -w2 52.90.183.94 443
ec2-52-90-183-94.compute-1.amazonaws.com [52.90.183.94] 443 (https) open
@uselessuseofcat why don’t you try this from another host rather than your laptop? of course, a host that is not part of that account, so you are 100% sure that is unrelated. The reason why I used MTR is to show you the hops — which is important for you to know too since it’s likely that you are testing from a trusted source.
Neither Darren nor I are able to get a reply from that IP when using those ports.
Does not work from my desktop!
Only from my laptop and only when I’m connected to ExpressVPN - I’ve tried to change locations to Germany and US, and each time I am able to access ports 80 and 443
[~]$ nc -zv -w2 52.90.183.94 443
ec2-52-90-183-94.compute-1.amazonaws.com [52.90.183.94] 443 (https): Connection timed out
[~]$ expressvpn connect
Connected to USA - New York
[~]$ nc -zv -w2 52.90.183.94 443
ec2-52-90-183-94.compute-1.amazonaws.com [52.90.183.94] 443 (https) open
[~]$ expressvpn disconnect
Disconnected.
[~]$ nc -zv -w2 52.90.183.94 443
ec2-52-90-183-94.compute-1.amazonaws.com [52.90.183.94] 443 (https): Connection timed out
[~]$ expressvpn connect "Germany"
Connected to Germany - Frankfurt - 1
[~]$ curl ifconfig.co
178.239.198.117
[~]$ nc -zv -w2 52.90.183.94 443
ec2-52-90-183-94.compute-1.amazonaws.com [52.90.183.94] 443 (https) open
[~]$ expressvpn connect usny1
[~]$ expressvpn disconnect
Disconnected.
[~]$ expressvpn connect "US"
Connected to USA - New York
[~]$ curl ifconfig.co
45.132.227.95
[~]$ nc -zv -w2 52.90.183.94 443
ec2-52-90-183-94.compute-1.amazonaws.com [52.90.183.94] 443 (https) open
So what I would focus on now is narrowing the sources in your security groups and seeing if that works, if you consider that part to be an issue. However, it seems that you are now able to confirm that your initial question is not a security concern any longer.
But this is really strange that I’m getting port 443 with various IPs provided to me by ExpressVPN
and from my home network I see it same as you guys!
but I’m not worrying, it’s not a security concern as you said. Have a nice day!!!
…since this is 80/443 specific, and on ExpressVPN (an “anonymous” vpn service), you’re likely connecting to a middleware proxy on the VPN provider’s network. (which is…funny.) I didn’t see any attempts to actually GET / or anything to see if you got a response. I bet if you were to tcpdump the target instance (or just netstat and look for the connection) you wouldn’t see these mystery connections actually connecting….
anyone know if it’s possible to only allow put policy perms on iam roles that contain a permission boundary ?
yes…
{
"Action": [
"iam:AttachUserPolicy",
"iam:CreateUser",
"iam:DetachUserPolicy",
"iam:DeleteUserPolicy",
"iam:PutUserPermissionsBoundary",
"iam:AttachRolePolicy",
"iam:CreateRole",
"iam:DeleteRolePolicy",
"iam:DetachRolePolicy",
"iam:PutRolePermissionsBoundary"
],
"Condition": {
"StringEquals": {
"iam:PermissionsBoundary": "<boundary_arn>"
}
},
"Effect": "Allow",
"Resource": "*",
"Sid": "AllowRestrictedIamWithPermissionsBoundary"
},
for some reason i thought that condition wouldn’t work for putpolicy
i was looking for a good reference
this is the reference i was looking for https://docs.aws.amazon.com/service-authorization/latest/reference/list_identityandaccessmanagement.html#identityandaccessmanagement-actions-as-permissions
Lists all of the available service-specific resources, actions, and condition keys that can be used in IAM policies to control access to Identity And Access Management.
thank you for confirming loren
yeah, i recall being under the same impression based on that doc. but unless i totally screwed up, when i was testing the combined statement of actions that do list the condition with the putpolicy action, it did actually work and allow adding policies as long as the role already had the permissions boundary applied
but ya know, it’s iam. constructing advanced and restrictive policies feels like a crapshoot a lot of the time. if your testing finds otherwise, i would appreciate a callback!
haha yes it does feel like a crapshoot. the policy simulator has been saving me lately
i saw that it supports permission boundaries now so ill be using that to make sure what i can and cant do
i wish there was a command line way of simulating iam policies. that would make this even easier
hmm well i guess this does exist too https://docs.aws.amazon.com/cli/latest/reference/iam/simulate-custom-policy.html
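something like this, roughly (the file name and actions are placeholders, and if I recall correctly there’s also a --permissions-boundary-policy-input-list flag to simulate the boundary side):
aws iam simulate-custom-policy \
  --policy-input-list file://candidate-policy.json \
  --action-names iam:PutRolePolicy iam:AttachRolePolicy \
  --query 'EvaluationResults[].{Action:EvalActionName,Decision:EvalDecision}' \
  --output table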
ah so i got some backlash since we’re in a single account. makes sense i suppose. there is some hesitation on giving devs access to this because it would allow adding or removing policies from production roles which i think scares devops teams
reminds me of the car key analogy: https://d1.awsstatic.com/events/reinvent/2019/REPEAT_1_AWS_identity_Permission_boundaries_&_delegation_SEC402-R1.pdf
besides breaking a monolithic account into multiple accounts, i wonder how else i can safely get these permissions added. perhaps there is no other way.
tag those roles, and in the boundary policy, use a deny statement for the disallowed actions with a condition matching that tag/value
{
"Action": [
"iam:Add*",
"iam:Attach*",
"iam:Create*",
"iam:Delete*",
"iam:Detach*",
"iam:Put*",
"iam:Remove*",
"iam:Tag*",
"iam:Untag*",
"iam:Update*",
"iam:Upload*"
],
"Condition": {
"StringLike": {
"iam:ResourceTag/<PROTECTEDTAG>": "true"
}
},
"Effect": "Deny",
"Resource": "*",
"Sid": "DenyProtectedIam"
},
you can also protect managed policies:
{
"Action": [
"iam:CreatePolicy*",
"iam:DeletePolicy*",
"iam:SetDefaultPolicyVersion"
],
"Effect": "Deny",
"Resource": [
"arn:${partition}:iam::*:policy/<PROTECTED_POLICY_1>",
"arn:${partition}:iam::*:policy/<PROTECTED_POLICY_2>",
...
],
"Sid": "DenyWriteManagedIamPolicy"
},
and be sure to note the iam:Tag* and iam:Untag* in the action list, to stop them from modifying the protected tag
ah tag based conditions seem like the only way remaining to do this. i was looking for a way to base conditions on names since all of our production roles contain that in their name itself
i suppose now i’ll have to search all iam roles and tag each one that contains production in their name with tag:env=production
i don’t believe there is a way to set a condition based on the iam role name but this tag based method seems a lot better.
if the roles really are truly named consistently like that, then you can use a resource glob…
"Resource": [
"arn.../production*",
]
what a monday this is. ty
Been thinking about using these prefix lists to consolidate security group rules. Has anyone given these a try?
The recently announced Amazon Virtual Private Cloud (VPC) Prefix Lists feature makes it easier to create consistent security postures and routing behaviors. A Prefix List is a collection of CIDR blocks that can be used to configure VPC security groups, VPC route tables, and AWS Transit Gateway route tables and can be shared with other […]
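Haven’t used them in anger yet, but the basic flow looks something like this (CIDRs and IDs are placeholders):
# create a shared list of CIDRs
aws ec2 create-managed-prefix-list \
  --prefix-list-name office-networks \
  --address-family IPv4 \
  --max-entries 10 \
  --entries Cidr=10.1.0.0/16,Description=office-a Cidr=10.2.0.0/16,Description=office-b

# reference the prefix list from a security group rule instead of raw CIDRs
aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --ip-permissions '[{"IpProtocol":"tcp","FromPort":443,"ToPort":443,"PrefixListIds":[{"PrefixListId":"pl-0123456789abcdef0","Description":"office networks"}]}]'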
2020-12-04
AWS’ Internal CA is $400 per month? That’s expensive! And if I want to use SSL for internal microservice communication I have to pay it. And I have multiple regions and multiple accounts…
Is there anyone here who uses it? I could pay $400/month, but for multiple regions and accounts - that’s a lot of money.
Is there any alternative?
Thanks!
disclaimer: I don’t use it
https://aws.amazon.com/blogs/security/how-to-use-aws-ram-to-share-your-acm-private-ca-cross-account/
In this post, I use the new Cross-Account feature of AWS Certificate Manager (ACM) Private Certificate Authority (CA) to create a CA in one account and then use ACM in a second account to issue a private certificate that automatically renews the following year. This newly available workflow expands the usability of ACM Private CA […]
Thanks! I saw it also, but this worries me: https://aws.amazon.com/certificate-manager/pricing/
as you can see there are prices for one region and two regions…
Thanks again for quick reply
Pricing for AWS Certificate Manager - Amazon Web Services (AWS)
perhaps one day cross-region sharing will be a thing
honestly, I’ve heard the horror stories of private CAs so $400 a region sounds reasonable to me
I can settle on 2 regions, and multiple accounts. I’ll probably go that way!
honestly, I’ve heard the horror stories of private CAs so $400 a region sounds reasonable to me
same. i’ve lived the horror of trying to automate the setup and management of private CAs. if you really really need them, $400 is a deal. but try not to need them at all!
@loren so you think that HTTP communication inside VPC is OK? I’ve heard that there are some security requirements for HTTPS even inside the VPC…
yeah, i used to. it became a harder sell when aws released vpc traffic mirroring. a compromised aws credential could then setup an internal mirror and see decrypted traffic
This is very interesting! Thanks
nice, good point. set that in an SCP at the org level
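something along these lines, presumably (policy name, description and target are placeholders):
cat > deny-traffic-mirroring.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyTrafficMirroring",
      "Effect": "Deny",
      "Action": [
        "ec2:CreateTrafficMirrorSession",
        "ec2:CreateTrafficMirrorTarget",
        "ec2:CreateTrafficMirrorFilter",
        "ec2:CreateTrafficMirrorFilterRule"
      ],
      "Resource": "*"
    }
  ]
}
EOF

aws organizations create-policy \
  --name deny-traffic-mirroring \
  --type SERVICE_CONTROL_POLICY \
  --description "Block VPC traffic mirroring" \
  --content file://deny-traffic-mirroring.json
# then attach it to the root or an OU with: aws organizations attach-policy --policy-id <id> --target-id <root-or-ou-id>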
Instead of internal CA, can’t you use public amazon.com hostnames and rely on free ACM certificates for each microservice?
To my knowledge that can’t work with services like App Mesh
Was just catching up on the thread, I’m late but have you looked into HashiCorp Vault’s PKI engine?
3 questions, going to separate into separate items to allow threaded response to each if desired.
- Is there any non-org based root module setup that builds the identity account like prescribed by Cloudposse without using orgs?
- Identity account should be separate from ORG account. org account = master account that includes the org + billing details right? Just making sure that makes sense. I don’t think the identity account should be mixed with org.
Yeah, I’d agree with this. You want as few people as you can get in that org account since it’s got keys to the castles via the OrgAssumeRole (not the right name, but you know what I’m talking about).
OrganizationAccountAccessRole should probably be deleted once the child accounts have been bootstrapped and your usual roles have been provisioned; that’ll reduce the damage if someone gets into the Management Account
- If I have a half baked org setup that is mixed up right now and want to start improving, can I create another org and assign accounts that already have link to the other and then disconnect later, or is it a 1 org to 1 account linkage only?
1:1… you can leave one org and join another, but you can’t link to more than one at a time
Are you my personal knowledge assistant, AI, or just a helpful genius? We should jump on a zoom call soon and just touch base. You always have great insight. I need more smart people on speed dial
haha, notice i’m only answering the one question i have an easy answer for
2020-12-07
Has anyone worked with AppSync in combination with S3 & DynamoDB? I can’t find any good documentation on how KMS is integrated. Will it work with CMKs for S3 & Dynamo?
2020-12-08
I want to enable Performance Insights, but I can’t choose my own master key. This is the AWS RDS master DB (it also has 2 replicas). What do I need to do?
In other databases I can easily choose any key. And what is wrong with this database, I do not understand.
my solution was to recreate from a snapshot
do you all store your lambda code in the same repo as your terraform that defines said lambda?
I use CDK rather than TF, but yes any IaC is defined alongside the application with exception to shared (Security Groups, IAM Managed Policies, etc) resources
my typical repo looks like:
/app
/docs
/iac
/resources
README.md
Would you advise what’s the biggest selling point of CDK if you already know terraform? I’m not as familiar with it and wondering where it saves you time.
I don’t know that I’d recommend a switch
I use CDK because it allows my development team to contribute to the IaC in a language they already know: TypeScript.
I just wanted to add a disclaimer in my answer for @Tomek that my case is slightly different, but should still apply.
this was recently discussed in #terraform… https://sweetops.slack.com/archives/CB6GHNLG0/p1607024998053000
Does anyone have an opinion on the thread I posted here - https://twitter.com/swade1987/status/1334554787711492097?s=21
oh nice, thanks Loren!
Has anyone ever done lambdas around gracefully handling fargate spot termination? This is more of a thought experiment as I’m not looking to do anything around it at the moment due to a lot of uncertainty..
From what I’ve read, it seems like there’s quite a bit involved, at the very least need to …
- manually deregister the ecs task that is draining due to SIGTERM on the target group side of things as spot termination doesn’t do this
- run a new task with the on demand provider
Anything else?
IIRC, the suggestion from AWS is to have a small amount of regular instances to back-fill things, and the rest running on spot.
Is it just a thought experiment, or a specific use case?
Yeah, I saw a talk where they said 66% OD 33% spot.
It’s not a specific use case in the sense of CI server or web server, more as a thought experiment around over engineering a terraform fargate module that creates a lambda to more gracefully handle spot termination
I think the most graceful way to do it is to work off of a queue (or similar mechanism). If the spot instance dies, someone else picks up the same work item.
2020-12-09
2020-12-10
anyone have ideas on how to keep a list of ec2 instances’ public ip addresses up to date across all regions in a single account ?
Did you look at cloudquery?
What about using Consul for Service DIscovery ?
i was thinking of a cloudwatch event the triggers off cloudtrail, then invokes a lambda, that updates maybe a google spreadsheet. just spit balling.
wut
We implemented Consul for a big fleet of EC2 instances we have; when the server is being bootstrapped, it registers itself to Consul with a bunch of information
Including the Public and Private IP address..
ah I see what you mean
We then use the Consul Directory for querying a lot of info about the instances, thus, we avoid hitting the AWS API
that solution seems novel, with more benefits and less work
To be honest, we are super happy with Consul, we manage around 300 EC2 instances with a small Consul cluster
we don’t have a service discovery at the moment. we’d have to align on that first then
in my org it’s easier to bring in an aws specific version. i wonder what the aws service discovery is
CloudMap / ECS-ServiceDiscovery - this uses DNS A or SRV records
cloud map ?
Hmmmm to be honest, we haven’t explored the AWS option
im sure it’s more expensive
for us , Consul is free and super easy to implement
Check the UI
We have some health checks as well…
HTTP healthchecks
Cool @Santiago Campuzano. Is there any better doc to follow to implement this?
Understood
is anyone using fugue.co at all?
i am looking at automating the creation of accounts in fugue when we create an AWS account via terraform
was wondering if anyone has already done this and if so what approach would they recommend?
They have an API, you can add environments through it.
If you’re wondering how to make an API call in a Terraform file, I actually built that for our own API (we’re a competitor of Fugue’s). There’s a TF provider for making REST API calls:
Although, last I checked, Mastercard didn’t publish this module on the TF registry yet (this is a few weeks ago, when they were considering doing it).
So this fork IS registered in the TF registry: https://github.com/fmontezuma/terraform-provider-restapi
Let me find the TF file I created so you have an example…
Here it is:
Thanks man, they have a CLI. I’m wondering if I should just have an out-of-band process to create the fugue account from the outputs of creating the account via terraform
That means the CLI needs to be installed on the build server
And if it isn’t… no one will notice the error. Or they will, and have no idea what’s going on.
Don’t suppose anyone is able to please summarise policy/controls for doing an S3 bucket which allows GetObject from within the VPC but denies otherwise? We want:
- http service that we can dump files in, hence S3.
- anything in VPC can GetObject it (not sensitive even if got out to world), specifically k8s pods without IAM providing they have egress 443.
- specific IAM can write to it as per usual.
I’m trying to understand whether I want to Allow on condition of aws:sourceVpc and specific actions (ListBucket, GetObject, etc.), because then I can add an allow for specific users/roles to PutObject, or if I want to use Deny per below:
https://aws.amazon.com/premiumsupport/knowledge-center/block-s3-traffic-vpc-ip/
Also, how might that work with a typical Public Access Block?
resource "aws_s3_bucket_public_access_block" "aaa" {
bucket = aws_s3_bucket.aaa.id
block_public_acls = true
block_public_policy = true
ignore_public_acls = true
restrict_public_buckets = true
}
Think I got there. Will see on Monday. Allow is the better way to go.
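For anyone finding this later, the Allow-on-aws:sourceVpc variant looks roughly like this; it relies on an S3 VPC endpoint (aws:SourceVpc is only set on requests that traverse one), and the bucket and VPC IDs are placeholders:
cat > vpc-read-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowGetObjectFromOurVpc",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::example-bucket/*",
      "Condition": {
        "StringEquals": { "aws:SourceVpc": "vpc-0123456789abcdef0" }
      }
    }
  ]
}
EOF

aws s3api put-bucket-policy --bucket example-bucket --policy file://vpc-read-policy.json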
2020-12-11
Does PagerDuty support inbound Cloudwatch Events/EventBridge as triggering incidents?
All the examples provide metric based alerting.
I configured an event that triggers on IAM policy change. This is event driven, not metric.
I am not seeing any triggered events in PagerDuty as the endpoint with SNS messages. Looking for any examples, or if I’m just trying to do something that isn’t supported.
UPDATE: I was using enqueue url. I think instead this is using the Events API not the cloudwatch api, meaning i should use: <https://events.pagerduty.com/x-ere/>
PagerDuty wouldn’t confirm. I think I’m still stuck. Any tips welcome
has anyone successfully run a Java AWS lambda container? https://docs.aws.amazon.com/lambda/latest/dg/java-image.html I’m having issues getting their dockerfile to work with a jar
Deploy your Java Lambda function code as a container image using an AWS provided base image or the runtime interface client.
how do you all manage the ssm-user on your ec2 instances?
since we’ve been pushing people more to ssm, we’ve realized things like the ssm-user not being in the docker group, for instance, so it needs to be added. have you all noticed other issues that require additional configuration to make the ssm-user a viable alternative?
this is for things other than console access?
yes, ssm access for sshing into ec2 instances
we use amazon linux, so the agent and users are already there
so we have not seen any issues
but I think you have different access requirements since for us someone that needs the console is = admin so they can do sudo su
I’m of the mind that ssh-ing into an instance is proof of a lack of good monitoring/observability, so I avoid having devs ssh ever
I try to
right ya we can do a sudo su - ec2-user
but it would be nice if ssm-user had the same access as ec2-user
otherwise it’s an extra step in the process
so you have picky customers
lol
does the ec2-user have its own group?
it’s pretty standard on amzn-linux 2. i think the ec2-user is in the ec2-user group
we only add it to the docker group so ec2-user can run docker commands
so to keep it consistent, we figured we could do the same with the ssm-user
i was wondering if there were additional settings. but i cannot think of any.
2020-12-14
Hi All, curious if you know how often Aurora fails over to its replicas?
We’re standing up 1 aurora instance and loading it with a ton of data. we noticed yesterday it restarted twice and stalled once. we didn’t have any replicas, so we had downtime (as expected). But just curious with 2+ replicas, are the failovers seamless from your experiences?
if you are loading tons of data and it is failing over, you are maxing out the instance IO and that is why it is failing over
but most probably you are undersized
enable enhanced monitoring and look at the graphs to understand where the constraint is
maybe you will need to change the way you are loading the data
+1. enhanced monitoring is very good for diagnosing IO issues, the standard cloudwatch metrics are not very helpful at all.
We have 3 aurora clusters, none have failed over automatically in the past year (we have done manual failovers though)
we’re using a 24xlarge
tried to bring up a read replica an hour ago.. and it’s failing to start
segmentation faults.. lol
on the phone with aws support
how much data do you guys have, ballpark?
we’re at about 3TB
honestly, I think we’re oversized
most of the time, we’re at about 6%cpu
but we sometimes hit use cases where we get a spike of requests (on the hour) and there’s been times where it caused ShareLocks
.. then causing CPU to spike to 100% and a handful of issues
it’s nutty
anyways, for more context: We did a RDS 9.6 -> RDS 11.8 migration using pglogical
now we’re in process for a RDS 11.8 -> Aurora 11.8 migration using the same process (with pglogical)
but.. aurora is causing headaches.. luckily it’s before we cutover to aurora, so we haven’t had application level downtime because of it
so 30 minutes prior to aurora failing last night, there was 1 spike on aurora for 60k write iops.. but during initial data sync on this instance, we’ve had a consistent 75k write iops.. with bursts up to 200k write iops.. and no hiccups
the metrics weren’t anything crazy from what I saw prior to failing
wow. we use 8x or 4x with 7tb of data
fwiw, one thing aurora is worse at is single threaded write performance. Which is generally exactly what database restores do. But also use-cases where you have a very hot table
what did aws support say?
they said known bug with 11.8
and to upgrade to 11.9
doing that now, lol
crazy
yeah.. seemed to have fixed things..
yeah.. “crazy” definitely isn’t the word i want to hear during DB migrations/upgrades But yes, i agree with ya
@Alex Jurkiewicz @jose.amengual how many active connections / concurrent queries do you guys run on average? Do you ever get spikes that cause failovers?
mmm selects or update/inserts?
we are heavy readers
and bulk writers
SET charset=utf8
any? either? both? hehe, just trying to get a frame of reference
at peak (~3hrs/day) about 5k SELECT QPS, 1k INSERT/UPDATE/DELETE. 16k write IOPS, 10k read IOPS
would i roughly be able to compare those numbers with commit throughput in cloudwatch?
i guess i need to open vividcortex and get actual numbers
hmm, yea we’re at about 5k QPS not at peak..
around 4k write IOPS, 600 read IOPS, lol
peak is around 8pm PST
i’ll compare then also
we have bursts of 40K write I/O every hour
on postgress
8xl
depends which clusters we have 700K read IOPS
man, it’s nasty how overprovisioned we are
on RDS 9.6 and RDS 11.8 we got these weird situations where we’d get spikes in HTTP requests that then execute:
insert into .. on conflict do update ..
on like 4 different tables. But for some reason, we’d get like 15 ShareLocks across those tables.. causing things to back up and CPU to spike to 100%
(happened last week, prior to cutting over to aurora yesterday)
you need a DBA
2020-12-15
No matter how much automation you have built, no matter how great you are at practicing Infrastructure as Code (IAC), and no matter how successfully you have transitioned from pets to cattle, you sometimes need to interact with your AWS resources at the command line. You might need to check or adjust a configuration file, […]
Sadly no docker….
i played with this and it seems promising
amazon-linux-extras enable docker
?
Sessions cannot currently connect to resources inside of private VPC subnets, but that’s also on the near-term roadmap.
awww
@RB could one do what @Darren Cunningham suggested?
The shell runs as a normal (non-privileged) user, but I can sudo and install packages if necessary
seems like it should work.
I just checked again, I got in and validated that it works:
sudo -- sh -c 'amazon-linux-extras enable docker && yum clean metadata && yum install docker'
[cloudshell-user@ip-10-0-145-19 ~]$ docker --version
Docker version 19.03.13-ce, build 4484c46
omg finally!
google and azure have had this forever
Just an update on this:
So while docker can be installed, it cannot be executed
whomp whomp
Hi all, wrong channel probably, but I’m trying to automate a git push every day (maybe even more frequently). Is there a git command to only add files that changed plus untracked files? I’m using git add . for now, but that won’t work as I have to force push it every time, which kinda defeats the purpose of version control
the force push is the problem… the remote branch has commits your local branch does not. sounds like your workflow needs to fetch and merge before pushing…
Hi @nileshsharma.0311, running low on sleep due to teething baby so excuse my being dense. Does -A from https://git-scm.com/docs/git-add solve your issue?
Thanks @MattyB , yeah that should work , get some rest :)
@loren yeah I can add that to the workflow , thanks for the help , the real problem was the force push
I’d avoid using git add . – if you forget to add files like .DS_Store to your project ignore (or global ignore) then you’re going to start committing those unintentionally
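For the automated job itself, something like this avoids the force push (the branch name is an assumption):
git add -A                            # stage modified + untracked files, honoring .gitignore
git diff --cached --quiet || git commit -m "automated sync $(date -u +%F)"   # commit only if something is staged
git pull --rebase origin main         # pick up remote commits before pushing
git push origin main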
Rich, interactive data visualization service to monitor, analyze, and alarm on metrics, logs, traces and events across cloud and on-premises data sources.
A fully managed Prometheus-compatible monitoring service that makes it easy to monitor containerized applications securely and at scale.
this is an open preview
this might be a dumb question but is it straightforward to create an internet facing alb that has a listener that redirects all traffic to say <https://asdaslkdjaslkdjalskdjalskdjalskdj.com>?
you add a listener rule
and do a Redirect to whatever you want
and you will need a cert that is capable of recognizing whatever name is in the host header
ah ok so that’s what i thought. i was clicking in the ui and was having some difficulty. i should just terraform it and see if it works
thanks
thought i was going a bit crazy
I have done it using the cloudposse modules too
so I know it works
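if anyone wants the CLI flavour, the redirect default action looks more or less like this (ARNs and the target host are placeholders):
aws elbv2 create-listener \
  --load-balancer-arn <alb-arn> \
  --protocol HTTPS --port 443 \
  --certificates CertificateArn=<acm-cert-arn> \
  --default-actions '[{"Type":"redirect","RedirectConfig":{"Protocol":"HTTPS","Host":"example.com","Port":"443","Path":"/#{path}","Query":"#{query}","StatusCode":"HTTP_301"}}]'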
Run controlled chaos engineering experiments on AWS to improve application performance, resiliency, and availability with AWS Fault Injection Simulator
2020-12-16
Hi guys, any way to migrate from RedisLabs to ElastiCache (redis)? I can’t lose data, so I need live sync between them.
There are tools to sync data from one redis cluster to another. They aren’t particularly fast though, so it depends how live/stale your data is.
You might want to consider two alternatives:
- Lose the data. Redis is not reliable anyway so your application can handle this right
- Migrate slowly using application code to connect to both clusters during a migration period
Thx Alex!!!! We would start syncing both Redis clusters, and then suddenly stop the app using RedisLabs and point the app to ElastiCache. When we start the app again, it would be nice to have all Redis data synchronized. Of course we want as little downtime as possible @Alex Jurkiewicz
So how can we do it?
about five ways here: https://www.google.com/search?q=github+redis+sync The Redis protocol is extremely simple, so you will find a lot of options
we also need to make the ElastiCache public in some way, any recommendation?
don’t do that!
why would you need it to be public?
solved, I needed to read in deep, thats all, now I have the live replication working, thx
Although I appreciate your help, let me suggest not writing phrases like “don’t do that!” to a person you don’t know.
cheers
I think it’s ok to say “you are making a big mistake” when someone asks how to do something that’s 99% a bad idea.
If someone asks “how can I remove authentication from my database” I would respond the same way
does anyone have a diagram handy, showing ALL the networking components of a standard architecture using the best practices? something like the following, but with ALL the bits like IGW/RT/ACL/SUBNETS (public/private)/NAT/AZ/ALB/etc.. u get the idea
2020-12-17
Any sshuttle people here?
Id like advice on my aws setup https://github.com/sshuttle/sshuttle/issues/572
Thank you so much for this great app. It's helped me by providing a solid workaround where existing solutions fail today. I was curious about our setup. We have an AWS network load balancer tha…
Hey, I have a short question.
Imagine an organisation has different application environments, let’s say two of them: prod and test. What are some good practices for splitting these environments across your AWS setup?
Do you simply replicate the whole infrastructure and use different tags and so on or is there a “cleaner” way to do so?
Hey :wave: - I always recommend 2 accounts (preferably under an organization. one “dev” and one “prd”. In ITV we called these “ecosystems”, in each “ecosystem” we had a “core” environment for CI/CD, monitoring etc. In “dev” we had several environments like “dev”, “stg”, “test” for development work, these pretty much aligned an environment to a VPC with peering back to the core environment.
Whatever you do, it’s usually best to keep prd in its own account to reduce blast radius. You can see a bit more about the setup (which is common in most places) here: https://www.youtube.com/watch?v=sBIavQW6iO8
Don’t forget the sequel: https://www.youtube.com/watch?v=SeuIty069oI
hehe thanks @tim.j.birkett!
2020-12-18
Anyone know a solid way to monitor if/when RDS has a failover? We have a slightly atypical RDS setup that needs fixing, but in the meantime, I’m looking for a good way to monitor when a failover does happen in an RDS cluster. I looked at CloudWatch but didn’t see a good way to actually monitor that state.
RDS publishes events — https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_Events.html#USER_Events.Messages
Get a notification by email, text message, or a call to an HTTP endpoint when an Amazon RDS event occurs using Amazon SNS.
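A failover-only subscription can be set up roughly like this (topic ARN and name are placeholders; use db-instance as the source type for non-Aurora instances):
aws rds create-event-subscription \
  --subscription-name rds-failover-alerts \
  --sns-topic-arn arn:aws:sns:us-east-1:111111111111:rds-events \
  --source-type db-cluster \
  --event-categories failover \
  --enabled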
Anybody else in us-east-1 seeing issues with NAT Gateway?
they have an outage posted about it, it affected some of our connectivity.
2020-12-19
Hi, how do I migrate RDS instances from one AWS account to another AWS account, with data and all? Thank you.
create a replica cluster from a snapshot in another region, or just create a cluster from a snapshot
2020-12-20
I’m thinking about creating an account dedicated to storage within my AWS Org –
My initial plan is to consolidate the S3 bucket sprawl I have between accounts into a single bucket within that account that provides my Org access to write and each account access to read its files using a prefix condition on the bucket policy.
Bucket sprawl for Service logs: VPC Flow, S3 Access, CloudFront logs, etc – application buckets would remain in their account….not looking to fix that sprawl.
Any words of wisdom or horror stories that I should be aware of before I embark on my quest?
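Roughly what I have in mind for the read side of the bucket policy; the org ID, bucket name and the ${aws:PrincipalAccount} variable are illustrative assumptions, and the AWS services that deliver logs (VPC Flow Logs, CloudFront, S3 access logs) would still need their own service-principal statements for the write side:
cat > central-logs-read.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ListOwnPrefixOnly",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::example-central-logs",
      "Condition": {
        "StringEquals": { "aws:PrincipalOrgID": "o-exampleorgid" },
        "StringLike": { "s3:prefix": "${aws:PrincipalAccount}/*" }
      }
    },
    {
      "Sid": "ReadOwnPrefixOnly",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::example-central-logs/${aws:PrincipalAccount}/*",
      "Condition": {
        "StringEquals": { "aws:PrincipalOrgID": "o-exampleorgid" }
      }
    }
  ]
}
EOF

aws s3api put-bucket-policy --bucket example-central-logs --policy file://central-logs-read.json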
If someone had access to that account, they would have access to all buckets? You will need to guard the policies really well
I think sprawl is a necessary evil if your goal is to limit blast radius, and IaC is supposed to help you to achieve that
Don’t put all your eggs in the same basket
I’m not trying to put all buckets there. at this point I really only want to consolidate all the service logging into a centralized bucket, but I would grow this account into org wide storage as needed.
I understand the perception that is a bigger security risk to put everything into a single account, but at the same time it seems like it would be better to create a single walled garden rather than try to create dozens/hundreds of them.
I’m anticipating that at some point I’m going to be feeding these logs into Athena, Guard Duty, and beyond. Doing so from a single account/bucket sounds a lot more reasonable than creating policies throughout my org to provide such access
happy to have you and others poke holes in my theories though
A centralized logging / audit account is a common multi-account pattern. I don’t think this is unreasonable. You should lock it down as much as possible because it does become a prime target if someone malicious were to get access to your system, but that doesn’t mean this strategy is a bad practice.
Ahhh just logging then is ok, that is how many log aggregators work anyhow
We started moving to using CloudWatch since it fits the bill for 90% of the use cases
pretty much all of my applications are going to CloudWatch but most of the AWS services want to write their logging to S3 which is what I’m planning to consolidate into a single “warehouse”
Agree with the others. For logging - makes sense. For other uses - agree with @jose.amengual, better to spread buckets to limit blast radius. Force users to work through IaC with approved templates. Then scan the environments for something that someone created outside of your IaC process and blast it (if you have the political weight within the company to do that).
I believe in isolation for application buckets but less so for blast radius and more so for preventing mishaps such as somebody edits the wrong file in a bucket believing they’re editing TEST where they’re really editing PROD
2020-12-21
Launching a new EC2 instance. Status Reason: Could not launch On-Demand Instances. InvalidParameterValue - You cannot specify tags for Spot instances requests if there are no Spot instances requests being created by the request. Launching EC2 instance failed.
Bit of a bummer this I guess… scenario:
- Launch Template which configures tag specifications for instance, volume, spot-instances-request.
- ASG which uses the above Launch Template, with a mixed instances policy …
- You then ask for anything except 100% spot on the above ASG.
I just wanted my spot instance requests to get tagged properly. Guess I have to manage without the requests being tagged.
2020-12-22
Hi everyone,
I’m currently trying to create a redirect rule for an ALB Load balancer.
The generic format for the redirect target is as follows:
http://#{host}:#{port}/#{path}?#{query}
I have then changed it to
<http://example.com:80/dGVzdAo=?#{query}>
The issue with the specific path in the above URL, dGVzdAo=, is that it includes an equal sign (=), and AWS reports: Invalid URL scheme
I tried %3D instead, but this also results in an Invalid URL scheme error from AWS.
The web page I am redirecting to does, however, need this equal sign (=) at the end.
Do you guys know of any workaround to somehow get this equal sign into the redirect target of the ALB rule?
~I think if you configure a custom redirect rule with the following values it will be what you need:
- host: example.com
- path: /dGVzdAo
- query: #{query}~
or should I be reading the above literally and it should then be path: /dGVzdAo=? which is what you’re trying?
This looks like it will be problematic for sure
Yes, it does not seem to be possible
Answer from AWS:
At present, the Application Load Balancer’s path-based routing has a limitation that it cannot route or redirect requests to a path that contains “=” or “%” in the URL. I have confirmed the same with the internal Application Load Balancer service team, who have informed me that they are currently aware of this limitation and we also have a pending feature request to enable support for these special characters. I have added your case to this pending feature request to make the service team aware of your requirement and to gain further traction.
Good to know, thanks!
Why base64 encoded string in URL path?
@Tim Birkett URL was externally provided and I can’t do anything about it.
2020-12-23
2020-12-30
I’m thinking about a bucket archival process for my Org, rather than giving users permissions to delete buckets directly (even with MFA) I’m thinking about giving them access to add a resource tag: ArchiveRequested: $userName
My plan would be to create a scheduled job that queries buckets with the tag and moves their contents to an S3 Bucket w/an Archive Policy then delete the origin bucket: s3://<regionalArchiveBucket>/<account>/<originBucketName>/
- Can I restrict access to create tags so that the value has to match their username
- Am I overthinking this? Or is there something already out there that I should be using?
How big is your org? How often are buckets needing to get archived? Does it make sense to make this tag system with all the IAM hackery you’ll need to do instead of just having users open a ticket with you / your infra team?
I would think the latter would solve the problem and then you don’t need to do IAM BS that will be painful.
maybe I’m naive, but I don’t think the IAM hackery would be that painful.
my org isn’t huge, but I don’t have a lot of faith in humans to follow the prescribed bucket archival procedures (even infra team)
Yeah, I think automating the bucket archival procedure is a good thing to automate. I myself would question whether automating the tagging requirements makes sense. That’s the bit I would expect would be painful.
yeah, maybe it would be easier to restrict access to a Lambda and the Lambda be the executor of moving contents of the bucket, etc
Remember that the only api call that’s free is DELETE. Depending on your scale or budget that could be worth thinking about
An S3 > Glacier archival config without proper thought spent a year’s budget in a few days for us once.
much appreciated @mfridh – any details you can share on a successful (budget-conscious) archival config would be awesome too
Avoid lots of small objects… Better to send one DELETE after a year instead of paying AWS for moving things around several times before you’re done with the data.
Not much else to say generally I guess.
If you have to archive, then for sure - do . That’s part of your expected cost then. Just remember this nugget - DELETE is free
Doing nothing may also be cheaper. Maybe “archival” could be as simple as denying access to the bucket but keeping it there for future needs. Of course, you’d still be paying storage costs, but that is true of wherever you store the data (although the cost may differ). It’s also possibly the quickest, simplest and most scalable “archival” procedure.
but I don’t have a lot of faith in humans to follow the prescribed bucket archival procedures (even infra team)
Sure hope none of them are in here…
Caveat: the number of buckets per account has a limit though, so verify that suits your needs too.
Sure hope none of them are in here…
I’m not stating that I’m perfect and that I don’t trust other humans. I am including myself, being that (1) I am human (2) I am fallible.