Discussion related to Amazon Web Services (AWS) Archive: https://archive.sweetops.com/aws/
Apparently the gp2 EBS docs aren’t as precise as one would think.
Patient: 100GiB gp2 EBS volume in multi-az RDS cluster
I’m running update/delete processes over a few million rows on one of our MySQL clusters. Based on the docs, we should be able to burst above the base performance of 300 IOPS (3 IOPS per GiB) for about 20 minutes. Apparently in multi-AZ environments the base performance is doubled, and the credits gathered since late yesterday evening allowed us to burst at an average of 1500 IOPS for over 2 hours.
the spike visible around 8 PM yesterday was a test performed on ~200k rows
for the sake of completeness, this graph comes from a db.m5.large cluster
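For anyone who wants to redo the arithmetic, here is a rough sketch of the gp2 credit model as documented: a full bucket of 5.4 million I/O credits, a 3,000 IOPS burst ceiling, and a baseline of 3 IOPS per GiB with a floor of 100. The `multiplier` parameter is a hypothetical knob for the multi-AZ doubling observed above; it is not something the docs describe.

```python
# Rough gp2 burst-credit arithmetic, using the documented gp2 figures.
GP2_BUCKET_CREDITS = 5_400_000  # I/O credits in a full burst bucket
GP2_BURST_IOPS = 3_000          # burst ceiling
GP2_IOPS_PER_GIB = 3            # baseline rate per GiB
GP2_MIN_BASELINE = 100          # floor on baseline IOPS


def baseline_iops(size_gib: int, multiplier: float = 1.0) -> float:
    """Baseline IOPS. `multiplier` is a hypothetical factor (e.g. 2.0 for
    the multi-AZ doubling reported in this thread)."""
    return max(GP2_MIN_BASELINE, GP2_IOPS_PER_GIB * size_gib) * multiplier


def burst_seconds(size_gib: int, workload_iops: float, multiplier: float = 1.0) -> float:
    """Seconds a full credit bucket lasts under a steady workload."""
    base = baseline_iops(size_gib, multiplier)
    demand = min(workload_iops, GP2_BURST_IOPS)
    if demand <= base:
        return float("inf")  # at or below baseline the bucket never drains
    return GP2_BUCKET_CREDITS / (demand - base)


print(burst_seconds(100, 3000) / 60)       # full burst, documented baseline
print(burst_seconds(100, 1500) / 60)       # 1500 IOPS, documented baseline
print(burst_seconds(100, 1500, 2.0) / 60)  # 1500 IOPS, doubled baseline (assumption)
```

Under this model a full bucket on a 100 GiB volume lasts about 33 minutes at 3,000 IOPS, 75 minutes at 1,500 IOPS, and 100 minutes at 1,500 IOPS if the baseline really is doubled. That is still short of the 2+ hours seen on the graph, so the doubling alone may not explain everything.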
Does anyone have recommendations for aws + okta cli tools?
- https://github.com/oktadeveloper/okta-aws-cli-assume-role: i hate java and want to stay away from this if possible.
- https://github.com/segmentio/aws-okta: looks promising
Just curious if there was something you guys swear by
aws-okta is great
I have only used aws-okta and it works really well
we were just talking about it
hahaha that’s awesome
sweeeeet, good to hear this feedback!
Okta seems pretty expensive. Why the buzz?
Did anyone come across npm memory issues?
Perhaps share some more details of what you are seeing?
Upgrade Node and NPM on CI/CD server. Observe the npm memory issue.
I’m new to node… so I just want to know where I can check the memory issues
i suppose you need to upgrade nodejs and npm to the latest versions, then monitor the build server on CI/CD for memory consumption when it builds the node project with npm
Aurora postgres db seems down on eu-west-1 region
oof if so
14 minutes down
Hey all, looking for some opinions on how to go about creating VPCs in a new aws account of mine. I recently set up an ECS cluster with fargate using the ‘get started’ feature in the console and it did a lot of the heavy lifting for me. However, I’m trying to automate some of this using Terraform, so I’ll need to create some VPCs for the ECS cluster. What is the simplest secure setup? One public subnet, one private subnet, with the cluster in the private subnet and an ALB in the public subnet?
set it up in a way that you can easily change it to multi-az (one subnet per az for every type of subnet - public, private, db). it doesn’t mean you will use all of them but if the requirements change you will have them already available
can you give more detail?
I’ve a vpc with a CIDR 10.0.0.0/8
and then every subnet in every availability zone uses a /24 from that CIDR
i’ve a total of 8 subnets - public and private for every availability zone
public have outgoing traffic routed via nat gateway
private have only routing for the 10.0.0.0/8
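A minimal sketch of that layout (one /24 per subnet type per availability zone), using only Python’s `ipaddress` module. The AZ names and tier names are placeholders, and the sketch assumes a 10.0.0.0/16 VPC range because AWS limits a VPC CIDR block to between /16 and /28.

```python
import ipaddress

# Assumption: a /16 VPC range (AWS allows VPC CIDR blocks from /16 to /28).
VPC_CIDR = ipaddress.ip_network("10.0.0.0/16")
# Hypothetical AZ names; substitute the zones of your own region.
AZS = ["eu-west-1a", "eu-west-1b", "eu-west-1c"]
TIERS = ["public", "private", "db"]


def plan_subnets(vpc=VPC_CIDR, azs=AZS, tiers=TIERS, prefix=24):
    """Assign one /24 per (tier, AZ) pair, in order, from the VPC range."""
    blocks = vpc.subnets(new_prefix=prefix)  # generator of consecutive /24 blocks
    pairs = [(tier, az) for tier in tiers for az in azs]
    # zip stops once every (tier, AZ) pair has a block
    return [(tier, az, str(block)) for (tier, az), block in zip(pairs, blocks)]


for tier, az, cidr in plan_subnets():
    print(f"{tier:8} {az:12} {cidr}")
```

Reserving a block for every tier in every AZ up front costs nothing and means a later move to multi-az does not require renumbering.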
that makes most sense for my cluster
can provide more info if needed, but really just looking to get some general guidance on VPC setup
See this module. It does the setup the way Maciek describes. https://github.com/terraform-aws-modules/terraform-aws-vpc
Is it possible to disable root login on AWS accounts that are connected to an Organization?
I don’t think it is, which is why it’s very important to secure that root account if you created the account programmatically - anyone with access to the email could take over the account
If it’s one you joined that used to be an individual account, I’d hope that access is already secure
anyone else experienced rds choking around 1h ago?
we found our pgsql rds instance stopped resolving hostnames
2019-08-25 13:03:47 UTC::[unknown]@[unknown]::WARNING: pg_getnameinfo_all() failed: Temporary failure in name resolution
ramping up db connections and killing our application between 1420 CET
I wonder whether it was RDS general or only our cluster
What’s your go-to way of providing external devs/contractors (outside of your corporate AD) access to your AWS accounts? IAM users on Bastion? Cognito?
What kind of access you have in mind? Access to accounts or access to resources (ec2?) on accounts?
Console & CLI access.
I imagine it would be something like:
- Give [solution] access to consultant
- Consultant uses [solution] to gain access to either console or gain temporary access id/key pair
- Consultant can then use console or CLI
Although we only wish to give them explicit access to our Bastion/Security account, and they then use the credentials above to sts:AssumeRole into sub-accounts
Isn’t IAM sufficient for that? I would personally go with it but can’t say I’m an expert on the subject
As a consultant, it depends on the client. Most of the time we get an IAM user in a sharedservices account, then assume roles cross-account. Others will give us an AD account, then SAML / SSO to an AWS role.
Yeh, it seems that giving consultants limited users on our AD is the favoured approach. Our tech services are looking into it now.. it just doesn’t seem like something that should be managed by Terraform!
Could build the roles out that they would assume at least. For our managed services side, for some clients the client (or us) creates a role in each account that trusts one of our aws accounts and a specific role in that account. Then we can manage the users who have access to the client’s AWS account without needing to bother them.
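The cross-account trust described here comes down to the trust policy on the role in each client account. A minimal sketch, with a made-up account ID and role name:

```python
import json


def trust_policy(trusted_role_arn: str) -> dict:
    """Trust policy that lets one specific role in another account
    call sts:AssumeRole on the role this policy is attached to."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": trusted_role_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


# Hypothetical account ID and role name, for illustration only.
policy = trust_policy("arn:aws:iam::111111111111:role/consultants")
print(json.dumps(policy, indent=2))
```

Because the trust points at a role rather than at individual users, whoever owns the trusted account decides who may use that role, and the client account never has to change.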
I think it depends on what they are hired to do for the company.
Think about this from the company perspective: they want to eliminate risk, liability, and exposure, embarrassment, while at the same time accelerate development and maintain knowledge transfer.
Think about this from the perspective of the developer. They want to operate as unencumbered as possible. They want to quickly prove their worth and get more work.
It goes without saying that IAM roles assumed into accounts is one of the mechanisms that will be used.
If the contractor was hired to oversee the uptime of production systems, I find it hard to justify anything other than administrator-level roles in the accounts they are responsible for.
If trust is an issue, then don’t hire.
If the contractor is hired to build out some form of automation, then there should be a sandbox account.
The deliverable should include “infrastructure as code” or other kinds of automation scripts.
I’ll address the latter. Give them a sandbox with administrator level access. They can do everything/anything (within reason) in this account. It can even be a sandbox account specifically for contractors.
They’ll check their work into source control with documentation on how to use it.
The company is ultimately responsible for operating it and “owning it”, so this forces knowledge transfer.
The company and its staff must know how to successfully deploy and operate the deliverable.
Ideally, you’ve rolled out a GitOps continuous delivery style platform for infrastructure automation.
The developer can now open PRs against those environments (without affecting them). The pending changes can be viewed by anyone.
Once approved, those changes are applied -> rolled out.
Regardless of this being a contractor or employee, etc - this is a great workflow. You can radically reduce the number of people who need access at all to AWS and instead focus on git-driven operations with total visibility and oversight.
Exactly the answer I anticipated from you, Erik. Glad I remembered well.
Getting accurate time series forecasts from historical data is not an easy task. Last year at re:Invent we introduced , a fully managed service that requires no experience in machine learning to deliver highly accurate forecasts. I’m excited to share that is generally available today! With , there are no servers to provision. You only need to provide […]
Better way to update an ecs task, with only one container. I’m receiving this error:
The closest matching (container-instance 5df0ce11-3243-47f7-b18e-2cfc28397f11) is already using a port required by your task
@Daniel Minella if you use the host port 0 in your task definition, ECS will use dynamic port allocation, which works well together with an ALB
How will ECS handle that? Does it understand that traffic from the LB on port 8080 has to be forwarded to any container inside the cluster, on that port?
ECS will manage that under the hood for you. https://aws.amazon.com/premiumsupport/knowledge-center/dynamic-port-mapping-ecs/
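As a rough illustration of what “host port 0” looks like, here is a container definition fragment built in Python. The container name and image are placeholders, and the sketch assumes the bridge network mode on EC2 container instances, where ECS picks a free host port and registers it with the ALB target group.

```python
import json


def container_definition(name: str, image: str, container_port: int) -> dict:
    """Container definition fragment using dynamic host port mapping."""
    return {
        "name": name,
        "image": image,
        "essential": True,
        "portMappings": [
            {
                "containerPort": container_port,  # port the app listens on
                "hostPort": 0,                    # 0 = let ECS choose a free host port
                "protocol": "tcp",
            }
        ],
    }


# Hypothetical name and image, for illustration only.
definition = container_definition("web", "example/web:latest", 8080)
print(json.dumps({"containerDefinitions": [definition]}, indent=2))
```

With a fixed host port, a second copy of the task cannot land on the same instance, which is what the “already using a port required by your task” error is reporting during a rolling update.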
We made it! Thank you again!
Hi, I have multiple eks clusters across multiple accounts and I would like to give access to all of them to an S3 bucket in one of the accounts using the IAM profile of the instance nodes, but can’t seem to get it right, any tips on how to get this working?
You need two pieces to this:
- On the bucket, you need to give permissions such as s3:GetObject, as well as add the source roles to the Principals section (assume-role policy document)
- On the roles that need access to that bucket, you then have to give the permissions for s3 against that resource
I do this all the time. The specifics with EKS I can’t help with, but I’d imagine the cluster members have a role they use…
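A sketch of the two pieces, with placeholder bucket and role names. The role ARNs would be whatever roles the EKS worker nodes in each account actually use.

```python
import json


def bucket_policy(bucket: str, role_arns: list) -> dict:
    """Resource policy on the bucket: names the remote roles as principals."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": role_arns},
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }


def role_policy(bucket: str) -> dict:
    """Identity policy attached to each node role in the other accounts."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }


# Hypothetical bucket name, account IDs and role names.
roles = [
    "arn:aws:iam::111111111111:role/eks-nodes",
    "arn:aws:iam::222222222222:role/eks-nodes",
]
print(json.dumps(bucket_policy("shared-artifacts", roles), indent=2))
print(json.dumps(role_policy("shared-artifacts"), indent=2))
```

Cross-account access needs both halves: the bucket policy allows the foreign roles in, and each role’s own policy allows it to reach out to the bucket.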
Good example doc here:
Nice, thanks for the help @Alex Siegman!
How can I run this in a task definition?
docker run -d --name sentry-cron -e SENTRY_SECRET_KEY='<secret-key>' --link sentry-postgres:postgres --link sentry-redis:redis sentry run cron
My concern is about run cron. Is it a command? Something like entrypoint: run cron as command?
run cron would be a command. it would pass through whatever entrypoint script is defined in the Dockerfile
Thank you! I’ll try
run, cron at command works for me
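For reference, a small sketch of how the trailing arguments of that docker run line map onto a container definition. The helper name is made up; `command` and `environment` are the container definition fields that correspond to the arguments after the image name and to `-e`.

```python
import json
import shlex


def container_from_docker_args(name, image, trailing_args, env=None):
    """Build a container definition fragment from docker-run style pieces.

    `trailing_args` is whatever follows the image name on the docker run
    line; it becomes the `command` list, one element per argument.
    """
    return {
        "name": name,
        "image": image,
        "command": shlex.split(trailing_args),
        "environment": [{"name": k, "value": v} for k, v in (env or {}).items()],
    }


container = container_from_docker_args(
    "sentry-cron",
    "sentry",
    "run cron",
    env={"SENTRY_SECRET_KEY": "<secret-key>"},
)
print(json.dumps(container, indent=2))
```

The command ends up as a list of two strings, run and cron, which is passed to whatever entrypoint the image defines.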
thanks @Nelson Jeppesen for the added context
Interesting, I thought negative ttl was the last value in the data of the SOA. Are you saying negative ttl is reflected by the SOA ttl directly?
dig abc.com soa +short
ns-318.awsdns-39.com. awsdns-hostmaster.amazon.com. 1 7200 900 1209600 86400
in this example, i thought 86400 was the negative ttl, but that’s not the TTL of the SOA itself
unless I’m mixed up
Just looked it up, negative ttl is the lower of either the TTL of the SOA _OR_ the last value, 86400 in the above example
TLDR; lazy me dropped the TTL of the SOA to 60s; thanks!
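The rule just described (negative TTL is the lower of the SOA record’s own TTL and the last field of its data) can be written out as a small sketch. The 900-second record TTL below is an assumed value for illustration; the dig +short output above does not show it.

```python
from typing import NamedTuple


class Soa(NamedTuple):
    mname: str
    rname: str
    serial: int
    refresh: int
    retry: int
    expire: int
    minimum: int  # last field of the SOA data


def parse_soa(rdata: str) -> Soa:
    """Parse the data portion of an SOA record as printed by dig +short."""
    mname, rname, *numbers = rdata.split()
    return Soa(mname, rname, *(int(n) for n in numbers))


def negative_ttl(soa_record_ttl: int, soa: Soa) -> int:
    """Negative-caching TTL: the lower of the SOA record TTL and the minimum field."""
    return min(soa_record_ttl, soa.minimum)


soa = parse_soa("ns-318.awsdns-39.com. awsdns-hostmaster.amazon.com. 1 7200 900 1209600 86400")
print(negative_ttl(900, soa))  # assumed record TTL of 900 s
print(negative_ttl(60, soa))   # after dropping the SOA TTL to 60 s
```

With the minimum field at 86400, the record TTL is the smaller number in both cases, which is why lowering the SOA TTL to 60s is enough.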
Hello, what is the main benefit of shortening SOA TTL to 60 secs? I noticed that in your best practices docs.
so in highly elastic environments which are changing or subject to change at any time, a long TTL is a sure fire way to “force” an outage.
perhaps the most important TTL is that of the SOA record. by default it’s something like 15m.
SOA (start of authority) works a little bit like a “404” page for DNS (metaphor). when client requests a DNS record for something and nothing is found, the response will be negatively cached for the duration of the SOA.
so if your app looks up a DNS record (e.g. for service discovery) and it’s not found, it will cache that for 15m. Suppose after 1m that service is now online. Your app will still cache that failure for 14m causing a prolonged outage.
a DNS lookup every request will add up, especially in busy apps. a DNS lookup every 60 seconds is a rounding error.
Anyone running AWS Client VPN here? We’re having issues just starting an endpoint even – stuck in Associating/pending state for hours
Thanks for the rec - I do have some pritunl experience and it was way smoother of an experience than AWS Client VPN has been - going to propose that
I’m new to AWS… and I make a lot of mistakes running Terraform, so I end up with errors like:
aws_s3_bucket.build_cache: Error creating S3 bucket: BucketAlreadyOwnedByYou: Your previous request to create the named bucket succeeded and you already own it. status code: 409, request id: 54C0B6BA
is there a switch like
that will back off if it already exists.
If the bucket is already in AWS but not in the state file, use terraform import
It seems that I cannot import the resource, but it also says the resource is not created because it already exists.
That guid is not a resource id
It’s a request id from api call
Go to AWS console and find the resource id
If the bucket is in the state file but not in AWS for any reason, use terraform state rm
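The two cases in this thread can be summed up as a small decision helper. The function name and the bucket name are made up; the commands it returns are the standard `terraform import ADDRESS ID` and `terraform state rm ADDRESS` forms, and for an S3 bucket the import ID is assumed to be the bucket name.

```python
def reconcile_command(address, resource_id, in_state, in_aws):
    """Return the terraform command that brings state and AWS back in sync,
    or None when they already agree."""
    if in_aws and not in_state:
        # Exists in AWS but Terraform has forgotten it: adopt it.
        return ["terraform", "import", address, resource_id]
    if in_state and not in_aws:
        # Terraform remembers it but it is gone from AWS: forget it.
        return ["terraform", "state", "rm", address]
    return None


# Hypothetical bucket name; the resource address comes from the error above.
cmd = reconcile_command("aws_s3_bucket.build_cache", "my-build-cache", in_state=False, in_aws=True)
print(" ".join(cmd))
```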
I think I remember reading about that in…. nowhere ! How very cool.
oh sorry I understand now
yea :slightly_smiling_face: because of rm -rf *tfstate* you see the error you see
the fruits of rm -rf *tfstate*
Anyone had issues with Firehose > ElasticSearch 6.5 ?
the ES cluster returned a JsonParseException. Ensure that the data being put is valid.
nope. we’re at es5 still for our logging.
@Maciek Strömich Are you Firehose > Lambda processor > ES ?
nope. I’m emulating logstash structure in the logs and pass it directly via firehose to es
Is this data from CloudWatch Logs?
nope. we dropped cwl support because it was a pain to send it to es via firehose
hmm, OK thx
we’re not going to contribute back to rsyslog but we created our solution based on https://github.com/rsyslog/rsyslog/blob/master/plugins/external/solr/rsyslog_solr.py, but instead of working directly with es we push everything to firehose using boto3 with the same structure as our app logs. way cheaper compared to cwl as well.
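A minimal sketch of that idea, not the actual plugin: build a logstash-style JSON document and hand it to Firehose via `put_record`. The field set, stream name and helper names are assumptions. The client is passed in, so in real use it would be `boto3.client("firehose")`.

```python
import json
from datetime import datetime, timezone


def logstash_event(message: str, **fields) -> dict:
    """Build a logstash-style document; extra keyword fields are merged in."""
    event = {
        "@timestamp": datetime.now(timezone.utc).isoformat(),
        "@version": "1",
        "message": message,
    }
    event.update(fields)
    return event


def send_event(firehose_client, stream_name: str, event: dict):
    """Send one event as a single JSON document to a Firehose delivery stream."""
    payload = json.dumps(event).encode("utf-8")
    return firehose_client.put_record(
        DeliveryStreamName=stream_name,
        Record={"Data": payload},
    )


# Real use (requires boto3 and AWS credentials; stream name is hypothetical):
#   import boto3
#   send_event(boto3.client("firehose"), "app-logs", logstash_event("hello", level="INFO"))
```

Keeping each record a single valid JSON document matters here, since the Elasticsearch destination parses every record as one.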
anyone ever turn on aws s3 transfer acceleration?