We are facing strange behavior from the Weave CNI in our AWS EKS cluster. For some reason our CoreDNS is getting NXDOMAIN responses, which means that the pods are not able to resolve the names of the services in the cluster. After a long investigation we found that the only way to fix the DNS issue is to restart all Weave CNI pods.
Has anyone encountered the same behavior? Thanks.
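For anyone who wants to reproduce the workaround described above, the restart and a quick in-cluster DNS check can be sketched as shell functions. This is only a sketch under assumptions: it assumes Weave Net runs as a DaemonSet named weave-net in the kube-system namespace (the name used by the standard Weave Net manifest, yours may differ), and the pod name dns-check and the busybox:1.28 image are illustrative choices, not something from the original report.

```shell
# Assumption: Weave Net runs as DaemonSet "weave-net" in "kube-system".
# Restart every Weave pod through a rolling restart of the DaemonSet.
restart_weave() {
    kubectl -n kube-system rollout restart daemonset/weave-net
}

# Block until the restarted Weave pods report ready again.
wait_for_weave() {
    kubectl -n kube-system rollout status daemonset/weave-net
}

# Resolve a service name from a throwaway pod (hypothetical name "dns-check").
# $1 = name to resolve, defaults to kubernetes.default
check_cluster_dns() {
    kubectl run dns-check --rm -i --restart=Never --image=busybox:1.28 \
        -- nslookup "${1:-kubernetes.default}"
}
```

Running check_cluster_dns before and after restart_weave should show whether the NXDOMAIN answers really go away with the restart.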
Join us at the Microsoft Reactor next week for a technical demonstration of provisioning AKS with Terraform and then deploying microservices with Helm!
We have deployed Alpine images on EKS Fargate nodes, and have also associated a service account with an IAM role which has access to DynamoDB and some other services. When deploying the containers, we can see that AWS has automatically set these env vars on all containers
But if we execute these commands with the CLI:
aws sts get-caller-identity
aws dynamodb list-tables
the commands simply hang and do not return any results.
We have followed the docs on setting up IAM roles for EKS (k8s) service accounts. Is there anything more we need to do to check the connectivity from the containers to DynamoDB, for example? (Please note that we can access DynamoDB from Lambda, for instance, and an endpoint exists for the necessary services.)
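Before looking at the network, it may be worth confirming inside the container that the service account wiring is complete. A minimal sketch, assuming the two variables in question are AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE (the names used in the assume-role command later in this thread); the function name check_irsa_env is hypothetical:

```shell
# Verify that the web identity variables are set and that the projected
# token file is readable. Returns non-zero on the first missing piece.
check_irsa_env() {
    if [ -z "${AWS_ROLE_ARN:-}" ]; then
        echo "AWS_ROLE_ARN is not set" >&2
        return 1
    fi
    if [ -z "${AWS_WEB_IDENTITY_TOKEN_FILE:-}" ]; then
        echo "AWS_WEB_IDENTITY_TOKEN_FILE is not set" >&2
        return 1
    fi
    if [ ! -r "$AWS_WEB_IDENTITY_TOKEN_FILE" ]; then
        echo "token file is not readable: $AWS_WEB_IDENTITY_TOKEN_FILE" >&2
        return 1
    fi
    echo "IRSA environment looks complete"
}
```

If this passes and the CLI still hangs, the problem is more likely network reachability than IAM configuration.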
When I execute this on the pod:
aws sts assume-role-with-web-identity \
  --role-arn $AWS_ROLE_ARN \
  --role-session-name mh9test \
  --web-identity-token file://$AWS_WEB_IDENTITY_TOKEN_FILE \
  --duration-seconds 1000
I get this error: Connect timeout on endpoint URL: "sts.amazonaws.com", which is strange because the VPC endpoint is sts.eu-central-1.amazonaws.com. I also cannot ping endpoint addresses such as ec2.eu-central-1.amazonaws.com.
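The timeout against sts.amazonaws.com suggests the CLI is calling the global STS endpoint rather than the regional one that the VPC endpoint serves. A sketch of forcing the regional endpoint explicitly, with a short connect timeout so a routing problem fails fast instead of hanging. The function names are hypothetical; --region, --endpoint-url and --cli-connect-timeout are standard AWS CLI global options:

```shell
# Build the regional STS endpoint URL for a region such as eu-central-1.
sts_regional_url() {
    printf 'https://sts.%s.amazonaws.com\n' "$1"
}

# Call STS through the regional endpoint.
# $1 = region; gives up after 5 seconds if no connection can be made.
sts_whoami_regional() {
    region="$1"
    aws sts get-caller-identity \
        --region "$region" \
        --endpoint-url "$(sts_regional_url "$region")" \
        --cli-connect-timeout 5
}
```

Calling sts_whoami_regional eu-central-1 from the pod would exercise the VPC endpoint directly. Exporting AWS_STS_REGIONAL_ENDPOINTS=regional together with AWS_DEFAULT_REGION should have a similar effect without extra flags. Note also that a failing ping is not necessarily meaningful here, since AWS service endpoints generally do not seem to answer ICMP; an HTTPS call like the one above is a more reliable test.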
First thing to check would be the route tables to make sure any VPC endpoints are actually used.
I’m guessing you don’t have any NAT gateways in the VPC, otherwise the containers should use the public internet to reach the endpoint.
Second thing to check is security groups and network ACLs.
So for the containers to reach the VPC endpoints (in the case of no internet gateway), we would have to set up a NAT gateway?
Hi, I think he means adding some VPC endpoints so that you do not leave the AWS network.
VPC endpoints were already in place. Still, I was not able to ping these VPC endpoints by private DNS name from within a container…
Usually when you can’t ping AWS services from within the network you either have the security groups (outbound) or network ACLs (both directions) set up incorrectly or (because you don’t have a NAT gateway) the VPC endpoints are not correctly attached to the route tables.
One final possibility is that you don't have DNS hostnames enabled in the VPC.
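The checks suggested in this thread can be scripted roughly as follows. This is a sketch, not a complete diagnosis: it must run from a machine whose credentials can call the EC2 API (likely not the affected pod), the function names are hypothetical, and the VPC ID is whatever yours is.

```shell
# Print the value of a VPC DNS attribute.
# $1 = VPC ID, $2 = enableDnsHostnames or enableDnsSupport
vpc_dns_attribute() {
    case "$2" in
        enableDnsHostnames) key=EnableDnsHostnames ;;
        enableDnsSupport)   key=EnableDnsSupport ;;
        *) echo "unknown attribute: $2" >&2; return 2 ;;
    esac
    aws ec2 describe-vpc-attribute --vpc-id "$1" --attribute "$2" \
        --query "$key.Value" --output text
}

# List the endpoints in a VPC with their type, private DNS flag and state.
# $1 = VPC ID
list_vpc_endpoints() {
    aws ec2 describe-vpc-endpoints \
        --filters "Name=vpc-id,Values=$1" \
        --query 'VpcEndpoints[].[ServiceName,VpcEndpointType,PrivateDnsEnabled,State]' \
        --output table
}
```

Both DNS attributes generally need to be enabled for private DNS names of interface endpoints to resolve inside the VPC. The endpoint list also shows which endpoints are of the Gateway type, which are the ones that rely on route table entries, and which are of the Interface type, where security groups matter more.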
Single-node K8S cluster for testing on GCP? What would you all think is the best option? Throw minikube on it?
k3s works fine
didn’t even think about that! Thanks!
If you are using Ubuntu, it offers a production-ready Kubernetes distribution called MicroK8s, with a ton of official add-ons supported. K3s is also fine for this.
sudo snap install microk8s --classic --channel=1.19
MicroK8s is the simplest production-grade upstream K8s. Lightweight and focused. Single command install on Linux, Windows and macOS. Made for devops, great for edge, appliances and IoT. Full high availability Kubernetes with autonomous clusters.
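After the snap install, a quick smoke test might look like the sketch below. The function name is hypothetical, and the commands may need sudo unless your user has been added to the microk8s group.

```shell
# Wait for MicroK8s to report ready, then list the (single) node.
microk8s_smoke_test() {
    microk8s status --wait-ready && microk8s kubectl get nodes
}
```

On a fresh single-node test VM, one node listed as Ready is usually all you need before deploying test workloads.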