
Troubleshooting External Database

info
This feature is available starting from Platform version v4.8.0.

Platform pods are crashlooping

Check pod logs for errors:

kubectl --context arn:aws:eks:us-east-1:123456789012:cluster/platform-ha \
logs -n vcluster-platform -l app=loft --tail=50

Common causes:

  • Database connectivity errors: Verify that VPC peering is active, routes exist on the correct route tables (including public route tables if nodes are in public subnets), and the database security group allows port 3306 from the cluster VPC CIDR.
  • IAM authentication failures: Confirm the Pod Identity association exists, the Pod Identity Agent add-on is ACTIVE, and the RDSIAMAuthKine IAM policy contains the correct DbiResourceId.
  • Resource exhaustion: Check node resource usage with kubectl top nodes and pod resource usage with kubectl top pods -n vcluster-platform.
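The Pod Identity checks in the second bullet can be run from the AWS CLI; a sketch, assuming the cluster name `platform-ha` and region `us-east-1` from the kubectl context used on this page:

```shell
# Confirm the Pod Identity Agent add-on is installed and ACTIVE
aws eks describe-addon \
  --cluster-name platform-ha \
  --region us-east-1 \
  --addon-name eks-pod-identity-agent \
  --query 'addon.status'

# List Pod Identity associations scoped to the platform namespace;
# one should exist for the service account the platform pods use
aws eks list-pod-identity-associations \
  --cluster-name platform-ha \
  --region us-east-1 \
  --namespace vcluster-platform
```

If the add-on is missing or DEGRADED, IAM database authentication fails even when the RDSIAMAuthKine policy is correct.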

Leader election is stuck

If the platform is unresponsive but pods are running, the leader lease may be stuck. Check the lease status:

kubectl --context arn:aws:eks:us-east-1:123456789012:cluster/platform-ha \
get lease -n vcluster-platform -o wide

If the lease holder is a pod that no longer exists, restart the deployment to force a new leader election:

kubectl --context arn:aws:eks:us-east-1:123456789012:cluster/platform-ha \
rollout restart deployment/loft -n vcluster-platform
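After the restart, confirm that the rollout completed and that the lease holder now matches a running pod:

```shell
# Wait for the new pods to become ready
kubectl --context arn:aws:eks:us-east-1:123456789012:cluster/platform-ha \
  rollout status deployment/loft -n vcluster-platform

# The holderIdentity of each lease should name a currently running pod
kubectl --context arn:aws:eks:us-east-1:123456789012:cluster/platform-ha \
  get lease -n vcluster-platform \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.holderIdentity}{"\n"}{end}'
```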

ALB health checks failing

Check the ingress status:

kubectl --context arn:aws:eks:us-east-1:123456789012:cluster/platform-ha \
get ingress -n vcluster-platform loft

Verify the ALB target group health in the AWS console or with:

aws elbv2 describe-target-health \
--target-group-arn arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/k8s-vcluster-loft-xxxxxxxxxx/xxxxxxxxxx

Common causes:

  • The platform pods are not ready (check kubectl get pods -n vcluster-platform).
  • The /healthz endpoint is not reachable from the ALB (check security groups and the ALB target type matches the service configuration).
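One way to check the second bullet, assuming the ingress is named `loft` and is managed by the AWS Load Balancer Controller:

```shell
# The target-type annotation (instance vs ip) must match how the
# service routes traffic to the platform pods
kubectl --context arn:aws:eks:us-east-1:123456789012:cluster/platform-ha \
  get ingress loft -n vcluster-platform \
  -o jsonpath='{.metadata.annotations.alb\.ingress\.kubernetes\.io/target-type}'

# Endpoints are only populated for pods that pass readiness checks
kubectl --context arn:aws:eks:us-east-1:123456789012:cluster/platform-ha \
  get endpoints loft -n vcluster-platform
```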

Database connection timeouts

If pods start but log database connection errors:

  1. Verify VPC peering is active:
aws ec2 describe-vpc-peering-connections \
--filters "Name=status-code,Values=active" \
--region us-east-1 \
--query 'VpcPeeringConnections[].{ID:VpcPeeringConnectionId,Requester:RequesterVpcInfo.VpcId,Accepter:AccepterVpcInfo.VpcId}'
  2. Verify routes from the EKS VPC to the database VPC exist:
aws ec2 describe-route-tables \
--filters "Name=vpc-id,Values=vpc-yyyyyyyyy" \
--region us-east-1 \
--query 'RouteTables[].Routes[?DestinationCidrBlock==`10.1.0.0/16`]'
  3. Test connectivity from within the cluster:
kubectl run dbtest --image=busybox --restart=Never --rm -it -- \
nc -zv mariadb-ha-platform.xxxxxxxxxxxx.us-east-1.rds.amazonaws.com 3306
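If the `nc` test times out even though peering and routes look correct, inspect the database security group's inbound rules; a sketch, where `sg-zzzzzzzzz` is a placeholder for your database security group ID:

```shell
# List inbound rules covering port 3306; the allowed source should
# include the EKS cluster VPC CIDR
aws ec2 describe-security-groups \
  --group-ids sg-zzzzzzzzz \
  --region us-east-1 \
  --query 'SecurityGroups[].IpPermissions[?ToPort==`3306`]'
```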