AWS Service Networking Guide: EKS, ECS, Lambda, RDS, and More
Deploy AWS services with confidence. Learn networking requirements for EKS, ECS Fargate, Lambda, RDS, ElastiCache, ALB, API Gateway, and MSK with production-ready Terraform examples.
AWS Service Networking: Practical Deployment Guide
This is a companion guide to AWS Networking Best Practices. While that article covers VPC fundamentals, this one focuses on the practical networking requirements for deploying specific AWS services.
In this series
- Part 1: AWS Networking Best Practices
- Part 1a: AWS Service Networking Guide (you are here)
- Part 2: Azure Networking Best Practices →
- Part 3: GCP Networking Best Practices →
Understanding VPC fundamentals is essential, but the real challenge comes when deploying actual AWS services. Each service has unique networking requirements that can trip up even experienced engineers. This guide covers what you need to know to successfully deploy the most common AWS services.
Quick Reference: Networking by Service
| Service | Subnet Type | Key Requirements |
|---|---|---|
| EKS | Private (large /19+) | Subnet tags, CNI IP planning, pod security groups |
| ECS Fargate | Private | awsvpc mode, ALB integration, service discovery |
| Lambda | Private (only if needed) | NAT for internet, VPC endpoints for AWS services |
| RDS | Database (isolated) | Subnet group across 2+ AZs, no public access |
| ElastiCache | Database (isolated) | Subnet group, cluster mode considerations |
| ALB/NLB | Public or Private | Cross-zone, target group networking |
| API Gateway | N/A (managed) | VPC Links for private integrations |
| MSK (Kafka) | Private | 3 AZs recommended, high throughput planning |
EKS (Elastic Kubernetes Service)
EKS networking is one of the most complex topics in AWS. The key thing to understand is that the AWS VPC CNI assigns real VPC IP addresses to each pod—this is different from Kubernetes distributions that use overlay networks.
Why this matters: A node with 30 pods needs 30+ IP addresses. If you’re running 100 nodes with 30 pods each, that’s 3,000+ IPs just for pods. Plan your subnets accordingly.
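To make the arithmetic concrete, here is a minimal sketch using the standard VPC CNI max-pods formula (max pods = ENIs * (IPv4 addresses per ENI - 1) + 2); the ENI figures are the published limits for an m5.large:

```terraform
# Rough IP capacity check using the VPC CNI max-pods formula:
#   max pods per node = ENIs * (IPv4 addresses per ENI - 1) + 2
locals {
  enis_per_node = 3  # m5.large supports 3 ENIs
  ips_per_eni   = 10 # m5.large supports 10 IPv4 addresses per ENI

  max_pods_per_node = local.enis_per_node * (local.ips_per_eni - 1) + 2 # = 29

  # Cluster-wide pod IP demand from the example above
  node_count     = 100
  pods_per_node  = 30
  pod_ips_needed = local.node_count * local.pods_per_node # = 3,000
}
```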
Critical: Subnet Tags for EKS
EKS uses specific tags to auto-discover subnets for load balancer placement. Without these tags, your Kubernetes services won’t be able to provision external or internal load balancers. The kubernetes.io/role/elb tag tells EKS which subnets can host internet-facing load balancers, while kubernetes.io/role/internal-elb identifies subnets for internal load balancers. Additionally, the cluster tag helps EKS identify which subnets belong to which cluster when you have multiple clusters in the same VPC.
EKS VPC with Required Subnet Tags

```terraform
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0"

  name = "eks-vpc"
  cidr = "10.0.0.0/16"

  azs = ["us-east-1a", "us-east-1b", "us-east-1c"]

  # Use /19 for private subnets (8,192 IPs each) - EKS pods consume IPs fast
  private_subnets = ["10.0.0.0/19", "10.0.32.0/19", "10.0.64.0/19"]
  public_subnets  = ["10.0.96.0/24", "10.0.97.0/24", "10.0.98.0/24"]

  enable_nat_gateway     = true
  single_nat_gateway     = false
  one_nat_gateway_per_az = true

  # REQUIRED: Tags for EKS subnet auto-discovery
  public_subnet_tags = {
    "kubernetes.io/role/elb" = 1 # Public LBs (internet-facing)
  }

  private_subnet_tags = {
    "kubernetes.io/role/internal-elb" = 1 # Internal LBs
  }

  tags = {
    "kubernetes.io/cluster/my-cluster" = "shared"
  }
}
```

EKS Security Groups
EKS requires careful security group configuration to allow communication between the control plane and worker nodes. The control plane needs to reach nodes on ports 1025-65535 for kubelet communication, while nodes need to reach the control plane API on port 443. Additionally, nodes must be able to communicate with each other for pod-to-pod networking. The configuration below sets up these security groups with the minimum required rules.
EKS Cluster and Node Security Groups

```terraform
# Cluster security group (control plane to nodes)
resource "aws_security_group" "eks_cluster" {
  name        = "eks-cluster-sg"
  description = "EKS cluster security group"
  vpc_id      = module.vpc.vpc_id

  # Allow nodes to communicate with control plane
  ingress {
    description     = "Nodes to cluster API"
    from_port       = 443
    to_port         = 443
    protocol        = "tcp"
    security_groups = [aws_security_group.eks_nodes.id]
  }

  egress {
    description     = "Cluster to nodes"
    from_port       = 1025
    to_port         = 65535
    protocol        = "tcp"
    security_groups = [aws_security_group.eks_nodes.id]
  }

  tags = { Name = "eks-cluster-sg" }
}

# Node security group
resource "aws_security_group" "eks_nodes" {
  name        = "eks-nodes-sg"
  description = "EKS node security group"
  vpc_id      = module.vpc.vpc_id

  # Node to node communication (required for pod networking)
  ingress {
    description = "Node to node"
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    self        = true
  }

  egress {
    description = "All outbound"
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = { Name = "eks-nodes-sg" }
}

# Control plane to nodes (kubelet). Defined as a standalone rule rather than
# inline: inline rules that reference each other across two security groups
# create a Terraform dependency cycle. Apply the same standalone-rule pattern
# anywhere two groups in this guide reference each other (ALB <-> ECS tasks,
# Lambda <-> Redis, and so on).
resource "aws_security_group_rule" "cluster_to_nodes" {
  description              = "Cluster API to nodes"
  type                     = "ingress"
  from_port                = 1025
  to_port                  = 65535
  protocol                 = "tcp"
  security_group_id        = aws_security_group.eks_nodes.id
  source_security_group_id = aws_security_group.eks_cluster.id
}
```

EKS IP Planning
- Default CNI: Each pod gets a VPC IP. An `m5.large` can run ~29 pods (based on ENI limits)
- Prefix delegation: Enable to get 16 IPs per ENI slot, supporting 110+ pods per node (see the sketch after this list)
- Secondary CIDR: Add a 100.64.0.0/16 CIDR for pod IPs if running out of space
- Subnet sizing: Use `/19` minimum for production (8,187 usable IPs per subnet after AWS reserves five addresses)
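As a sketch of enabling prefix delegation, the managed vpc-cni add-on accepts the ENABLE_PREFIX_DELEGATION environment variable via configuration_values. The cluster resource name aws_eks_cluster.main is an assumption here, since this guide does not define the cluster itself:

```terraform
# Assumes an EKS cluster resource named aws_eks_cluster.main (not defined in this guide)
resource "aws_eks_addon" "vpc_cni" {
  cluster_name = aws_eks_cluster.main.name
  addon_name   = "vpc-cni"

  # Assign /28 prefixes (16 IPs) per ENI slot instead of individual addresses
  configuration_values = jsonencode({
    env = {
      ENABLE_PREFIX_DELEGATION = "true"
    }
  })
}
```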
ECS (Elastic Container Service)
ECS networking depends on your launch type (Fargate vs EC2) and network mode. For modern deployments, use Fargate with awsvpc network mode—each task gets its own ENI with a VPC IP address.
Fargate Networking
ECS Fargate with awsvpc network mode gives each task its own elastic network interface (ENI) with a private IP address from your VPC. This provides better security isolation and allows you to apply security groups directly to tasks. The configuration below shows a complete ECS service setup including the cluster, service with network configuration, ALB integration, and the security group that controls traffic to your containers.
ECS Fargate Service with awsvpc Networking

```terraform
# ECS Cluster
resource "aws_ecs_cluster" "main" {
  name = "production-cluster"

  setting {
    name  = "containerInsights"
    value = "enabled"
  }
}

# ECS Service with awsvpc networking
resource "aws_ecs_service" "app" {
  name            = "my-app"
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.app.arn
  desired_count   = 3
  launch_type     = "FARGATE"

  # awsvpc network mode - each task gets its own ENI
  network_configuration {
    subnets          = module.vpc.private_subnets
    security_groups  = [aws_security_group.ecs_tasks.id]
    assign_public_ip = false # Use NAT Gateway for outbound
  }

  # ALB integration
  load_balancer {
    target_group_arn = aws_lb_target_group.app.arn
    container_name   = "app"
    container_port   = 8080
  }

  # Service discovery for internal communication
  service_registries {
    registry_arn = aws_service_discovery_service.app.arn
  }
}

# Security group for ECS tasks
resource "aws_security_group" "ecs_tasks" {
  name        = "ecs-tasks-sg"
  description = "Security group for ECS tasks"
  vpc_id      = module.vpc.vpc_id

  # Allow traffic from ALB
  ingress {
    description     = "From ALB"
    from_port       = 8080
    to_port         = 8080
    protocol        = "tcp"
    security_groups = [aws_security_group.alb.id]
  }

  # Allow task-to-task communication (for service mesh/discovery)
  ingress {
    description = "Task to task"
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    self        = true
  }

  egress {
    description = "All outbound"
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = { Name = "ecs-tasks-sg" }
}
```

Service Discovery with Cloud Map
AWS Cloud Map provides DNS-based service discovery for your ECS services, allowing containers to find each other using friendly DNS names instead of hardcoded IP addresses. When you register a service with Cloud Map, ECS automatically registers and deregisters task IP addresses as tasks start and stop. Other services can then resolve the service name to get the current IP addresses of healthy tasks.
ECS Service Discovery with Cloud Map

```terraform
# Private DNS namespace for service discovery
resource "aws_service_discovery_private_dns_namespace" "main" {
  name        = "local"
  description = "Private DNS namespace for ECS services"
  vpc         = module.vpc.vpc_id
}

# Service discovery service
resource "aws_service_discovery_service" "app" {
  name = "my-app"

  dns_config {
    namespace_id = aws_service_discovery_private_dns_namespace.main.id

    dns_records {
      ttl  = 10
      type = "A"
    }

    routing_policy = "MULTIVALUE"
  }

  health_check_custom_config {
    failure_threshold = 1
  }
}

# Other services can now reach this service at: my-app.local
```

Lambda Networking
Here’s the most important thing about Lambda networking: Don’t put Lambda in a VPC unless you have to.
Lambda functions run in AWS-managed VPCs by default and can access the internet and AWS services without any VPC configuration. Only configure VPC access when you need to reach private resources like RDS, ElastiCache, or internal APIs.
When to use VPC with Lambda:
- Accessing RDS, Aurora, or ElastiCache in private subnets
- Calling internal APIs or services in your VPC
- Compliance requirements mandating VPC isolation
When NOT to use VPC with Lambda:
- Calling public APIs or AWS services (use VPC Endpoints instead)
- Simple functions that don’t need private resource access
- When cold start latency is critical (VPC configuration still adds some cold start overhead, though far less than the multi-second penalty of the pre-2019 networking model)
Lambda in VPC
When you do need VPC access for Lambda, the configuration is straightforward but has important implications. Lambda creates ENIs in your specified subnets, which means you need enough IP addresses to handle concurrent executions. The security group should follow least-privilege principles—only allow outbound traffic to the specific resources your function needs to access. Adding VPC endpoints for AWS services like Secrets Manager and SSM avoids routing that traffic through NAT Gateway, reducing both cost and latency.
Lambda with VPC Access and Endpoints

```terraform
# Lambda function with VPC access
resource "aws_lambda_function" "vpc_lambda" {
  filename      = "function.zip"
  function_name = "my-vpc-function"
  role          = aws_iam_role.lambda.arn
  handler       = "index.handler"
  runtime       = "nodejs18.x"

  # VPC configuration - only add if you need private resource access
  vpc_config {
    subnet_ids         = module.vpc.private_subnets
    security_group_ids = [aws_security_group.lambda.id]
  }

  # Increase timeout to account for cold starts
  timeout = 30

  environment {
    variables = {
      DB_HOST = aws_db_instance.main.address
    }
  }
}

# Security group for Lambda
resource "aws_security_group" "lambda" {
  name        = "lambda-sg"
  description = "Security group for Lambda functions"
  vpc_id      = module.vpc.vpc_id

  # Outbound to RDS
  egress {
    description     = "To RDS"
    from_port       = 5432
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [aws_security_group.db_tier.id]
  }

  # Outbound to ElastiCache
  egress {
    description     = "To ElastiCache"
    from_port       = 6379
    to_port         = 6379
    protocol        = "tcp"
    security_groups = [aws_security_group.redis.id]
  }

  # Outbound HTTPS (for AWS services via NAT or VPC Endpoints)
  egress {
    description = "HTTPS outbound"
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = { Name = "lambda-sg" }
}

# IMPORTANT: Add VPC Endpoints to avoid NAT Gateway costs
resource "aws_vpc_endpoint" "lambda_endpoints" {
  for_each = toset([
    "secretsmanager", # For database credentials
    "ssm",            # For parameter store
    "logs",           # For CloudWatch Logs
  ])

  vpc_id              = module.vpc.vpc_id
  service_name        = "com.amazonaws.${var.region}.${each.value}"
  vpc_endpoint_type   = "Interface"
  subnet_ids          = module.vpc.private_subnets
  security_group_ids  = [aws_security_group.vpc_endpoints.id]
  private_dns_enabled = true

  tags = { Name = "${each.value}-endpoint" }
}
```
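The interface endpoints above reference aws_security_group.vpc_endpoints, which the snippet never defines. A minimal sketch, assuming the endpoints should accept HTTPS from anywhere inside the VPC:

```terraform
# Hypothetical definition for the security group referenced by the endpoints above
resource "aws_security_group" "vpc_endpoints" {
  name        = "vpc-endpoints-sg"
  description = "Security group for interface VPC endpoints"
  vpc_id      = module.vpc.vpc_id

  ingress {
    description = "HTTPS from inside the VPC"
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = [module.vpc.vpc_cidr_block]
  }

  tags = { Name = "vpc-endpoints-sg" }
}
```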
Lambda VPC Best Practices

- Use VPC Endpoints: Avoid NAT Gateway costs for AWS service calls (Secrets Manager, SSM, S3)
- Provisioned Concurrency: Reduces cold starts for VPC Lambdas by keeping ENIs warm (see the sketch after this list)
- Subnet sizing: Lambda can consume many IPs during scale-out. Use `/19` subnets minimum
- Security groups: Be specific—only allow outbound to resources Lambda actually needs
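As a minimal sketch of the Provisioned Concurrency item: it must target a published version or alias, never $LATEST. The alias name live is illustrative, and publish = true on the function is assumed:

```terraform
# Assumes publish = true on aws_lambda_function.vpc_lambda; the alias name is illustrative
resource "aws_lambda_alias" "live" {
  name             = "live"
  function_name    = aws_lambda_function.vpc_lambda.function_name
  function_version = aws_lambda_function.vpc_lambda.version
}

# Keep 5 execution environments (and their ENIs) warm at all times
resource "aws_lambda_provisioned_concurrency_config" "vpc_lambda" {
  function_name                     = aws_lambda_function.vpc_lambda.function_name
  provisioned_concurrent_executions = 5
  qualifier                         = aws_lambda_alias.live.name
}
```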
RDS and Aurora
Database networking is straightforward but critical to get right. Databases should always be in isolated subnets with no internet access.
RDS Networking
RDS requires a DB subnet group that spans at least two Availability Zones for high availability. The database should never be publicly accessible in production—all access should come through your application tier or bastion hosts. The security group should only allow connections from the specific security groups that need database access, following the principle of least privilege.
RDS with Subnet Group and Security

```terraform
# Database subnet group (required for RDS)
resource "aws_db_subnet_group" "main" {
  name       = "main-db-subnet-group"
  subnet_ids = module.vpc.database_subnets

  tags = { Name = "main-db-subnet-group" }
}

# RDS instance
resource "aws_db_instance" "main" {
  identifier     = "production-db"
  engine         = "postgres"
  engine_version = "15.4"
  instance_class = "db.r6g.large"

  allocated_storage     = 100
  max_allocated_storage = 500
  storage_encrypted     = true

  db_name  = "myapp"
  username = "admin"
  password = var.db_password

  # Networking
  db_subnet_group_name   = aws_db_subnet_group.main.name
  vpc_security_group_ids = [aws_security_group.rds.id]
  publicly_accessible    = false # NEVER set to true in production

  # High availability
  multi_az = true

  tags = { Name = "production-db" }
}

# RDS security group
resource "aws_security_group" "rds" {
  name        = "rds-sg"
  description = "Security group for RDS"
  vpc_id      = module.vpc.vpc_id

  # Only allow from app tier
  ingress {
    description     = "PostgreSQL from app tier"
    from_port       = 5432
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [aws_security_group.app_tier.id]
  }

  # Allow from Lambda
  ingress {
    description     = "PostgreSQL from Lambda"
    from_port       = 5432
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [aws_security_group.lambda.id]
  }

  # No egress needed for RDS

  tags = { Name = "rds-sg" }
}
```

RDS Proxy for Lambda
When Lambda connects to RDS, each invocation can create a new database connection. This quickly exhausts connection limits. RDS Proxy solves this by pooling connections. The proxy maintains a pool of database connections and multiplexes Lambda invocations across them, dramatically reducing the connection overhead and allowing your database to handle many more concurrent Lambda executions.
RDS Proxy for Connection Pooling

```terraform
# RDS Proxy for connection pooling
resource "aws_db_proxy" "main" {
  name                   = "rds-proxy"
  debug_logging          = false
  engine_family          = "POSTGRESQL"
  idle_client_timeout    = 1800
  require_tls            = true
  role_arn               = aws_iam_role.rds_proxy.arn
  vpc_security_group_ids = [aws_security_group.rds_proxy.id]
  vpc_subnet_ids         = module.vpc.database_subnets

  auth {
    auth_scheme = "SECRETS"
    iam_auth    = "REQUIRED"
    secret_arn  = aws_secretsmanager_secret.db_credentials.arn
  }

  tags = { Name = "rds-proxy" }
}

# Lambda connects to the proxy endpoint instead of RDS directly
# The proxy handles connection pooling automatically
```
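The proxy also needs the database registered as a target before it can route connections; the snippet above omits this step, so here is a minimal sketch wiring it to the RDS instance from the previous section:

```terraform
# Every proxy gets a default target group; this resource manages its settings
resource "aws_db_proxy_default_target_group" "main" {
  db_proxy_name = aws_db_proxy.main.name
}

# Register the RDS instance as the proxy's backing database
resource "aws_db_proxy_target" "main" {
  db_proxy_name          = aws_db_proxy.main.name
  target_group_name      = aws_db_proxy_default_target_group.main.name
  db_instance_identifier = aws_db_instance.main.identifier
}
```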
ElastiCache (Redis/Memcached)
ElastiCache follows similar patterns to RDS—isolated subnets, no public access. Redis clusters should be deployed with replication across multiple AZs for high availability, and encryption should be enabled both at rest and in transit. The security group should only allow connections from the application tiers that need cache access.
ElastiCache Redis Cluster

```terraform
# ElastiCache subnet group
resource "aws_elasticache_subnet_group" "main" {
  name       = "main-cache-subnet-group"
  subnet_ids = module.vpc.database_subnets
}

# Redis cluster
resource "aws_elasticache_replication_group" "main" {
  replication_group_id = "production-redis"
  description          = "Production Redis cluster"

  node_type          = "cache.r6g.large"
  num_cache_clusters = 2 # Primary + replica
  port               = 6379

  subnet_group_name  = aws_elasticache_subnet_group.main.name
  security_group_ids = [aws_security_group.redis.id]

  at_rest_encryption_enabled = true
  transit_encryption_enabled = true
  auth_token                 = var.redis_auth_token

  automatic_failover_enabled = true
  multi_az_enabled           = true

  tags = { Name = "production-redis" }
}

# Redis security group
resource "aws_security_group" "redis" {
  name        = "redis-sg"
  description = "Security group for ElastiCache Redis"
  vpc_id      = module.vpc.vpc_id

  ingress {
    description = "Redis from app tier"
    from_port   = 6379
    to_port     = 6379
    protocol    = "tcp"
    security_groups = [
      aws_security_group.app_tier.id,
      aws_security_group.lambda.id,
      aws_security_group.ecs_tasks.id,
    ]
  }

  tags = { Name = "redis-sg" }
}
```

Application Load Balancer (ALB)
ALBs are deployed in public subnets for internet-facing applications or in private subnets for internal services. Internet-facing ALBs need public subnets with routes to an Internet Gateway. Cross-zone load balancing, which is enabled by default on ALBs, distributes traffic evenly across all registered targets in all enabled Availability Zones.
Public and Internal ALB Configuration

```terraform
# Internet-facing ALB
resource "aws_lb" "public" {
  name               = "public-alb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.alb.id]
  subnets            = module.vpc.public_subnets

  enable_deletion_protection      = true
  enable_cross_zone_load_balancing = true

  tags = { Name = "public-alb" }
}

# Internal ALB (for service-to-service communication)
resource "aws_lb" "internal" {
  name               = "internal-alb"
  internal           = true
  load_balancer_type = "application"
  security_groups    = [aws_security_group.internal_alb.id]
  subnets            = module.vpc.private_subnets

  tags = { Name = "internal-alb" }
}

# ALB security group
resource "aws_security_group" "alb" {
  name        = "alb-sg"
  description = "Security group for public ALB"
  vpc_id      = module.vpc.vpc_id

  ingress {
    description = "HTTPS from internet"
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    description = "HTTP from internet (redirect to HTTPS)"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    description = "To targets"
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    security_groups = [
      aws_security_group.app_tier.id,
      aws_security_group.ecs_tasks.id,
    ]
  }

  tags = { Name = "alb-sg" }
}
```
API Gateway with VPC Integration
API Gateway is a managed service that doesn’t run in your VPC, but you can connect it to private resources using VPC Links. This allows API Gateway to route requests to private ALBs, NLBs, or directly to private IP addresses. VPC Links create a private connection between API Gateway and your VPC, keeping traffic off the public internet.
API Gateway with VPC Link to Private ALB

```terraform
# VPC Link for API Gateway to reach private resources
resource "aws_apigatewayv2_vpc_link" "main" {
  name               = "main-vpc-link"
  security_group_ids = [aws_security_group.vpc_link.id]
  subnet_ids         = module.vpc.private_subnets

  tags = { Name = "main-vpc-link" }
}

# HTTP API with VPC integration
resource "aws_apigatewayv2_api" "main" {
  name          = "my-api"
  protocol_type = "HTTP"
}

# Integration to private ALB
resource "aws_apigatewayv2_integration" "private_alb" {
  api_id             = aws_apigatewayv2_api.main.id
  integration_type   = "HTTP_PROXY"
  integration_uri    = aws_lb_listener.internal.arn
  integration_method = "ANY"
  connection_type    = "VPC_LINK"
  connection_id      = aws_apigatewayv2_vpc_link.main.id
}

# Security group for VPC Link
resource "aws_security_group" "vpc_link" {
  name        = "api-gateway-vpc-link-sg"
  description = "Security group for API Gateway VPC Link"
  vpc_id      = module.vpc.vpc_id

  egress {
    description     = "To internal ALB"
    from_port       = 443
    to_port         = 443
    protocol        = "tcp"
    security_groups = [aws_security_group.internal_alb.id]
  }

  tags = { Name = "api-gateway-vpc-link-sg" }
}
```
S3 Access Patterns
S3 is a regional service that doesn’t run in your VPC, but you can control access patterns using Gateway Endpoints and bucket policies. Gateway Endpoints are free and should always be used—they route S3 traffic through AWS’s private network instead of the internet. You can also restrict bucket access to only allow requests that come through your VPC endpoint, providing an additional layer of security for sensitive data.
S3 Gateway Endpoint with VPC-Restricted Bucket Policy

```terraform
# Gateway Endpoint for S3 (FREE - always use this)
resource "aws_vpc_endpoint" "s3" {
  vpc_id            = module.vpc.vpc_id
  service_name      = "com.amazonaws.${var.region}.s3"
  vpc_endpoint_type = "Gateway"
  route_table_ids = concat(
    module.vpc.private_route_table_ids,
    module.vpc.database_route_table_ids
  )

  tags = { Name = "s3-gateway-endpoint" }
}

# S3 bucket policy - restrict to VPC endpoint only
resource "aws_s3_bucket_policy" "private_bucket" {
  bucket = aws_s3_bucket.private.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid       = "AllowVPCEndpointOnly"
        Effect    = "Deny"
        Principal = "*"
        Action    = "s3:*"
        Resource = [
          aws_s3_bucket.private.arn,
          "${aws_s3_bucket.private.arn}/*"
        ]
        Condition = {
          StringNotEquals = {
            "aws:sourceVpce" = aws_vpc_endpoint.s3.id
          }
        }
      }
    ]
  })
}

# For S3 access from specific VPCs only
resource "aws_s3_bucket_policy" "vpc_restricted" {
  bucket = aws_s3_bucket.internal.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid       = "AllowVPCOnly"
        Effect    = "Deny"
        Principal = "*"
        Action    = "s3:*"
        Resource = [
          aws_s3_bucket.internal.arn,
          "${aws_s3_bucket.internal.arn}/*"
        ]
        Condition = {
          StringNotEquals = {
            "aws:sourceVpc" = module.vpc.vpc_id
          }
        }
      }
    ]
  })
}
```

MSK (Managed Streaming for Apache Kafka)
MSK requires careful network planning due to high throughput requirements. Kafka brokers should be deployed across three Availability Zones for durability and high availability. The cluster should be kept private with no public access, and security groups should allow the necessary ports for Kafka client connections (9094 for TLS), Zookeeper (2181), and inter-broker communication.
MSK Kafka Cluster Configuration

```terraform
# MSK cluster
resource "aws_msk_cluster" "main" {
  cluster_name           = "production-kafka"
  kafka_version          = "3.5.1"
  number_of_broker_nodes = 3 # One per AZ

  broker_node_group_info {
    instance_type   = "kafka.m5.large"
    client_subnets  = module.vpc.private_subnets # Must be in 3 AZs
    security_groups = [aws_security_group.msk.id]

    storage_info {
      ebs_storage_info {
        volume_size = 1000 # GB
      }
    }

    connectivity_info {
      public_access {
        type = "DISABLED" # Keep private
      }
    }
  }

  encryption_info {
    encryption_in_transit {
      client_broker = "TLS"
      in_cluster    = true
    }
  }

  tags = { Name = "production-kafka" }
}

# MSK security group
resource "aws_security_group" "msk" {
  name        = "msk-sg"
  description = "Security group for MSK"
  vpc_id      = module.vpc.vpc_id

  # Kafka broker ports
  ingress {
    description = "Kafka TLS"
    from_port   = 9094
    to_port     = 9094
    protocol    = "tcp"
    security_groups = [
      aws_security_group.app_tier.id,
      aws_security_group.ecs_tasks.id,
    ]
  }

  # Zookeeper (if needed)
  ingress {
    description     = "Zookeeper"
    from_port       = 2181
    to_port         = 2181
    protocol        = "tcp"
    security_groups = [aws_security_group.app_tier.id]
  }

  # Broker to broker communication
  ingress {
    description = "Inter-broker"
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    self        = true
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = { Name = "msk-sg" }
}
```
Service Networking Checklist
Before Deploying Any AWS Service
Before deploying any AWS service, verify:
- Subnet selection: Right subnet type (public/private/database) for the service
- Security groups: Least-privilege rules allowing only required traffic
- VPC Endpoints: Add Gateway Endpoints (S3, DynamoDB) and Interface Endpoints for frequently used services
- DNS resolution: Enable DNS hostnames and support in VPC settings (see the snippet after this list)
- IP capacity: Ensure subnets have enough IPs for the service’s scaling needs
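For reference, a minimal sketch of the DNS inputs on the terraform-aws-modules VPC module used throughout this guide; defaults vary by module version, so setting them explicitly documents the dependency:

```terraform
module "vpc" {
  source = "terraform-aws-modules/vpc/aws"
  # ...other configuration as shown earlier...

  enable_dns_support   = true # VPC DNS resolution (required for VPC endpoints)
  enable_dns_hostnames = true # Needed for interface endpoint private DNS names
}
```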
Next Steps
With these service-specific configurations, you’re ready to deploy production workloads on AWS.
Continue the series
- ← Part 1: AWS Networking Best Practices
- Part 2: Azure Networking Best Practices →
- Part 3: GCP Networking Best Practices →
Additional Resources
Need help? Contact Quabyt for AWS architecture and implementation support.