TL;DR: Migrating TCP-based services from on-prem to EKS with strict regional data requirements. NLBs have a hard 50-listener limit, so I used multiple NLBs with dedicated ingress-nginx controllers, each with its own IngressClass. Not glamorous, but it works.

Update (2026): The Kubernetes community announced that ingress-nginx will reach end-of-life in March 2026. No further releases, bugfixes, or security patches after that date. If you’re implementing this pattern today, consider using Gateway API with a supported controller like NGINX Gateway Fabric, Traefik, or Envoy Gateway instead. The architecture concepts here still apply - you’ll just swap the ingress layer.


The Challenge

New project, new headaches. I was doing a lift-and-shift from on-prem to AWS, and the most complex piece was a fleet of services that needed to be:

  • Regionally split between US and EU (hello, GDPR)
  • Performance guaranteed for clients in each region
  • TCP-based with persistent encrypted connections - not HTTP

These weren’t your typical stateless web services. Each one maintained long-lived TCP connections with clients streaming encrypted data. The services needed stable network identities, hence StatefulSets.

The Architecture Problem

flowchart TB
    subgraph "What I Needed"
        CLIENT[Clients] --> NLB[Network Load Balancer]
        NLB --> |"Port 9000"| SVC1[Service A]
        NLB --> |"Port 9001"| SVC2[Service B]
        NLB --> |"Port 9002"| SVC3[Service C]
        NLB --> |"..."| MORE[50+ Services]
    end

    style MORE fill:#ff6b6b,color:#fff

The problem? AWS NLBs have a hard limit of 50 listeners. Each TCP port needs its own listener. With 50+ services, a single NLB wasn’t going to cut it.

This isn’t a soft limit you can raise with a support ticket - it’s a hard ceiling.
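Capacity planning follows directly from the limit: each TCP service consumes one listener, so the NLB count is just a ceiling division. A quick sketch:

```python
import math

NLB_LISTENER_LIMIT = 50  # hard AWS quota, cannot be raised

def nlbs_needed(service_count: int, listeners_per_nlb: int = NLB_LISTENER_LIMIT) -> int:
    """Each TCP service consumes one NLB listener, so round up."""
    return math.ceil(service_count / listeners_per_nlb)

print(nlbs_needed(50))   # 1
print(nlbs_needed(51))   # 2 - one service over the limit forces a second NLB
print(nlbs_needed(120))  # 3
```

Worth noting: leaving a little headroom per NLB (say, 45 services instead of 50) avoids an emergency migration when one more service shows up.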

The Constraints

| Requirement | Why |
|---|---|
| TCP (not HTTP) | Services use a custom binary protocol over encrypted streams |
| Stable IPs | Clients have firewall rules; can’t use DNS-only |
| Regional isolation | GDPR requires EU data stays in EU |
| 50+ ports | Each service needs a dedicated port |
| Health checks | Need to know when a service is actually ready |
The Solution: Multiple NLBs

Since one NLB can’t handle all the ports, use multiple. Each NLB gets its own ingress-nginx controller with a dedicated IngressClass.

flowchart TB
    subgraph "Multi-NLB Architecture"
        CLIENT[Clients]

        CLIENT --> NLB1[NLB 1<br/>Ports 9000-9049]
        CLIENT --> NLB2[NLB 2<br/>Ports 9050-9099]

        NLB1 --> ING1[ingress-nginx<br/>tcp01]
        NLB2 --> ING2[ingress-nginx<br/>tcp02]

        ING1 --> SVC1[Services 1-50]
        ING2 --> SVC2[Services 51-100]
    end

    subgraph "EKS Cluster"
        SVC1
        SVC2
    end

    style NLB1 fill:#4ecdc4,color:#000
    style NLB2 fill:#45b7d1,color:#000

The ingress-nginx Configuration

Each NLB gets its own ingress-nginx deployment. The key is giving each one a unique IngressClass and election ID:

controller:
  # Unique identifiers for this controller instance
  ingressClassByName: true
  ingressClassResource:
    enabled: true
    name: tcp01
    controllerValue: "k8s.io/tcp01"
  ingressClass: tcp01
  electionID: ingress-controller-tcp01

  # NLB configuration
  service:
    enabled: true
    type: LoadBalancer
    enableHttp: false
    enableHttps: false
    externalTrafficPolicy: Local
    annotations:
      # NLB basics
      service.beta.kubernetes.io/aws-load-balancer-name: services01
      service.beta.kubernetes.io/aws-load-balancer-type: external
      service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
      service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: instance

      # Cross-AZ for high availability
      service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"

      # Health checks
      service.beta.kubernetes.io/aws-load-balancer-healthcheck-protocol: http
      service.beta.kubernetes.io/aws-load-balancer-healthcheck-port: traffic-port
      service.beta.kubernetes.io/aws-load-balancer-healthcheck-path: /GetStatus
      service.beta.kubernetes.io/aws-load-balancer-healthcheck-interval: "30"

      # Protocol settings
      service.beta.kubernetes.io/aws-load-balancer-backend-protocol: tcp
      service.beta.kubernetes.io/aws-load-balancer-proxy-protocol: "*"

      # Static IPs via Elastic IPs (for client firewall rules)
      service.beta.kubernetes.io/aws-load-balancer-eip-allocations: eipalloc-xxx,eipalloc-yyy
      service.beta.kubernetes.io/aws-load-balancer-subnets: subnet-xxx,subnet-yyy

  replicaCount: 1

  config:
    use-proxy-protocol: true

  metrics:
    enabled: true

# TCP services mapping: "external_port": "namespace/service:port"
tcp:
  "9000": services/service-alpha:9000
  "9001": services/service-beta:9001
  "9002": services/service-gamma:9002
  # ... up to 50 ports per NLB

Why These Annotations Matter

| Setting | Purpose |
|---|---|
| `aws-load-balancer-type: external` | Use AWS Load Balancer Controller (not the in-tree controller) |
| `aws-load-balancer-nlb-target-type: instance` | Route to node ports, not pod IPs directly |
| `externalTrafficPolicy: Local` (spec field, not an annotation) | Preserve client IP, avoid an extra hop |
| `aws-load-balancer-proxy-protocol: "*"` | Pass the client IP through TCP (no HTTP headers to carry it) |
| `aws-load-balancer-eip-allocations` | Static IPs for client firewall rules |
| `aws-load-balancer-cross-zone-load-balancing-enabled` | Distribute traffic evenly across AZs |
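Since there are no HTTP headers like `X-Forwarded-For` on a raw TCP stream, the proxy protocol annotation makes the NLB prepend a small binary header (proxy protocol v2) to each connection carrying the original client address. A minimal sketch of parsing that header for TCP over IPv4 - the addresses and ports below are made-up sample values:

```python
import socket
import struct

# 12-byte signature that starts every PROXY protocol v2 header
PP2_SIGNATURE = b"\r\n\r\n\x00\r\nQUIT\n"

def parse_proxy_v2(data: bytes):
    """Parse a minimal PROXY protocol v2 header (TCP over IPv4 only).

    Returns (client_ip, client_port, dest_ip, dest_port, payload).
    """
    if not data.startswith(PP2_SIGNATURE):
        raise ValueError("not a PROXY v2 header")
    ver_cmd, family, length = struct.unpack("!BBH", data[12:16])
    if ver_cmd != 0x21:   # version 2, command PROXY
        raise ValueError("unsupported version/command")
    if family != 0x11:    # AF_INET + STREAM, i.e. TCP over IPv4
        raise ValueError("only TCP4 handled in this sketch")
    src_ip, dst_ip, src_port, dst_port = struct.unpack("!4s4sHH", data[16:28])
    payload = data[16 + length:]
    return (socket.inet_ntoa(src_ip), src_port,
            socket.inet_ntoa(dst_ip), dst_port, payload)

# Build a sample header the way the NLB would prepend it to the stream
header = PP2_SIGNATURE + struct.pack("!BBH", 0x21, 0x11, 12)
header += socket.inet_aton("203.0.113.7") + socket.inet_aton("10.0.1.5")
header += struct.pack("!HH", 56324, 9000)

client_ip, client_port, dest_ip, dest_port, rest = parse_proxy_v2(header + b"app-bytes")
print(client_ip, client_port)  # 203.0.113.7 56324
```

You won’t write this parser yourself - ingress-nginx handles it when `use-proxy-protocol: true` is set - but it shows why the setting must match on both sides: a backend not expecting the header sees it as garbage at the start of the stream.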

The Service Definition

Each TCP service needs a ClusterIP service that the ingress-nginx controller routes to:

apiVersion: v1
kind: Service
metadata:
  name: service-alpha
  namespace: services
spec:
  type: ClusterIP
  selector:
    app: service-alpha
  ports:
    - name: tcp9000
      port: 9000
      targetPort: 9000
      protocol: TCP

The flow looks like this:

flowchart LR
    CLIENT[Client] --> |":9000"| NLB[NLB]
    NLB --> |"NodePort"| NODE[EKS Node]
    NODE --> |"iptables"| NGINX[ingress-nginx Pod]
    NGINX --> |"ClusterIP"| SVC[service-alpha:9000]
    SVC --> POD[StatefulSet Pod]

    style NLB fill:#4ecdc4,color:#000
    style NGINX fill:#ffd93d,color:#000
    style POD fill:#96ceb4,color:#000

Scaling to Multiple NLBs

For the second NLB (ports 9050+), deploy another ingress-nginx release with different identifiers:

controller:
  ingressClassResource:
    name: tcp02                          # Different class
    controllerValue: "k8s.io/tcp02"
  ingressClass: tcp02
  electionID: ingress-controller-tcp02   # Different election ID

  service:
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-name: services02  # Different NLB
      service.beta.kubernetes.io/aws-load-balancer-eip-allocations: eipalloc-aaa,eipalloc-bbb

tcp:
  "9050": services/service-delta:9050
  "9051": services/service-epsilon:9051
  # ... next batch of services

Planning Your Port Allocation

| NLB | IngressClass | Port Range | Services |
|---|---|---|---|
| services01 | tcp01 | 9000-9049 | Batch 1 (50 services) |
| services02 | tcp02 | 9050-9099 | Batch 2 (50 services) |
| services03 | tcp03 | 9100-9149 | Batch 3 (50 services) |
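Because the scheme is purely arithmetic, the allocation for any service can be derived rather than looked up. A small helper following the batching above (50 ports per NLB, starting at 9000):

```python
# Map a zero-based service index to its NLB, IngressClass, and external port,
# following the scheme in the table: 50 ports per NLB, base port 9000.
BASE_PORT = 9000
PORTS_PER_NLB = 50

def allocate(service_index: int) -> dict:
    batch = service_index // PORTS_PER_NLB
    return {
        "nlb": f"services{batch + 1:02d}",
        "ingress_class": f"tcp{batch + 1:02d}",
        "port": BASE_PORT + service_index,
    }

print(allocate(0))    # {'nlb': 'services01', 'ingress_class': 'tcp01', 'port': 9000}
print(allocate(50))   # {'nlb': 'services02', 'ingress_class': 'tcp02', 'port': 9050}
print(allocate(149))  # {'nlb': 'services03', 'ingress_class': 'tcp03', 'port': 9149}
```

A generator like this is also a handy way to emit the `tcp:` sections of the Helm values files, so the port map and the table never drift apart.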

Regional Deployment

For GDPR compliance, each region gets its own EKS cluster with identical architecture:

flowchart TB
    subgraph "US Region (us-east-1)"
        US_CLIENT[US Clients] --> US_NLB[NLB US]
        US_NLB --> US_EKS[EKS US]
        US_EKS --> US_DATA[(US Data)]
    end

    subgraph "EU Region (eu-west-1)"
        EU_CLIENT[EU Clients] --> EU_NLB[NLB EU]
        EU_NLB --> EU_EKS[EKS EU]
        EU_EKS --> EU_DATA[(EU Data)]
    end

    style US_DATA fill:#45b7d1,color:#000
    style EU_DATA fill:#96ceb4,color:#000

No cross-region data flow. EU client data stays in EU infrastructure. Simple isolation.

What I Learned the Hard Way

  1. The TCP configmap is separate. HTTP services go through Ingress resources. TCP services need the tcp configmap in ingress-nginx values. This took me longer to figure out than I’d like to admit.

  2. Election IDs must be unique. Running multiple ingress-nginx controllers without unique election IDs causes leader election conflicts. Your controllers will fight each other.

  3. Proxy protocol is all-or-nothing. If you enable it on the NLB, the backend must expect it. If there’s a mismatch, connections will fail with cryptic errors.

  4. externalTrafficPolicy: Local has trade-offs. It preserves client IPs but can cause uneven load distribution if your pods aren’t spread across all nodes.

  5. 50 listeners is a hard limit. Don’t waste time trying to get it raised. Plan for multiple NLBs from the start if you have more than ~40 services.
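On point 1: the Helm chart renders the `tcp:` values into a ConfigMap that the controller watches, separate from any Ingress resources. A hedged sketch of what the rendered object looks like - the exact name and namespace depend on your Helm release:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: tcp01-ingress-nginx-tcp     # name derives from the Helm release name
  namespace: ingress-nginx
data:
  # "external_port": "namespace/service:port"
  "9000": services/service-alpha:9000
  "9001": services/service-beta:9001
```

If a TCP port silently isn’t routing, checking whether it actually landed in this ConfigMap is a good first debugging step.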


Takeaways

  1. NLB limits are real. 50 listeners per NLB is a hard ceiling. Plan your architecture around it.

  2. Multiple ingress controllers work. Just give each one a unique IngressClass, election ID, and NLB name.

  3. TCP services need ClusterIP. The ingress-nginx controller routes to ClusterIP services, not directly to pods.

  4. Static IPs matter for enterprise. Clients with strict firewall rules need predictable IPs. Use Elastic IP allocations.

  5. Regional isolation is simpler than you think. Separate clusters, separate NLBs, no cross-region traffic. GDPR compliance through architecture.

In a future post, I’ll cover how we manage all of this through GitOps using ArgoCD, GitHub Actions, and self-hosted ARC runners within each EKS cluster.