Support TLS termination for the Envoy Gateway ingress #2017

Description

@zhaohuabing

Problem Statement

By default the OpenShell gateway is reachable only inside the cluster. The Helm chart can expose it through the Kubernetes Gateway API (grpcRoute.enabled=true), but the generated Gateway listener is hardcoded to plaintext: protocol HTTP on port 80 (deploy/helm/openshell/templates/gateway.yaml). The values surface (grpcRoute.gateway.listener) exposes only port, protocol, and allowedRoutes; there is no tls block and no certificateRefs. As a result, the documented ingress path registers the gateway as http:// and is plaintext end-to-end, with no way to terminate TLS at the edge. Operators who expose the gateway outside the cluster need encrypted client connections.

Proposed Design

Add TLS termination at the Envoy Gateway listener, the standard Gateway API pattern, keeping the existing L7 GRPCRoute:

client → HTTPS → Envoy Gateway (terminate TLS) → plaintext gRPC → openshell gateway pod
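
The edge hop in the flow above corresponds to a Gateway API listener in Terminate mode. The manifest below is a minimal sketch of what such a listener could look like once rendered, not the chart's actual output: the resource name, namespace, gatewayClassName, listener name, Secret name, port 443, and the allowedRoutes setting are all placeholders chosen for illustration.

```yaml
# Illustrative only: names, namespace, and class are assumptions, not chart values.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: openshell-gateway            # hypothetical resource name
  namespace: openshell               # hypothetical namespace
spec:
  gatewayClassName: envoy-gateway    # hypothetical GatewayClass name
  listeners:
    - name: https
      port: 443                      # assumed port; the proposal does not fix one
      protocol: HTTPS
      tls:
        mode: Terminate              # Envoy decrypts here and forwards plaintext gRPC
        certificateRefs:
          - kind: Secret
            name: openshell-gateway-tls   # hypothetical kubernetes.io/tls Secret
      allowedRoutes:
        namespaces:
          from: Same                 # assumed; only same-namespace routes may attach
```

Because the listener terminates TLS rather than passing it through, the existing GRPCRoute can still attach to it and match on gRPC traffic.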

  • Extend grpcRoute.gateway.listener with a tls.certificateRefs list. When protocol=HTTPS, render a listener with tls.mode: Terminate and the supplied certificateRefs (a kubernetes.io/tls Secret in the Gateway namespace). Default stays HTTP/80 so existing installs are unchanged.
  • Because Envoy terminates TLS and forwards plaintext, the gateway pod must run with server.disableTls=true. The chart guards both required inputs at render time: fail if certificateRefs is empty under HTTPS, and fail if HTTPS is selected without server.disableTls=true (the chart does not render a BackendTLSPolicy for re-encryption).
  • Client identity: Envoy only terminates TLS — it does not perform OIDC (its OIDC SecurityPolicy relies on browser redirects/cookies and cannot serve gRPC agents). Identity comes from the OpenShell gateway's existing OIDC bearer-token validation; the token rides in gRPC metadata that Envoy forwards untouched.
  • Docs (docs/kubernetes/ingress.mdx), a CI render overlay, the debug-openshell-cluster skill, and the chart README values table are updated accordingly.
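
Taken together, the bullets above imply a values overlay along the following lines. This is a sketch under stated assumptions: the tls.certificateRefs key is the shape proposed in this issue and does not exist in the chart today, the Secret name is hypothetical, and port 443 is an assumed choice rather than something the proposal mandates.

```yaml
# Hypothetical values overlay for the proposed HTTPS listener.
grpcRoute:
  enabled: true
  gateway:
    listener:
      port: 443                # assumed; default would remain HTTP/80
      protocol: HTTPS
      tls:                     # proposed block, not present in the chart today
        certificateRefs:
          - name: openshell-gateway-tls   # hypothetical kubernetes.io/tls Secret in the Gateway namespace
server:
  disableTls: true             # required: Envoy forwards plaintext gRPC to the pod
```

Under the proposed render-time guards, dropping either certificateRefs or server.disableTls=true from this overlay while keeping protocol: HTTPS would make the chart fail to render instead of producing a broken listener.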

Out of scope: TLS passthrough (preserving end-to-end mTLS via a TLSRoute), client mTLS at the Envoy edge (ClientTrafficPolicy), and mapping a forwarded client cert (XFCC) to a Principal.

Alternatives Considered

  • TLS passthrough (mode: Passthrough + TLSRoute): preserves end-to-end mTLS to the gateway, but cannot use the existing L7 GRPCRoute (Envoy can't see gRPC without decrypting), requires the external hostname in the gateway cert SANs, and is a larger templating change.
  • Re-encrypt to a TLS backend (BackendTLSPolicy): keeps the gateway pod on TLS, but adds a CRD dependency and more cert plumbing for no security gain over plaintext within the cluster's pod network.
  • Envoy Gateway OIDC SecurityPolicy for auth: rejected — it is a browser-only flow (redirects + cookies) incompatible with the OpenShell CLI and headless agents.

Agent Investigation

  • Explored deploy/helm/openshell/templates/gateway.yaml, values.yaml, and gateway-config.yaml — confirmed the listener is HTTP-only and that server.disableTls=true already drops the [openshell.gateway.tls] block, making a plaintext backend viable with no server change.
  • Traced the server auth chain (crates/openshell-server/src/auth/{oidc,k8s_sa,sandbox_jwt}.rs, multiplex.rs:558-585): identity comes from the auth chain (bearer token), with the mTLS peer cert only a fallback under --enable-mtls-auth (unsupported with the Kubernetes driver). The client cert is therefore not required for identity behind a terminating proxy.
  • Confirmed the CLI supports both browser PKCE and client-credentials grants (crates/openshell-cli/src/run.rs:974-1004, oidc_auth.rs), so headless agents can authenticate without a browser.

Checklist

  • I've reviewed existing issues and the architecture docs
  • This is a design proposal, not a "please build this" request
