RFC-WORKLOAD-IDENTITY-0001                                     Section 13
Category: Standards Track                                      Rationale

13. Rationale


13.1 Why SPIFFE/SPIRE

13.1.1 Selection Criteria

Criterion          Weight  Description
-----------------  ------  -------------------------------
Standards-based    High    Industry-standard specification
Multi-platform     High    Works across clouds and on-prem
Attestation-based  High    No pre-shared secrets
Kubernetes-native  Medium  First-class K8s support
Federation ready   Medium  Multi-cluster capability
Open source        Medium  No vendor lock-in
Active community   Medium  Long-term viability

13.1.2 SPIFFE/SPIRE Strengths

Strength                 Benefit
-----------------------  ----------------------------------------
CNCF Graduated           Proven, stable, well-maintained
Attestation model        Proves workload identity without secrets
X.509 and JWT SVIDs      Flexible identity formats
Federated by design      Multi-cluster from the start
Vault integration        SPIFFE auth method available
Service mesh compatible  Linkerd, Istio, Envoy support
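
To make the identity format behind these strengths concrete: a SPIFFE ID is a URI of the form spiffe://<trust-domain>/<path>, and it is carried in both X.509 and JWT SVIDs. The Python sketch below splits such an ID into its parts. The SpiffeId class, the parse_spiffe_id helper, and the example trust domain and path are hypothetical names chosen for illustration, and the checks cover only a subset of the SPIFFE ID rules.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class SpiffeId:
    """Parsed form of a SPIFFE ID (hypothetical helper type)."""
    trust_domain: str
    path: str


def parse_spiffe_id(value: str) -> SpiffeId:
    """Split a SPIFFE ID into trust domain and path, rejecting obvious errors."""
    parts = urlsplit(value)
    if parts.scheme != "spiffe":
        raise ValueError("scheme must be 'spiffe'")
    if not parts.netloc:
        raise ValueError("trust domain is required")
    # A SPIFFE ID carries no userinfo, port, query, or fragment.
    if "@" in parts.netloc or ":" in parts.netloc:
        raise ValueError("trust domain must not contain userinfo or a port")
    if parts.query or parts.fragment:
        raise ValueError("query and fragment are not allowed")
    return SpiffeId(trust_domain=parts.netloc, path=parts.path)
```

For example, parse_spiffe_id("spiffe://example.org/ns/payments/sa/api") yields the trust domain example.org and the path /ns/payments/sa/api, which an authorization policy could then match on.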

13.1.3 SPIFFE/SPIRE Considerations

Consideration               Mitigation
--------------------------  ---------------------------------------------
Operational complexity      Phased rollout, training
Additional infrastructure   Start with Kubernetes-native, add SPIRE later
Learning curve              Documentation, examples
SPIRE server as dependency  HA deployment, fallback patterns

13.2 Alternative Identity Frameworks

13.2.1 Kubernetes Native Only

Description: Use only Kubernetes ServiceAccounts and projected tokens.

Why It Was Attractive:

  • No additional infrastructure
  • Built into Kubernetes
  • Simple mental model
  • Works with Vault Kubernetes auth

Why It Was Not Sufficient:

  • Limited to Kubernetes only
  • No cross-cluster federation
  • No attestation beyond pod metadata
  • No standard identity format
  • Service mesh requires additional identity

Conclusion: Kubernetes-native identity is a foundation, not a complete solution. SPIRE builds on it.

13.2.2 HashiCorp Vault Only

Description: Use Vault as the sole identity provider.

Why It Was Attractive:

  • Already using Vault for secrets
  • AppRole, Kubernetes auth methods
  • Single system to manage
  • Good audit logging

Why It Was Not Sufficient:

  • Vault is credential authority, not identity issuer
  • No workload attestation
  • Not designed for service-to-service mTLS
  • Doesn't integrate with service mesh
  • SPIFFE auth method still needs SPIRE

Conclusion: Vault complements SPIFFE as the credential authority, but does not replace it as the identity issuer.

13.2.3 Istio Service Mesh

Description: Use Istio's built-in identity (istiod, formerly Citadel).

Why It Was Attractive:

  • Comprehensive service mesh
  • Built-in identity and mTLS
  • Rich traffic management
  • Well-documented

Why It Was Not Chosen:

  • Heavier than Linkerd (Envoy-based)
  • More complex configuration
  • Higher resource overhead
  • Linkerd better fits lightweight requirements

Conclusion: Istio is a valid option, but Linkerd was chosen for simplicity. Both support SPIFFE.

13.2.4 Cloud-Only Solutions

Description: Use only cloud provider identity (IRSA, Workload Identity).

Why It Was Attractive:

  • Native to cloud provider
  • No additional components
  • Well-integrated with cloud services
  • Managed by cloud provider

Why It Was Not Sufficient:

  • Cloud-specific, not portable
  • Different patterns per cloud
  • No on-premises support
  • No cross-cloud federation standard
  • Service mesh still needs identity

Conclusion: Cloud identity is used for cloud resources, but SPIFFE provides a unified layer across environments.


13.3 Why Linkerd for Service Mesh

13.3.1 Service Mesh Comparison

Feature                   Linkerd                Istio         Cilium
------------------------  ---------------------  ------------  -----------------
Proxy                     Rust (linkerd2-proxy)  Envoy (C++)   eBPF (no sidecar)
Resource overhead         Low                    Medium-High   Very low
Configuration complexity  Low                    High          Medium
mTLS                      Automatic              Configurable  Automatic
SPIRE integration         Optional               Optional      Limited
Learning curve            Gentle                 Steep         Medium

13.3.2 Why Linkerd

Reason            Explanation
----------------  ----------------------------------------------
Lightweight       Rust proxy uses less CPU/memory than Envoy
Simple            Fewer configuration options, less to get wrong
Automatic mTLS    Works out of the box
SPIRE compatible  Can integrate when needed
CNCF Graduated    Production-ready, well-maintained

13.3.3 Linkerd Considerations

Consideration                 Mitigation
----------------------------  ------------------------------------------
Less feature-rich than Istio  Sufficient for mTLS and authz needs
Smaller community than Istio  Active development, responsive maintainers
No built-in Wasm support      Not required for current use cases

13.4 Why OAuth 2.0 Token Exchange for AI Agents

13.4.1 AI Agent Identity Alternatives

Alternative               Issue
------------------------  ----------------------------------
Static API keys           No delegation tracking, long-lived
Service accounts          Agent identity, not delegator
OAuth client credentials  Same as service accounts
Custom tokens             Non-standard, maintenance burden

13.4.2 Token Exchange Advantages

Advantage          Description
-----------------  ------------------------------------------------------------
Standard           RFC 8693, widely supported
Delegation chain   act claim names the acting party; sub remains the delegator
Scope attenuation  Each exchange can reduce scope
Keycloak support   Built-in Token Exchange
Auditable          Standard claims for logging
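
As an illustration of the request these advantages rest on, the sketch below assembles the form parameters that RFC 8693 defines for a delegation-style exchange, where the human's token is the subject token and the agent's token is the actor token. The build_token_exchange_form helper and all token values are hypothetical, and whether a particular Keycloak version and client configuration accepts the actor parameters is an assumption to verify against that deployment.

```python
from typing import Dict, Optional
from urllib.parse import urlencode

# Identifiers defined by RFC 8693.
GRANT_TYPE = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


def build_token_exchange_form(subject_token: str,
                              actor_token: str,
                              scope: Optional[str] = None,
                              audience: Optional[str] = None) -> Dict[str, str]:
    """Return the form fields for a delegation-style token exchange."""
    form = {
        "grant_type": GRANT_TYPE,
        # The principal on whose behalf the agent acts (the human).
        "subject_token": subject_token,
        "subject_token_type": ACCESS_TOKEN_TYPE,
        # The party doing the acting (the AI agent).
        "actor_token": actor_token,
        "actor_token_type": ACCESS_TOKEN_TYPE,
    }
    if scope:
        # Requesting a narrower scope is how each exchange attenuates access.
        form["scope"] = scope
    if audience:
        form["audience"] = audience
    return form


def encode_form(form: Dict[str, str]) -> str:
    """Encode the fields as an application/x-www-form-urlencoded body."""
    return urlencode(form)
```

The encoded body would be sent as a POST to the authorization server's token endpoint with the client's own authentication; that step is omitted here because the endpoint and client setup are deployment-specific.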

13.4.3 Token Exchange Considerations

Consideration           Mitigation
----------------------  --------------------------------
Keycloak configuration  Document setup process
Chain complexity        Limit maximum chain depth
Token size              Chain in claims grows token size
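
The chain-depth mitigation can be sketched as a check on the nested act claims of a decoded token. In RFC 8693 the outermost act claim is the current actor and each nested act is a prior actor. The function names and the limit of 3 are assumptions for illustration; this RFC does not fix a value, and the claims are assumed to come from a token whose signature has already been verified.

```python
from typing import Any, Dict, List

# Hypothetical limit; the actual value is a policy decision.
MAX_CHAIN_DEPTH = 3


def actor_chain(claims: Dict[str, Any]) -> List[str]:
    """Return actor subjects from the current actor back to the earliest one."""
    chain: List[str] = []
    act = claims.get("act")
    while isinstance(act, dict):
        chain.append(str(act.get("sub", "<unknown>")))
        act = act.get("act")
    return chain


def check_chain_depth(claims: Dict[str, Any],
                      max_depth: int = MAX_CHAIN_DEPTH) -> List[str]:
    """Raise ValueError if the delegation chain is longer than max_depth."""
    chain = actor_chain(claims)
    if len(chain) > max_depth:
        raise ValueError(
            f"delegation chain depth {len(chain)} exceeds limit {max_depth}")
    return chain
```

Rejecting over-long chains at exchange time also bounds how far the token can grow, since each additional actor adds another nested claim.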

13.5 Why Separate from RFC-PAM

13.5.1 Fundamental Differences

Aspect           RFC-PAM (Human)               RFC-WORKLOAD-IDENTITY (Machine)
---------------  ----------------------------  -----------------------------------
Principal type   Human users                   Workloads, services, agents
Authentication   Interactive (OIDC, MFA)       Programmatic (certificates, tokens)
Session concept  Recorded interactive session  Connection or request
Access pattern   JIT, approval-based           Pre-authorized, policy-based
Recording        Mandatory                     Optional (per policy)
Credential flow  Human → Teleport → Vault      Workload → Vault directly

13.5.2 Why Not Combine

Argument for combining        Counter-argument
----------------------------  ----------------------------------------
"Both are access management"  Different principals, different patterns
"Same infrastructure"         Share Vault, but different auth paths
"Simpler to have one RFC"     Cleaner to separate concerns

13.5.3 Shared Components

Component  PAM Usage                         Workload Identity Usage
---------  --------------------------------  ------------------------
Vault      SSH certs, DB creds via Teleport  Direct creds via K8s auth
Keycloak   Human SSO                         AI agent delegation
Azure AD   Authorization ceiling             Authorization ceiling
Teleport   Human access broker               Machine ID for VMs

13.6 Architecture Decision Records

ADR-WI-001: SPIFFE/SPIRE as Primary Identity Framework

Status: Accepted

Context: Need unified workload identity across Kubernetes, VMs, and multi-cloud.

Decision: Use the SPIFFE specification, with SPIRE as its implementation, as the primary workload identity framework.

Consequences:

  • Portable, standards-based identity
  • Attestation-based security model
  • Additional infrastructure to manage
  • Training required for teams

ADR-WI-002: Linkerd for Service Mesh Identity

Status: Accepted

Context: Need mTLS and service-to-service authorization.

Decision: Use Linkerd for service mesh with optional SPIRE integration.

Consequences:

  • Lightweight, automatic mTLS
  • Simple authorization policies
  • Less feature-rich than Istio
  • Sufficient for current needs

ADR-WI-003: OAuth 2.0 Token Exchange for AI Agents

Status: Accepted

Context: AI agents need to act on behalf of humans with accountability.

Decision: Use RFC 8693 Token Exchange for delegation.

Consequences:

  • Standards-based delegation
  • Full chain visibility
  • Keycloak configuration needed
  • May need to limit chain depth

ADR-WI-004: Vault Kubernetes Auth as Primary

Status: Accepted

Context: Kubernetes workloads need Vault access without static credentials.

Decision: Use Vault Kubernetes auth method with projected ServiceAccount tokens.

Consequences:

  • No static credentials in cluster
  • Namespace-scoped policies possible
  • Vault becomes dependency
  • Token refresh needed
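
A minimal sketch of the login step this decision implies: the workload reads its ServiceAccount token and posts it, with a Vault role name, to the Kubernetes auth method's login endpoint. The sketch assumes the auth method is mounted at the default path kubernetes; the Vault address and role name in the usage note are hypothetical, and a projected token with a custom audience may be mounted at a different path than the default shown.

```python
import json
import urllib.request

# Default in-pod location of the ServiceAccount token. A projected token
# volume configured in the pod spec may use a different path.
DEFAULT_TOKEN_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/token"


def read_service_account_token(path: str = DEFAULT_TOKEN_PATH) -> str:
    """Read the ServiceAccount JWT that Kubernetes mounts into the pod."""
    with open(path, encoding="utf-8") as handle:
        return handle.read().strip()


def build_vault_login_request(vault_addr: str, role: str, jwt: str,
                              mount: str = "kubernetes") -> urllib.request.Request:
    """Build, without sending, the Vault Kubernetes auth login request."""
    body = json.dumps({"role": role, "jwt": jwt}).encode("utf-8")
    return urllib.request.Request(
        url=f"{vault_addr.rstrip('/')}/v1/auth/{mount}/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Inside a pod, the request could be built with build_vault_login_request("https://vault.example.internal:8200", "payments-api", read_service_account_token()) and sent with urllib.request.urlopen. The JSON response carries the Vault token in auth.client_token and its lifetime in auth.lease_duration, which is what drives the token refresh noted in the consequences above.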

ADR-WI-005: Teleport Machine ID for VMs

Status: Accepted

Context: Non-Kubernetes machines need identity for automation.

Decision: Use Teleport Machine ID (tbot) for VM identity.

Consequences:

  • Consistent identity for VMs
  • Integrates with existing Teleport
  • Requires Teleport infrastructure
  • Cloud attestation support

13.7 Trade-off Summary

Decision                    Trade-off                 Rationale
--------------------------  ------------------------  --------------------------
SPIFFE over custom          Complexity vs standards   Standards enable ecosystem
Linkerd over Istio          Features vs simplicity    Simplicity reduces errors
Token Exchange over custom  Flexibility vs standards  Standards enable interop
Separate from PAM           Consolidation vs clarity  Clarity enables ownership
Short-lived creds           Convenience vs security   Security is non-negotiable

End of Section 13