VPN
IKEv2 VPN Server
IKEv2 VPN server with kernel XFRM IPsec, PSK authentication, NAT traversal, and split-horizon DNS
Overview
The IKEv2 package implements a full IKEv2 VPN server (RFC 7296) with Linux kernel XFRM-based IPsec for high-performance encrypted tunnels. It uses a split-plane architecture: userspace handles IKE control exchanges while the kernel handles all ESP data plane encryption, decryption, routing, and forwarding.
Core capabilities:
- IKE_SA_INIT: Cryptographic negotiation and Diffie-Hellman key exchange
- IKE_AUTH: PSK authentication against LDAP directory with group membership checks
- CREATE_CHILD_SA: ESP Child SA rekeying with overlapping SA support (zero packet loss)
- INFORMATIONAL: Dead Peer Detection (DPD), SA deletion, configuration push (CFG_SET)
- NAT detection and traversal (RFC 3947/3948) with UDP encapsulation on port 4500
- BPF filter on port 4500 socket drops ESP packets at kernel level (95%+ CPU savings)
- Kernel XFRM integration: AES-GCM hardware-accelerated ESP via AES-NI
- Split-horizon DNS server on gateway IP for internal domain resolution
- DNAT rules for transparent service access and optional DNS hijacking
- Per-client nftables firewall chains with CIDR whitelist rules
- Distributed session storage replicated to all nodes (24-hour TTL)
- Distributed IP allocation handled by cluster leader (conflict-free)
- Geo-IP and time-based access restrictions at connection establishment
- Time restriction enforcement on established sessions (60-second check interval)
- IKE SA lifetime enforcement (default 7 days, configurable) with auto-reconnect
- Child SA lifetime management with automatic rekeying (default 24 hours)
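The IKE_AUTH PSK check in the list above relies on the AUTH computation from RFC 7296 section 2.15: prf(prf(PSK, "Key Pad for IKEv2"), SignedOctets). A minimal sketch, assuming the negotiated PRF is HMAC-SHA384 (matching the prfsha384 proposal in the example config); the function names are illustrative and not taken from the package:

```python
import hashlib
import hmac

KEY_PAD = b"Key Pad for IKEv2"  # fixed string defined by RFC 7296 section 2.15


def psk_auth_value(psk: bytes, signed_octets: bytes, prf_hash: str = "sha384") -> bytes:
    """Compute AUTH = prf(prf(PSK, "Key Pad for IKEv2"), SignedOctets)."""
    # Inner PRF: the PSK is the key, the pad string is the message.
    pad_key = hmac.new(psk, KEY_PAD, prf_hash).digest()
    # Outer PRF: the padded key authenticates the signed octets.
    return hmac.new(pad_key, signed_octets, prf_hash).digest()


def verify_psk_auth(psk: bytes, signed_octets: bytes, received_auth: bytes,
                    prf_hash: str = "sha384") -> bool:
    """Constant-time comparison of the expected and received AUTH values."""
    expected = psk_auth_value(psk, signed_octets, prf_hash)
    return hmac.compare_digest(expected, received_auth)
```

How SignedOctets is assembled (the peer's first message, the other side's nonce, and the prf of the ID payload) is left out here; see RFC 7296 for the exact construction.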
Protocol exchanges handled:
IKE_SA_INIT (34) - Unencrypted DH exchange, algorithm negotiation, NAT detection
IKE_AUTH (35) - Encrypted PSK authentication, Child SA establishment
CREATE_CHILD_SA (36) - Encrypted ESP rekeying with optional PFS
INFORMATIONAL (37) - Encrypted DPD probes, DELETE notifications, CFG_SET push
Performance characteristics:
- 10K+ packets/second throughput with kernel crypto (AES-NI)
- ESP packets never reach userspace (BPF filter + kernel XFRM)
- Single XFRM interface (ipsec0) with IP-based differentiation for 1000+ sessions
- 8MB UDP send/receive buffers for burst handling
- TX queue length 100,000 for ~175MB buffered traffic capacity
Platform: Linux only (requires kernel XFRM, netlink, nftables, ip_forward).
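The separation of IKE from ESP on UDP port 4500 follows from RFC 3948 framing: IKE messages carry a 4-byte all-zero non-ESP marker, ESP packets start with a non-zero SPI, and a NAT keepalive is a single 0xFF byte. The sketch below shows that classification in userspace Python for illustration only; the actual BPF filter runs in the kernel, and the function and constant names here are assumptions:

```python
import struct

NON_ESP_MARKER = b"\x00\x00\x00\x00"
# IKE header (RFC 7296 section 3.1): two 8-byte SPIs, next payload, version,
# exchange type, flags, message ID, length. 28 bytes in total.
IKE_HEADER = struct.Struct("!8s8sBBBBII")
EXCHANGE_NAMES = {
    34: "IKE_SA_INIT",
    35: "IKE_AUTH",
    36: "CREATE_CHILD_SA",
    37: "INFORMATIONAL",
}


def classify_udp4500(payload: bytes):
    """Return (kind, exchange_name) for a datagram received on UDP 4500."""
    if payload == b"\xff":
        return ("keepalive", None)
    if len(payload) < 4:
        return ("invalid", None)
    if payload[:4] != NON_ESP_MARKER:
        # Non-zero first word is an ESP SPI: handled by kernel XFRM.
        return ("esp", None)
    body = payload[4:]
    if len(body) < IKE_HEADER.size:
        return ("invalid", None)
    fields = IKE_HEADER.unpack_from(body)
    exchange_type = fields[4]
    return ("ike", EXCHANGE_NAMES.get(exchange_type, "unknown"))
```

Only datagrams classified as "ike" need to reach the userspace control plane; everything else can be dropped from the socket or consumed by the kernel.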
Config
Configuration under [vpn], [vpn.network], [vpn.auth], [vpn.crypto], and [vpn.timeouts]:
[vpn]
enabled = true # Enable IKEv2 VPN server
hostname = "vpn.example.com" # VPN server hostname for IKE negotiation
network_interface = "eth0" # Network interface for IKE/ESP binding
ikev2_port = 500 # IKE UDP port (default: 500)
esp_port = 4500 # NAT-T/ESP UDP port (default: 4500)
single_port = false # All IKEv2 on port 4500 only (L4 LB compatible)
debug = false # Enable detailed protocol debug logging
[vpn.network]
subnet = "100.64.200.0/22" # VPN IP pool subnet (/22: 1022 usable host addresses, gateway included)
gateway = "100.64.200.1" # Gateway IP assigned to XFRM interface (ipsec0)
dns_upstream = ["8.8.8.8", "1.1.1.1"] # Upstream DNS resolvers for external domains
dns_domains = ["internal.example.com"] # DNS search domains pushed to clients
dns_hijack = false # Force all VPN DNS through internal server (DNAT)
mtu = 1420 # VPN tunnel MTU (ESP overhead subtracted)
[vpn.auth]
group = "vpn-users" # Required LDAP group for VPN access
lease_valid = "45d" # PSK expiration period (forces rotation)
[vpn.crypto]
ike_proposals = ["aes256gcm16-prfsha384-ecp384"] # IKE SA algorithm proposals
esp_proposals = ["aes256gcm16-ecp384"] # ESP Child SA algorithm proposals
rekey_interval = "24h" # Child SA rekey interval (ESP key rotation)
ike_sa_lifetime = "168h" # IKE SA maximum lifetime (default: 7 days)
[vpn.timeouts]
dpd_interval = "30s" # Dead Peer Detection probe interval
dpd_timeout = "90s" # DPD timeout before session eviction
session_timeout = "24h" # Maximum session duration (TTL)
Geo and time restrictions under [vpn]:
geo_enabled = true # Enable VPN-specific geo restrictions
geo_allow_countries = ["US", "CA"] # Country whitelist (ISO 3166-1 alpha-2)
geo_deny_countries = ["KP", "IR"] # Country blacklist (takes precedence)
geo_allow_asn = [] # ASN whitelist
geo_deny_asn = [] # ASN blacklist
geo_bypass_cidr = ["10.0.0.0/8"] # Skip geo checks for these CIDRs
time_enabled = true # Enable VPN-specific time restrictions
time_timezone = "America/New_York" # Default timezone for time checks
time_allow_days = ["Mon","Tue","Wed","Thu","Fri"] # Allowed days
time_allow_hours = "08:00-20:00" # Allowed hours (business hours)
time_bypass_cidr = ["10.0.0.0/8"] # Skip time checks for these CIDRs
[[vpn.time_windows]] # Per-region time windows
countries = ["GB", "DE", "FR"]
timezone = "Europe/London"
allow_days = ["Mon","Tue","Wed","Thu","Fri"]
allow_hours = "09:00-18:00"
Supported IKE algorithms:
Encryption: AES-CBC-128/192/256, AES-GCM-16-128/192/256, ChaCha20-Poly1305
Integrity: HMAC-SHA256/384/512 (non-AEAD ciphers only)
PRF: HMAC-SHA256/384/512
DH Groups: MODP-2048(14), MODP-3072(15), MODP-4096(16), ECP-256(19), ECP-384(20), ECP-521(21)
Supported ESP algorithms:
Encryption: AES-CBC-128/192/256, AES-GCM-16-128/192/256, ChaCha20-Poly1305
Integrity: HMAC-SHA256/384/512 (non-AEAD ciphers only)
Kernel uses RFC 4106 encapsulation for AES-GCM: rfc4106(gcm(aes))
XFRM interface details:
Interface: ipsec0 with fixed Interface ID 42
TX queue: 100,000 (burst handling for 10Gbps+ peak throughput)
IP forwarding: /proc/sys/net/ipv4/ip_forward = 1 (enabled automatically)
nftables: inet ikev2_filter (per-client chains) + inet ikev2_nat (DNAT/masquerade)
Hot-reloadable: geo/time restrictions, PSK expiration, debug logging, DNS domains.
Cold (restart required): enabled, network interface, ports, subnet, gateway, crypto proposals.
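When sizing vpn.network.subnet, it can help to compute how many client addresses a pool actually yields. The sketch below uses the example values from the config above and assumes that the network, broadcast, and gateway addresses are never handed to clients; the source does not spell out the allocator's exclusions, and client_pool is a hypothetical helper:

```python
import ipaddress


def client_pool(subnet: str, gateway: str):
    """List the addresses available to clients in a VPN subnet."""
    net = ipaddress.ip_network(subnet)
    gw = ipaddress.ip_address(gateway)
    if gw not in net:
        raise ValueError("gateway must be inside the VPN subnet")
    # hosts() already skips the network and broadcast addresses.
    return [host for host in net.hosts() if host != gw]


pool = client_pool("100.64.200.0/22", "100.64.200.1")
print(len(pool))           # 1021 assignable client addresses
print(pool[0], pool[-1])   # 100.64.200.2 100.64.203.254
```

Under these assumptions a /22 leaves 1021 addresses for clients: 1024 total, minus network, broadcast, and gateway.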
Architecture
Split-plane architecture:
Control Plane (Userspace Go):
- IKE protocol exchanges (SA_INIT, AUTH, CREATE_CHILD_SA, INFORMATIONAL)
- PSK authentication against LDAP directory
- Session lifecycle management with distributed storage
- VPN IP address allocation (handled by cluster leader)
- DPD monitoring, rekey scheduling, statistics collection
- nftables firewall chain management (per-client rules)
Data Plane (Kernel Space):
- ESP packet encryption/decryption (AES-GCM with AES-NI hardware acceleration)
- IP routing through XFRM virtual interface (ipsec0)
- nftables firewall with per-client chains for CIDR whitelist
- NAT masquerading for internet access (POSTROUTING)
- Connection tracking for stateful packet inspection
Packet flow (Client to Internet):
1. Client encrypts IP packet with ESP, sends via UDP 4500
2. Kernel receives UDP, BPF filter drops ESP (only IKE reaches userspace)
3. XFRM inbound state decrypts ESP (AES-GCM hardware-accelerated)
4. Decrypted packet emerges on ipsec0 interface
5. nftables ikev2_forward chain checks per-client rules
6. Kernel IP forwarding routes packet to eth0
7. NAT masquerading rewrites source to server public IP
8. Packet sent to internet
XFRM state per session (2 states + 3 policies):
Inbound state: Client real IP -> Server real IP, server-generated SPI
Outbound state: Server real IP -> Client real IP, client-provided SPI
Inbound policy: clientVPNIP/32 -> 0.0.0.0/0 (XFRM_DIR_IN)
Outbound policy: 0.0.0.0/0 -> clientVPNIP/32 (XFRM_DIR_OUT)
Forward policy: clientVPNIP/32 -> 0.0.0.0/0 (XFRM_DIR_FWD)
All states: AES-GCM 36-byte key (32 cipher + 4 salt), Interface ID 42
Child SA rekeying (CREATE_CHILD_SA):
1. Rekey monitor triggers when ChildSAExpiresAt is reached
2. New ESP keys derived: prf+(SK_d, [g^ir] | Ni | Nr)
3. Old SA keys saved to session.Old* fields
4. New XFRM states installed, old states remain for 60-second transition window
5. Both old and new SPIs accepted simultaneously (zero packet loss)
6. Client sends DELETE for old SPI, old keys zeroized
Background monitors (started at initialization):
Pending auth cleanup: removes stale IKE_AUTH contexts
NAT-T keepalive: sends 0xFF byte every 20 seconds
Session timeout: enforces 24-hour session TTL
DPD monitor: liveness probes every 30 seconds (evict after 3 failures)
Rekey monitor: Child SA lifetime management (every 5 minutes)
Statistics monitor: queries kernel XFRM state counters every 10 seconds
Time restrictions: checks time-based access policy every 60 seconds
Session indexes (distributed, replicated to all nodes):
- Session ID → session data (primary storage, 24h TTL)
- ESP SPI → session (hot path for kernel ESP packet routing)
- VPN IP → session (routing decisions)
- Username → sessions (user session listing)
- VPN IP → allocation record (distributed IP pool)
Split-horizon DNS server (UDP 53 on gateway IP):
1. Service hostname -> gateway IP (direct, no cache, hot-reload safe)
2. Cookie domain wildcard -> gateway IP (*.example.com matches)
3. External domains -> upstream DNS resolvers (cached via DNS module)
DNS domains pushed to clients via IKE_AUTH CP_CFG_REPLY attributes
DNAT rules (nftables ikev2_nat prerouting):
DNS hijack (optional): VPN subnet UDP 53 -> gateway IP:53
Service access (always): VPN subnet TCP gateway:443 -> service_interface_ip:443
Security
Security model and protections:
PSK storage and lifecycle:
- PSKs stored AES-encrypted in LDAP (directory module_data, context: "vpn")
- Decrypted only during IKE_AUTH authentication, never logged or stored plaintext
- Each PSK has valid_until timestamp (RFC 3339), expired PSKs rejected
- Default expiration: 45 days (forces periodic PSK rotation)
- Expiration reminder emails configurable via vpn.auth.psk_reminder_* settings
Multi-step authentication flow:
1. Check user disabled status (via directory service)
2. Verify group membership (user must be in vpn.auth.group)
3. Retrieve and decrypt PSK from LDAP (via directory module data)
4. Check PSK expiration (valid_until timestamp)
5. Verify client AUTH payload (prf(prf(PSK, "Key Pad for IKEv2"), SignedOctets))
Failed authentications logged with reason server-side, not sent to client.
Key zeroization:
- All cryptographic keys zeroed on session removal
- SK_* keys (IKE SA keying material): byte-by-byte zeroing before GC
- ESP keys (encryption, authentication, salts): explicit zeroing on session cleanup
- Old Child SA keys zeroed after 60-second transition window
Replay protection:
- XFRM replay window: 256 packets (kernel maximum without ESN)
- RFC 4303 §3.2: Anti-replay protection REQUIRED
- Values >256 rejected by kernel and fall back to 0 (disabled)
- 256-packet window provides strong protection while maintaining performance
SA lifetime limits (automatic expiration prevents sequence number exhaustion):
- Soft limits: 1GB / 268M packets (triggers rekey warning)
- Hard limits: 2GB / 536M packets (forces SA deletion)
- IKE SA lifetime: 168 hours default (24h min, 30d max configurable)
Per-client firewall (nftables):
- Per-client chains: client_{vpn_ip_underscored} in ikev2_filter table
- Default: 0.0.0.0/0 (route everything through tunnel)
- Configurable per user/group via vpn_sessions authorization
- Cleanup: chain removal + conntrack flush on session deletion
Geo/time access restrictions:
- Geo checks at connection time only (IKE_SA_INIT/IKE_AUTH phase)
- Blocked connections silently dropped (no IKE response, client retries then times out)
- Time restrictions enforced on established sessions (60-second monitor interval)
- Sessions terminated gracefully with DELETE notification when time window closes
- VPN-specific config overrides take precedence, with fallback to the [service] section
XFRM cleanup on startup:
- All XFRM policies and states are flushed before initialization
- Ensures clean kernel state after crashes or restarts
- Prevents stale XFRM entries from interfering with new sessions
Troubleshooting
Common symptoms and diagnostic steps:
VPN client cannot connect (IKE_SA_INIT fails):
- Check VPN service is running: 'vpn stats' for aggregate statistics
- Verify UDP ports 500 and 4500 are reachable from client network
- Check geo restrictions: 'geo lookup <client_ip>' then 'geo check <client_ip>'
- Check time restrictions: 'geo timecheck <client_ip>' for time-based access
- Verify network_interface is correct: must match the interface with public IP
- Check firewall rules: 'firewall rules' for network-level blocks
- Enable debug: set vpn.debug = true for detailed protocol logging
- Start with: 'diagnose user <username>' for cross-subsystem check
Authentication fails (IKE_AUTH rejected):
- Check user is not disabled: 'directory user <username>'
- Verify group membership: user must be in vpn.auth.group (default: vpn-users)
- Check PSK expiration: PSK valid_until may have passed (default: 45 days)
- Verify PSK is registered: 'moduledata inspect <username>' shows vpn module data
- Check logs for specific rejection reason (logged server-side, not sent to client)
- Common: AUTHENTICATION_FAILED with "user not in group" or "PSK expired"
Client connected but no internet access:
- Check XFRM interface: verify ipsec0 is UP with correct gateway IP
- Check IP forwarding: /proc/sys/net/ipv4/ip_forward must be 1
- Verify nftables filter: 'nft list table inet ikev2_filter' for per-client chains
- Verify nftables NAT: 'nft list table inet ikev2_nat' for masquerading rules
- Check routing: VPN subnet must route through ipsec0
- Check kernel XFRM state: 'ip xfrm state' for active encryption states
- Check kernel XFRM policies: 'ip xfrm policy' for traffic selectors
Client connected but cannot reach internal services:
- Check DNS resolution: verify split-horizon DNS resolves service hostname to gateway
- Verify DNAT rule: service access DNAT must redirect gateway:443 to service IP
- Check DNS hijack: if apps use hardcoded DNS, enable vpn.network.dns_hijack = true
- Verify service is listening on network_interface IP (not just localhost)
- Check nftables prerouting: 'nft list chain inet ikev2_nat ikev2_prerouting'
Session drops or unexpected disconnections:
- Check DPD failures: 3+ consecutive DPD failures trigger eviction (30s interval)
- Check session timeout: default 24-hour TTL, shorter if authorization required
- Check time restrictions: sessions terminated when time window closes
- Check IKE SA lifetime: sessions terminated at ike_sa_lifetime (default 7 days)
- View session details: 'vpn list --user=<username>' for session state
- Check statistics monitor: LastActivity should update if traffic is flowing
- Check kernel XFRM statistics: 'ip -s xfrm state' for packet counters
High CPU usage on VPN server:
- Check BPF filter: ESP packets should NOT reach userspace Go code
- Verify kernel XFRM is handling ESP (not a userspace fallback)
- Check error logs for repeated XFRM state installation failures
- Monitor with: 'metrics prometheus vpn' for detailed VPN metrics
Child SA rekey failures:
- Check rekey monitor logs for CREATE_CHILD_SA exchange errors
- Verify client supports server-initiated rekeying
- Check crypto proposal compatibility between client and server
- Old SA transition window: 60 seconds before old keys are deleted
- Metric: child SA rekey success/failure counters in Prometheus
DNS resolution issues for VPN clients:
- Verify DNS server is running on gateway IP port 53
- Check split-horizon config: service.hostname and service.cookie_domain
- Test upstream resolvers: 'dns test <hostname>' for external domain resolution
- If DNS hijack enabled: all UDP 53 traffic intercepted regardless of destination
- Unauthorized sessions: DNS only resolves service hostname (restricted state)
NAT traversal issues:
- Verify NAT detection: check if session.BehindNAT is true for NAT'd clients
- Check port 4500 connectivity: NAT-T requires UDP 4500 reachability
- Verify keepalive: 0xFF byte sent every 20 seconds to maintain NAT mapping
- Check XFRM encap: NAT-T sessions need XFRM_ENCAP_ESPINUDP on states
- Common: symmetric NAT or carrier-grade NAT may cause connectivity issues
Relationships
Module dependencies and interactions:
- directory: User disabled checks and group membership verification.
  Required dependency: VPN init waits for directory service readiness.
- vpn_sessions: Post-authentication authorization with restricted/full firewall
  states. Creates authenticated sessions, provides SPI identifiers for device code
  token binding. IKE DELETE sent on session termination.
- firewall (nftables): Per-client chains in inet ikev2_filter table for CIDR
  whitelist rules. NAT masquerading in inet ikev2_nat table. DNAT rules for DNS
  hijacking and service access. Chain cleanup + conntrack flush on removal.
- dns: Split-horizon DNS server delegates external queries to infrastructure DNS
  module. DNS module provides upstream resolution and caching for non-internal
  domains. VPN DNS listens on gateway IP, UDP port 53.
- distributed memory cache: Distributed session storage (24h TTL,
  replicated to all nodes). Session indexes for fast local lookups. Lifecycle
  callbacks coordinate session creation and deletion.
- IP pool: VPN IP allocation and release operations handled by the cluster
  leader for conflict-free distributed allocation.
- geoaccess: Geo-IP restriction checks at IKE packet reception. VPN-specific
  config overrides with fallback to [service] section. Silently drops blocked
  packets (no IKE response sent).
- timeaccess: Time-based restriction checks at connection and during session.
  Monitor enforces time policy every 60 seconds on established sessions. Graceful
  disconnect with IKE DELETE when time window closes.
- sessions: Authentication session enforcement. VPN PSK provisioning requires
  active web session. Session cookie domain used for split-horizon DNS matching.
- certificates: TLS certificate for service hostname used in DNAT service access.
  ACME CA integration for automatic certificate management.
- config: Hot-reload support for geo/time restrictions, debug mode, DNS domains.
  Cold restart required for core VPN settings (ports, subnet, crypto).
- telemetry: Structured logging at all protocol stages. Prometheus metrics for
  sessions_active, sessions_approaching_lifetime, sessions_terminated_total.
Load balancer
L4 Load Balancer Deployment (single_port mode):
When deploying behind L4 load balancers (GKE, AWS NLB, Azure LB, MetalLB) that perform SNAT (externalTrafficPolicy: Cluster), enable single_port mode:
[vpn]
single_port = true
What this does:
- Disables the port 500 listener: all IKEv2 traffic on port 4500 only
- Forces BehindNAT=true for all sessions (NAT-T encapsulation always)
- Uses SPI-based cookies instead of IP:port (stable across SNAT)
- Wildcard inbound XFRM source (0.0.0.0): kernel matches by SPI only
- Automatic SNAT address tracking on authenticated IKE packets
- Outbound XFRM state updated transparently when source address changes
Requirements:
- L4 LB must support UDP session affinity (5-tuple conntrack)
- GKE: externalTrafficPolicy: Cluster works (Maglev per-flow hashing)
- AWS NLB, Azure LB, MetalLB: all support UDP conntrack by default
- Client must support NAT-T (strongSwan: encap=yes, port 4500)
- DPD keepalives (30s default) prevent conntrack timeout
Why single port eliminates the dual-port problem:
Standard IKEv2 uses port 500 (IKE_SA_INIT) + port 4500 (NAT-T/ESP). L4 LBs cannot guarantee both UDP ports route to the same backend node. Single-port mode puts everything on 4500: one UDP 5-tuple = one conntrack entry = one backend node for all IKE + ESP traffic.
Multi-node behavior:
UDP conntrack ensures all packets from a client flow to the same node.
Node restart: client detects DPD timeout, retries IKE_SA_INIT (2-3 seconds).
Node removal: LB rebalances, same retry behavior.
Troubleshooting:
- pendingAuth miss in single-port mode: log includes LB session affinity hint
- SNAT address changes logged at INFO level: "SNAT address change detected"
- Verify inbound XFRM: 'ip xfrm state' should show src 0.0.0.0/0
- Verify only port 4500 in netstat (no port 500)
VPN DNS Server
DNS server for VPN clients with ACL enforcement, rate limiting, DNSSEC validation, and wildcard dynamic rule injection
Overview
The VPN DNS server listens on the gateway IP port 53 for DNS queries from IKEv2 VPN clients. All query resolution is delegated to the infrastructure DNS module, providing cluster-wide caching, health checking, adaptive resolver selection, and lookup coalescing without duplicating logic.
Core capabilities:
- Unified resolution: ALL query types (A, AAAA, TXT, MX, SRV, CAA, NS, SOA, PTR, etc.)
  route through the infrastructure DNS module
- DNS ACL enforcement: firewall-based access control on internal domain responses
  (defense-in-depth alongside packet filtering, prevents information disclosure)
- Per-client rate limiting: O(1) time-bucketed rolling window, 500 qpm default,
  DNS REFUSED (Rcode 5) on limit exceeded (RFC 5358 compliant)
- Worker pool: bounded concurrent query processing (1000 workers max) with
  non-blocking semaphore and immediate REFUSED on pool exhaustion
- DNSSEC validation: full cryptographic chain-of-trust via infrastructure DNS module
- DNS-over-TLS (DoT): encrypted upstream transport via infrastructure DNS module (RFC 7858)
- Split DNS: configurable dns_domains for internal zone resolution (RFC 8598)
- Wildcard DNS ACL: dynamic nftables rule injection for pattern-based ACL entries,
  with synchronous injection before DNS response to prevent race conditions
- Buffer pooling: sync.Pool for UDP read buffers (60-80% fewer allocations)
- Query size validation: 12-512 byte bounds enforcement (RFC 1035)
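The per-client rate limiter above is described as an O(1) time-bucketed rolling window. One way such a limiter can work is sketched below, assuming 60 one-second buckets and a running total; the real bucket layout is not documented here, and the class name and parameters are illustrative:

```python
class RollingWindowLimiter:
    """Fixed ring of time buckets with a running total of recent queries."""

    def __init__(self, limit: int = 500, buckets: int = 60, bucket_seconds: float = 1.0):
        self.limit = limit
        self.buckets = buckets
        self.bucket_seconds = bucket_seconds
        self.counts = [0] * buckets
        self.total = 0
        self.last_tick = None

    def allow(self, now: float) -> bool:
        """Return True if a query at time `now` (seconds) is within the limit."""
        tick = int(now // self.bucket_seconds)
        if self.last_tick is None:
            self.last_tick = tick
        elapsed = tick - self.last_tick
        if elapsed > 0:
            # Expire buckets that fell out of the window. The loop is bounded
            # by the bucket count, so the cost does not grow with query volume.
            for i in range(1, min(elapsed, self.buckets) + 1):
                idx = (self.last_tick + i) % self.buckets
                self.total -= self.counts[idx]
                self.counts[idx] = 0
            self.last_tick = tick
        if self.total >= self.limit:
            return False  # caller would answer REFUSED (Rcode 5)
        self.counts[tick % self.buckets] += 1
        self.total += 1
        return True
```

A server would keep one limiter per client VPN IP, for example in a dictionary, and pass a monotonic clock reading as `now`.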
Query resolution flow:
1. VPN client sends DNS query to configured DNS server IP
2. Packet routing intercepts port 53 traffic and forwards to gateway:53
3. Rate limiter checks per-client query count (REFUSED if exceeded)
4. Worker pool semaphore acquired (REFUSED if pool exhausted)
5. Query parsed and validated (size bounds, DNS wire format)
6. Resolution via infrastructure DNS module (all query types)
7. ACL check for internal domains (REFUSED if denied)
8. Wildcard rule injection if pattern-based ACL match
9. DNS wire format response constructed and returned to client
Config
Configuration spans [dns], [vpn.network], and [vpn] sections:
Infrastructure DNS resolvers (shared with all Hexon components):
[dns]
resolvers = ["192.168.11.101", "192.168.11.102"] # Upstream DNS resolvers
timeout = 5 # Query timeout in seconds (default: 5)
cache_ttl = 300 # Default cache TTL in seconds (default: 300)
health_check_enabled = true # Resolver health monitoring (default: true)
dnssec_full_validation = true # Full DNSSEC cryptographic validation
dnssec_strict = false # Allow unsigned zones (default: false)
dot_enabled = false # DNS-over-TLS transport (default: false)
dot_port = 853 # DoT port per RFC 7858
VPN-specific DNS settings:
[vpn.network]
dns_cache_ttl = "5m" # VPN-specific cache TTL override
dns_cache_size = 1000 # Maximum cache entries
dns_hijack = false # Intercept all port 53 traffic
dns_domains = ["hexon.private"] # Split DNS domains for ACL enforcement (RFC 8598)
wildcard_max_hosts_per_domain = 100 # Max hostnames per wildcard pattern (default: 100)
wildcard_max_hosts_total = 1000 # Max total tracked wildcard hostnames (default: 1000)
Rate limiting:
[vpn]
dns_queries_per_minute = 500 # Max DNS queries per minute per VPN client
# 0 or unset = use default (500)
DNS response codes:
- NOERROR (Rcode 0): successful resolution (ACL allowed or no check needed)
- SERVFAIL (Rcode 2): DNS resolution error or DNSSEC validation failure
- NXDOMAIN (Rcode 3): domain not found (unauthorized sessions)
- REFUSED (Rcode 5): ACL denied, rate limited, or worker pool exhausted
Hot-reloadable: dns.resolvers, DNSSEC settings, cache_ttl, health check parameters, dns_domains, wildcard limits, dns_queries_per_minute.
Cold (restart required): dot_enabled, dot_port, dns_hijack.
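The response codes above can be read as a decision ladder. The sketch below condenses the documented outcomes into one function; the order of the checks, the point at which the unauthorized-session check happens, and all parameter names are assumptions made for illustration, not the server's actual code path:

```python
from enum import IntEnum


class Rcode(IntEnum):
    NOERROR = 0
    SERVFAIL = 2
    NXDOMAIN = 3
    REFUSED = 5


def choose_rcode(*, rate_limited=False, pool_exhausted=False, authorized=True,
                 is_service_hostname=False, resolved=True,
                 matches_dns_domains=False, acl_allowed=True) -> Rcode:
    """Pick a DNS response code from the outcome of each processing stage."""
    if rate_limited or pool_exhausted:
        return Rcode.REFUSED
    if not authorized and not is_service_hostname:
        # Unauthorized sessions only resolve the service hostname.
        return Rcode.NXDOMAIN
    if not resolved:
        # Upstream failure or DNSSEC validation failure.
        return Rcode.SERVFAIL
    if authorized and matches_dns_domains and not acl_allowed:
        return Rcode.REFUSED
    return Rcode.NOERROR
```

For example, choose_rcode(matches_dns_domains=True, acl_allowed=False) yields Rcode.REFUSED, matching the "ACL denied" case in the list above.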
Troubleshooting
Common symptoms and diagnostic steps:
VPN client cannot resolve any hostnames:
- Check if session is authorized: 'vpn list --user=<username>' to see Authorized field
- Unauthorized sessions only resolve the service hostname (captive portal pattern)
- Check DNS server is running: 'logs search "ikev2.dns" --since=5m'
- Verify infrastructure DNS health: 'dns resolvers' shows resolver status
- Cross-subsystem check: 'diagnose user <username>' for full access diagnostic
DNS queries returning REFUSED (Rcode 5):
- Rate limited: check 'logs search "DNS rate limit exceeded"'
  - Per-client limit is 500 qpm by default (vpn.dns_queries_per_minute)
  - Rate limiter is local-only (VPN clients have strict per-node affinity)
- ACL denied: check 'logs search "DNS query blocked by ACL"'
  - Only applies to domains matching dns_domains configuration
  - Verify user group membership: 'directory user <username>'
  - Check firewall rules: 'firewall check <username>'
- Worker pool exhausted: check 'logs search "pool exhausted"'
  - Default max concurrent queries: 1000
DNS queries returning SERVFAIL (Rcode 2):
- DNSSEC validation failure: check dnssec_full_validation and dnssec_strict settings
- Unsigned zone in strict mode: set dnssec_strict=false or disable per-route
- Upstream resolver failure: 'dns resolvers' shows health and latency
Internal hostname resolves but connection fails:
- Wildcard ACL gap: DNS returns IP but nftables has no rule for that IP
- Check wildcard limits: wildcard_max_hosts_per_domain (default 100), wildcard_max_hosts_total (default 1000)
- Limits reached: 'metrics prometheus firewall_wildcard_limit_hit_total'
- TTL expired: dynamic rules removed after DNS TTL (min 5min, max 24h)
Slow DNS resolution for VPN clients:
- Cache hit rate: 'dns health' shows cluster-wide cache statistics
- Typical performance: cache hits ~1ms, misses ~5-50ms
- ACL check overhead: ~5-10ms (only for dns_domains, ~1-5% of queries)
- Check: 'dns resolvers' for per-resolver latency
DNS ACL not enforcing (internal domains resolve for unauthorized users):
- Verify dns_domains is configured: domains must match for ACL to apply
- Session must be authorized (ACL only applies to authorized sessions)
- Firewall must be enabled (disabled firewall = no ACL enforcement)
- Fail-closed: errors in ACL check return REFUSED (not bypass)
Relationships
Module dependencies and interactions:
- DNS module: ALL resolution delegated to the infrastructure DNS service.
  Provides cluster-wide caching, DNSSEC validation, adaptive resolver selection,
  health checking with circuit breaker, lookup coalescing, and DoT transport.
  VPN DNS adds no caching or resolution logic of its own.
- Firewall: DNS ACL enforcement for internal domains.
  Basic allow/deny check: determines if user can access the queried hostname.
  Extended check with wildcard detection: returns match type (exact, wildcard, or
  none) so that dynamic nftables rules can be injected for wildcard patterns.
  Cluster-wide hostname tracking with TTL-based expiry for wildcard matches.
  Dynamic nftables rule injection to peer chain before DNS response is sent.
  Fail-closed: ACL errors return REFUSED to prevent information disclosure.
- VPN IKEv2: Parent module that initializes the DNS server on VPN start.
  Provides session information for username lookup and authorization status.
- VPN sessions: Authorization state determines DNS behavior.
  Unauthorized sessions: only service hostname resolves (captive portal).
  Authorized sessions: full resolution with group-based ACL enforcement.
- Directory: User group memberships drive ACL rule matching.
  Groups fetched by firewall module during ACL checks (not cached in DNS server).
  Case-insensitive OR matching across user groups.
- Distributed memory: Cluster-wide wildcard hostname tracking with automatic
  TTL-based cleanup. When TTL expires, dynamic nftables rules are removed.
- Config: Reads [dns], [vpn.network], and [vpn] sections. Hot-reload updates
  resolvers, DNSSEC settings, dns_domains, wildcard limits, rate limit threshold.
- Telemetry: Structured logging with module "ikev2.dns" and "ikev2.dns.ratelimit".
  Metrics for query totals, ACL results, rate limiting, pool exhaustion, query
  duration histograms, and wildcard tracking.
VPN Session Authorization
Post-authentication VPN session authorization with device code support and group-based firewall management
Overview
The vpn_sessions module manages post-authentication authorization for IKEv2 VPN sessions. After PSK authentication, sessions enter a restricted state with limited network access (DNS + signin service only). Users must complete authorization via the signin service or device code flow before gaining full network access.
Core capabilities:
- Two-phase VPN access: PSK authentication first, then web-based authorization
- Restricted firewall chain on connect (DNS UDP 53 + service TCP port only)
- Full firewall chain upgrade after successful authorization with group-based rules
- Device code authorization (RFC 8628) for headless clients and mobile VPN
- QR code display at /vpn/connect for easy device code scanning
- Token binding: device code cryptographically bound to IKE SA via SHA256 hash of SPI values
- Automatic disconnection of unauthorized sessions after configurable timeout
- User disable handling: immediate VPN termination when user disabled in LDAP
- Group change monitoring with dual-tier approach (event-driven + periodic backup)
- CFG_SET policy push to VPN clients on group membership changes (RFC 7296)
- Idempotent re-authorization (already authorized sessions succeed silently)
- Atomic rollback: firewall upgrade failure rolls back authorization state
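The token binding in the list above ties a device code to one IKE SA through a SHA256 hash of SPI values. A minimal sketch of the idea, assuming the two 8-byte IKE SPIs are concatenated initiator-first before hashing; the source does not specify the exact input layout, and the function names are hypothetical:

```python
import hashlib
import hmac


def spi_binding(initiator_spi: bytes, responder_spi: bytes) -> str:
    """Derive a binding token from the IKE SA's SPI pair."""
    if len(initiator_spi) != 8 or len(responder_spi) != 8:
        raise ValueError("IKE SPIs are 8 bytes each")
    return hashlib.sha256(initiator_spi + responder_spi).hexdigest()


def binding_matches(stored: str, initiator_spi: bytes, responder_spi: bytes) -> bool:
    """Check a stored binding against the SPIs of the session being authorized."""
    current = spi_binding(initiator_spi, responder_spi)
    return hmac.compare_digest(stored, current)
```

Because a reconnect negotiates new SPIs, a binding created before the reconnect no longer matches and the device code has to be regenerated.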
Authorization flow (standard mode):
1. User connects to VPN, PSK authentication (IKE_AUTH) succeeds
2. Session created: authenticated but not yet authorized, short TTL
3. Restricted firewall chain applied (captive portal pattern)
4. DNS resolver only resolves service hostname for unauthorized sessions
5. User accesses signin service via VPN tunnel
6. Signin service calls AuthorizeSession with user groups
7. Firewall chain upgraded from restricted to full access
8. Session TTL extended to 24h (full session lifetime)
Authorization flow (device code mode, RFC 8628):
1. User connects to VPN, PSK authentication succeeds
2. Device code generated automatically with SPI token binding
3. User visits /vpn/connect from VPN client, sees QR code + 8-char user code
4. User scans QR or visits /vpn/device on secondary device (phone/laptop)
5. User authenticates via OIDC on secondary device
6. User enters 8-character code (BCDF-GHJK format, BASE20 charset)
7. VPN client polls and detects authorization, full access granted
Platform: works on all platforms (no Linux-specific requirements). Firewall chain operations delegated to the firewall module.
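The 8-character user code in the device code flow uses the BASE20 charset in BCDF-GHJK format. The sketch below generates and normalizes such codes using the consonant-only alphabet suggested in RFC 8628 section 6.1; whether the module uses this exact alphabet and normalization is an assumption, and both function names are illustrative:

```python
import secrets

# 20 consonants, no vowels: avoids accidental words and look-alike characters.
BASE20 = "BCDFGHJKLMNPQRSTVWXZ"


def new_user_code() -> str:
    """Generate a random user code such as 'WDJB-MJHT'."""
    chars = [secrets.choice(BASE20) for _ in range(8)]
    return "".join(chars[:4]) + "-" + "".join(chars[4:])


def normalize_user_code(text: str) -> str:
    """Uppercase the input, drop characters outside the charset, re-insert the dash."""
    cleaned = "".join(c for c in text.upper() if c in BASE20)
    if len(cleaned) != 8:
        raise ValueError("user code must contain 8 BASE20 characters")
    return cleaned[:4] + "-" + cleaned[4:]
```

Normalizing before lookup lets users type the code in lowercase, with or without the dash or extra spaces.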
Config
Configuration under [vpn.auth] and [vpn.auth.device_code]:
[vpn.auth]
group = "vpn-users" # Required LDAP group for VPN authorization
authorization_required = true # Enable post-connect authorization requirement
authorization_timeout = "10m" # Time limit to complete authorization (default: 10m)
group_refresh_interval = "15m" # Backup group monitor interval (default: 15m)
[vpn.auth.device_code]
enabled = false # Enable device code mode (RFC 8628)
code_ttl = "10m" # Device code validity period
polling_interval = "5s" # Client polling interval hint
send_email = true # Send authorization email with code to user
Authorization timeout behavior:
- When authorization_required = true, sessions get a short TTL (authorization_timeout)
- Successful authorization extends TTL to 24h (full session lifetime)
- Expired TTL triggers handleIKEv2SessionExpired, terminating unauthorized sessions
- When authorization_required = false, sessions get 24h TTL immediately
Group refresh behavior:
- Primary: event-driven callbacks from directory sync (immediate)
- Backup: periodic monitor at group_refresh_interval (catches missed callbacks)
- Group hash comparison: SHA256 of sorted group names, skip if unchanged
- Firewall chain update + CFG_SET push on group change
Hot-reloadable: authorization_timeout, group_refresh_interval, device code settings.
Cold (restart required): authorization_required, group.
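The group hash comparison above (SHA256 of sorted group names, skip if unchanged) can be sketched as follows. The newline separator and the de-duplication step are assumptions chosen to make the hash order-independent and unambiguous; the module's exact canonical form is not documented here:

```python
import hashlib


def group_hash(groups) -> str:
    """Hash a user's group set so that membership changes are cheap to detect."""
    canonical = "\n".join(sorted(set(groups)))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def groups_changed(previous_hash: str, groups) -> bool:
    """True when the firewall chain update and CFG_SET push should run."""
    return group_hash(groups) != previous_hash
```

Storing only the hash on the session lets both the event-driven callback and the periodic backup monitor skip the firewall update when nothing changed.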
Troubleshooting
Common symptoms and diagnostic steps:
User connects to VPN but cannot access any services:
- Session may be in restricted state (not yet authorized) - Check session status: 'vpn list --user=<username>' to see Authorized field - Verify user completed signin flow or device code authorization - Check authorization_timeout: user may have been disconnected for timeout - Metric: vpn_sessions_authz_timeout_terminations_total shows timeout disconnectsDevice code not appearing at /vpn/connect:
- Verify vpn.auth.device_code.enabled = true in configuration
- Check LookupDeviceCodeByIP response: must query all cluster nodes to find correct node
- Verify VPN client IP is correct in the lookup request
- Check device code TTL: code may have already expired (default 10m)
- Verify devicecode module is registered and healthy
Authorization succeeds but no network access:
- Check firewall chain upgrade: AuthorizeSession calls UpdatePeerChain
- Verify user group memberships match firewall ACL rules
- Check for atomic rollback: if firewall upgrade fails, authorization is rolled back
- Look for "firewall upgrade failed" in logs (authorization state reverted)
- Verify groups passed in AuthorizeSession request match directory groups
Session binding mismatch error during device code authorization:
- Device code was generated for a different IKE SA than the one being authorized
- Possible cause: VPN reconnection generated new SPI values
- SPI binding: SHA256 hash of SPI values must match between creation and verification
- Solution: user should generate a new device code after reconnecting
User disconnected unexpectedly:
- Check authorization timeout: unauthorized sessions expire after authorization_timeout
- Check user disable: disabled users are immediately disconnected via directory callback
- Check group changes: if user loses required group, access is revoked
- Check session TTL in distributed memory cache: verify TTL extension happened after authorization
- Metric: vpn_sessions_authz_timeout_terminations_total for timeout-related disconnects
Group changes not reflected in firewall:
- Primary path (immediate): directory sync triggers OnUserUpdated callback
- Backup path (periodic): groupMonitor checks at group_refresh_interval
- Verify directory sync is running: 'directory status' to check sync health
- Check group hash comparison: if SHA256 of sorted groups unchanged, no update occurs
- CFG_SET push is best-effort: many IKEv2 clients ignore server-initiated CFG_SET
Multiple sessions for same user on different nodes:
- LookupDeviceCodeByIP queries all cluster nodes to find the session
- AuthorizeSession runs on the specific node hosting the session
- Verify correct node is targeted for authorization operations
DNS not resolving for VPN client:
- Unauthorized sessions: DNS resolver only resolves service hostname
- Authorized sessions: DNS resolves based on group-based ACL rules
- Check DNS module health and conditional resolution configuration
- Verify restricted firewall chain allows DNS UDP port 53
Security
Security features and protections:
ACL protection on operations:
- LookupDeviceCodeByIP: restricted to VPN service handlers only
- AuthorizeSession: restricted to signin service handlers only
- Prevents unauthorized callers from authorizing sessions
Token binding (device code to IKE SA):
- Binding computed as SHA256 hash of the IKE SA's SPI values at device code creation
- Verified at authorization time to prevent replay attacks
- Ensures device code cannot authorize a different VPN session
- Graceful degradation: binding optional (skipped for WireGuard, direct connect)
Username verification:
- AuthorizeSession verifies request username matches session username
- Prevents authorizing a session under a different user identity
Atomic rollback:
- Firewall upgrade failure automatically rolls back authorization state
- Session remains in restricted state if firewall chain update fails
- Prevents partially authorized sessions with inconsistent firewall rules
Timeout enforcement:
- Unauthorized sessions disconnected after authorization_timeout (default 10m)
- TTL-based expiry via distributed memory cache OnDelete callback
- Only non-authorized sessions are terminated on TTL expiry
Device code security (RFC 8628 Section 5.2):
- Constant-time comparison for code verification
- BASE20 charset to avoid profanity in generated codes
- 8-character codes with BCDF-GHJK hyphenated format for readability
User disable immediate disconnection:
- Directory sync detects user.Disabled = true
- All VPN sessions for disabled user terminated immediately
- IKE DELETE notifications sent to clients
- XFRM state and firewall rules cleaned up
Group-based access revocation:
- Group changes detected via callback (immediate) and periodic monitor (backup)
- Loss of required group triggers firewall chain update
- DNS stops resolving previously-allowed hostnames
- CFG_SET push notifies client of DNS configuration changes
Relationships
Module dependencies and interactions:
- ikev2: Creates VPN sessions (authenticated but not yet authorized). Sets session TTL based on authorization requirement (short for authz-required, 24h for standard). Provides SPI identifiers for device code token binding. IKE DELETE notifications sent on session termination.
- firewall (xfrmi_nftables): Manages restricted and full-access firewall chains. CreatePeerChain for restricted state, UpdatePeerChain for full access upgrade. Group memberships passed via AllowedGroups field for ACL rule generation.
- Directory: Provides user group memberships for authorization decisions. OnUserSet callback fires on user changes (disable, group updates). Delta sync triggers OnUserUpdated for immediate group change detection. Groups fetched at authorization time and during periodic monitoring.
- devicecode: RFC 8628 device code generation and verification. Creates 8-character human-readable codes with configurable TTL. Manages code lifecycle (creation, verification, expiry).
- sessions (distributed memory cache): TTL-based session storage with OnDelete callbacks. Short TTL for unauthorized sessions, extended on successful authorization. DeleteCallback "vpn_authz_timeout" handles timeout enforcement.
- dns: Conditional hostname resolution based on authorization status. Unauthorized sessions: only service hostname resolved. Authorized sessions: group-based DNS ACL enforcement.
- signin: Web-based authorization flow. Calls AuthorizeSession after user authenticates via the signin service running inside the VPN tunnel.
- vpn service: HTTP handlers for /vpn/connect (QR code display) and /vpn/device (secondary device authorization). Calls LookupDeviceCodeByIP.
- config: Hot-reload of timeout, refresh interval, and device code settings.
- telemetry: Structured logging for authorization events, group changes, timeout enforcement. Metrics for authorization outcomes and timeouts.
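How the ikev2, firewall, and sessions modules interact during authorization can be sketched as below. This is a simplified model under stated assumptions, not the actual implementation: the Session fields, the byte layout fed to SHA256 (two big-endian 64-bit SPIs), the FirewallError type, and the update_peer_chain callback (a stand-in for the firewall module's UpdatePeerChain) are all hypothetical.

```python
import hashlib
import hmac
import struct
from dataclasses import dataclass, field

AUTHZ_TIMEOUT_S = 10 * 60        # authorization_timeout default (10m)
FULL_SESSION_TTL_S = 24 * 3600   # full session lifetime (24h)


def spi_binding(initiator_spi: int, responder_spi: int) -> bytes:
    """SHA-256 over both 64-bit IKE SPIs; the byte layout is an assumption."""
    return hashlib.sha256(struct.pack(">QQ", initiator_spi, responder_spi)).digest()


@dataclass
class Session:
    username: str
    initiator_spi: int
    responder_spi: int
    authorized: bool = False
    ttl_s: int = AUTHZ_TIMEOUT_S
    groups: list = field(default_factory=list)


class FirewallError(Exception):
    """Hypothetical error raised when the chain upgrade fails."""


def authorize_session(session, username, binding, groups, update_peer_chain):
    """Return True on success; leave the session restricted on any failure."""
    if username != session.username:
        return False  # request is for a different user identity
    expected = spi_binding(session.initiator_spi, session.responder_spi)
    # Binding is optional; when present it must match this IKE SA.
    if binding is not None and not hmac.compare_digest(binding, expected):
        return False
    previous = (session.authorized, session.groups)
    session.authorized, session.groups = True, list(groups)
    try:
        update_peer_chain(session)  # upgrade restricted chain to full access
    except FirewallError:
        session.authorized, session.groups = previous  # atomic rollback
        return False
    session.ttl_s = FULL_SESSION_TTL_S  # extend TTL only after full success
    return True
```

In this sketch the TTL is extended last, so a session whose firewall upgrade fails keeps its short TTL and is still subject to timeout enforcement.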