Protection
End-to-Origin Encryption
AES-256-GCM over ECDH P-256 for all browser traffic — CDN/WAF/proxy protection
Overview
E2OE encrypts browser-to-server traffic, protecting against CDN/WAF/proxy plaintext exposure.
Architecture:
Crypto layer: Pure ECDH, HKDF, AES-GCM, sequence validation
Runtime layer: Channel state, middleware, HTTP handlers, Service Worker
Proxy integration: Interception of E2OE endpoints on proxied hosts
Channel lifecycle:
1. Browser loads page → channel.js injected (before app JS)
2. ECDH key exchange → channel established in session
3. Service Worker registered → transparently decrypts responses
4. All fetch/XHR traffic encrypted between browser and server
5. Middleware decrypts request, encrypts response — handlers see plaintext
Multi-channel support:
- Sessions support multiple concurrent channels (one per tab/origin)
- Multiple tabs work without conflicts (no 421 ping-pong)
Service Worker:
- Transparently decrypts encrypted responses for navigations and API calls
- First page load after init is plaintext (SW activating), subsequent encrypted
- Falls back gracefully on any error
Session-backed channels:
- Key stored in session metadata (JetStream KV replicated, cluster-aware)
- TTL = session TTL (PoW or auth session)
- No separate channel store or expiry management
Access gate: valid PoW cookie (pre-auth) or session cookie (post-auth). Config: e2oe = true requires protection.pow = true.
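As a sketch of the crypto layer's key derivation: after the browser and server complete the ECDH P-256 exchange, both sides expand the shared secret into the AES-256-GCM channel key via HKDF. The stdlib-only Python below implements RFC 5869 HKDF-SHA256; the empty salt and the "hexon-e2oe-channel" info label are illustrative assumptions, not the product's actual parameters.

```python
import hashlib
import hmac

def hkdf_sha256(secret: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """RFC 5869 HKDF-SHA256: extract a pseudorandom key, then expand to `length` bytes."""
    prk = hmac.new(salt or b"\x00" * 32, secret, hashlib.sha256).digest()  # extract
    okm, block, counter = b"", b"", 1
    while len(okm) < length:  # expand
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]

# Hypothetical usage: `shared` would be the ECDH P-256 shared secret negotiated
# via POST /_hexon/e2oe/init; here it is a placeholder value.
shared = bytes.fromhex("aa" * 32)
aes_key = hkdf_sha256(shared, salt=b"", info=b"hexon-e2oe-channel")
assert len(aes_key) == 32  # AES-256-GCM key size
```

Both endpoints derive the same 32-byte key from the shared secret, so the channel key itself never crosses the wire.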
Injection coverage:
- Service pages (login, console, profile): via server-rendered page templates
- Proxied apps with rewrite_host=true: via HTML rewriter (script injection)
- Proxied apps with rewrite_host=false: NOT injected (zero-copy streaming mode)
These apps can add channel.js manually if needed. No injection = no encryption overhead, no breakage. E2OE is transparent: no channel header = passthrough.
Tiers:
- Tier 2 (baseline): Standard ECDH, passive MitM resistance. Automatic for all browsers.
- Tier 1 (WebAuthn): ECDH commitment in WebAuthn challenge, hardware MitM resistance. Auto-upgrades when user logs in with passkey. Persists across page loads via rebind proof. Inherits across origins via session secret in metadata.
Endpoints
POST /_hexon/e2oe/init ECDH key exchange (PoW or session cookie required)
GET /_hexon/e2oe/channel.js Browser-side encryption JS (SRI hash, cache-busted)
GET /_hexon/e2oe/sw.js Service Worker for response decryption
Config
[service]
e2oe = false # Enable E2OE (requires protection.pow = true)
e2oe_strict = false # Reject ALL requests without E2OE channel (no degradation)
Troubleshooting
Common issues:
- E2OE not working: verify e2oe = true AND protection.pow = true
- 421 errors: channel expired (session restart) or e2oe_strict enabled without channel
- Proxied app not encrypted: check rewrite_host=true (required for JS injection)
- Channel expired: parent session expired (check session_ttl / pow_session_ttl)
- Tier 1 not activating: user must log in with passkey (WebAuthn), check audit logs
- Multi-tab: each tab gets own channel, no conflicts
- First page after init: plaintext (Service Worker activating), refresh to encrypt
Network Firewall
nftables-based firewall with per-peer VPN isolation and group-based policies
Overview
The firewall module provides a generic nftables abstraction layer for VPN implementations (IKEv2, WireGuard, OpenVPN). It manages per-peer chain isolation, flexible policy modes, group-based ACL, and cluster-wide operations.
Core capabilities:
- Per-peer nftables chain isolation (dedicated chain per VPN client)
- Three policy modes: restricted (captive portal), full (group-based ACL), custom
- Group-based ACL with host aliases, port aliases, and LDAP group matching
- Service DNAT for redirecting VPN traffic to the Hexon web portal
- Wildcard DNS dynamic rule injection for pattern-based ACL entries
- DNS background refresh with TTL-based re-resolution and automatic peer updates
- Dual-stack IPv4/IPv6 with deterministic address mapping (ULA fd7a:ec0a::/48)
- NAT64 for IPv6 VPN clients accessing IPv4 services
- nftables connection pooling for reduced netlink socket overhead
- DoS prevention via max rules per chain and wildcard hostname limits
- Audit logging for all ACL changes (create, update, remove)
- Input validation: ASCII-only peer IDs, interface name regex, CIDR validation
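The deterministic dual-stack mapping can be sketched by embedding the client's 32-bit IPv4 address in the low bits of the ULA prefix. The exact bit layout is internal; this illustrative version reproduces the example address used in the troubleshooting section (100.64.204.1 maps to fd7a:ec0a::6440:cc01):

```python
import ipaddress

ULA_PREFIX = ipaddress.IPv6Network("fd7a:ec0a::/48")

def map_ipv4_to_ipv6(ipv4: str) -> ipaddress.IPv6Address:
    """Embed the 32-bit IPv4 address in the low bits of the ULA prefix (deterministic)."""
    return ipaddress.IPv6Address(
        int(ULA_PREFIX.network_address) | int(ipaddress.IPv4Address(ipv4))
    )

# A VPN client at 100.64.204.1 gets the IPv6 address fd7a:ec0a::6440:cc01
assert str(map_ipv4_to_ipv6("100.64.204.1")) == "fd7a:ec0a::6440:cc01"
```

Because the mapping is a pure function of the IPv4 lease, every node in the cluster derives the same IPv6 address without coordination.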
Policy modes:
Restricted: Captive portal mode. VPN clients can only reach the Hexon web portal for authentication. All other traffic is dropped. Applied on initial VPN connect before the user authenticates.
Full: Group-based ACL with internet access. Three-tier evaluation:
1. Block RFC1918 and reserved networks (10/8, 172.16/12, 192.168/16, 169.254/16, 127/8, 224/4, 240/4)
2. Allow ACL exceptions based on user LDAP group memberships
3. Allow all internet traffic (non-RFC1918)
Applied after successful authentication via UpdatePeerChain.
Custom: Operator-defined rules only. No automatic internet access or RFC1918 blocking. Full control over allowed destinations.
Platform: requires Linux with nftables (kernel 3.13+) and CAP_NET_ADMIN. Non-Linux platforms use stub implementations that return errors gracefully.
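The three-tier PolicyFull evaluation can be sketched as follows. Note that the ACL-exception check runs before the RFC1918 block here, mirroring how accept rules precede drop rules in a generated nftables chain (the security section notes ACL rules can override RFC1918 blocks but not internet blocks). This is a hedged illustration, not the module's actual rule generator:

```python
import ipaddress

# RFC1918 and reserved ranges blocked in PolicyFull (mirrors the documented defaults)
BLOCKED = [ipaddress.ip_network(n) for n in (
    "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "169.254.0.0/16",
    "127.0.0.0/8", "224.0.0.0/4", "240.0.0.0/4")]

def evaluate_full(dst, acl_exceptions):
    """PolicyFull decision for a destination IP, given the user's ACL exception CIDRs."""
    addr = ipaddress.ip_address(dst)
    # ACL exceptions act like accept rules placed before the RFC1918 drops,
    # so they can punch holes in the private-range block
    if any(addr in ipaddress.ip_network(e) for e in acl_exceptions):
        return True
    if any(addr in net for net in BLOCKED):
        return False  # RFC1918 / reserved: dropped
    return True       # everything else: internet access allowed
```

With no group memberships the exception list is empty, leaving internet-only access (the documented fail-safe).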
Config
Core configuration under [firewall] and related VPN sections:
[firewall]
enabled = true # Enable firewall module
blocked_networks = [ # Networks blocked in PolicyFull (RFC1918 default)
  "10.0.0.0/8",
  "172.16.0.0/12",
  "192.168.0.0/16",
]
nft_pool_size = 5 # nftables connection pool size (range: 1-100)
max_rules_per_chain = 1000 # Max rules per peer chain (0 = unlimited, NOT recommended)

# DNS Background Refresh for ACL hostnames
dns_refresh_enabled = true # Enable automatic hostname re-resolution (default: true)
dns_refresh_min_interval = "1m" # Min refresh interval / zero-TTL fallback
dns_refresh_max_interval = "1h" # Max refresh interval (TTL clamped to this)
dns_refresh_jitter = 10 # Jitter percentage 0-100 (default: 10, prevents thundering herd)
dns_refresh_init_timeout = "10s" # Startup timeout for initial hostname resolution
Host aliases — reusable destination groups
[[firewall.aliases.hosts]]
name = "hosts_ipa" # Alias name referenced by rules
hosts = [
  "192.168.11.0/24", # CIDR notation
  "192.168.11.101", # Single IP (auto /32 or /128)
  "ipa.hexon.private", # Hostname (DNS resolved at rule generation, 5s timeout)
]

[[firewall.aliases.hosts]]
name = "hosts_remote_dc"
hosts = ["gitlab.internal", "jenkins.internal"]
site = "prod-dc-a8f3c1" # Route through connector (no nft rules, remote DNS)
Port aliases — reusable protocol+port combinations
[[firewall.aliases.ports]]
name = "ports_web"
entries = [
  { proto = "tcp", ports = [80, 443, 8080, 8443] },
]

[[firewall.aliases.ports]]
name = "ports_ldap"
entries = [
  { proto = "tcp", ports = [389, 636] },
  { proto = "udp", ports = [389] },
]

[[firewall.aliases.ports]]
name = "icmp"
entries = [{ proto = "icmp", ports = [] }]
ACL rules — map LDAP groups to allowed resources
[[firewall.rules]]
rule = "allow_ipa_access" # Rule name (used in metrics and audit logs)
src = ["admins", "ipa_users"] # LDAP group names (case-insensitive, OR matching)
dst = ["hosts_ipa"] # Host alias names
ports = ["ports_web", "icmp"] # Port alias names ("any" = all traffic)
Wildcard DNS limits (under VPN network config)
[vpn.network]
wildcard_max_hosts_per_domain = 100 # Max hostnames per wildcard pattern (default: 100)
wildcard_max_hosts_total = 1000 # Max total tracked hostnames (default: 1000)
ipv6_enabled = false # Enable dual-stack IPv4+IPv6 (default: false)
ipv6_prefix = "fd7a:ec0a::/48" # ULA prefix for VPN client IPv6 addresses
Service DNAT redirects VPN client traffic to the Hexon web portal for authentication. It is auto-configured from the VPN and service settings at startup:
- The service IP is auto-detected from the network interface
- ServicePublicPort (VPN-facing, e.g., 8443) maps to the actual service port (e.g., 443)
- Set ServicePublicPort = 0 to use the same port as the service
Hot-reloadable: ACL rules, host/port aliases, DNS refresh settings, blocked_networks. New rules apply to new VPN connections only. Existing VPN sessions require a reconnect or explicit peer chain update to pick up changes.
Cold (restart required): firewall.enabled, nft_pool_size.
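The dns_refresh_* settings above combine as: clamp the record's DNS TTL between the min and max intervals, then apply cryptographically random jitter. A sketch (Python's secrets module stands in for Go's crypto/rand; the exact jitter distribution is an assumption):

```python
import secrets

def next_refresh(ttl_s, min_s=60.0, max_s=3600.0, jitter_pct=10):
    """Next DNS refresh delay: clamp the record TTL to [min, max], add CSPRNG jitter."""
    base = min_s if ttl_s <= 0 else min(max(ttl_s, min_s), max_s)  # zero TTL -> min fallback
    span = base * jitter_pct / 100.0
    # Crypto-strength randomness mirrors the crypto/rand jitter noted in the Security
    # section; uniform offset in [-span, +span]
    offset = (secrets.randbelow(2001) - 1000) / 1000.0 * span
    return base + offset

delay = next_refresh(300)  # 5-minute TTL, default 1m/1h clamps, 10% jitter
assert 270 <= delay <= 330
```

The jitter spreads refreshes for hostnames that share a TTL, avoiding the thundering-herd effect the config comment mentions.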
Troubleshooting
Common symptoms and diagnostic steps:
User cannot reach internal services after VPN connect:
- Check group membership: 'directory user <username>' to verify LDAP groups
- Check ACL rule matching: user groups must match at least one src group (OR, case-insensitive)
- Verify peer chain policy: 'nft list chain inet hexon_ikev2 hexon_ikev2_peer_<user>_<ip>'
- Check if still in PolicyRestricted (pre-auth): look for limited port rules only
- DNS ACL blocking: unauthorized domains are refused (RCODE 5) by the DNS ACL check
- Verify host alias resolution: hostnames have 5s DNS timeout, check dns module health
Peer isolated unexpectedly (all traffic dropped):
- Check per-peer chain exists: 'nft list chains inet hexon_ikev2 | grep peer'
- Verify jump rule in forward chain: 'nft list chain inet hexon_ikev2 hexon_ikev2_forward'
- Check max_rules_per_chain limit: error message includes "reached maximum rule limit"
- Group sync timing: groups fetched when peer chain is created/updated, not continuously
- DNS refresh failure: hostname resolution failed, rules removed (check metrics)
Rules not applying after config change:
- Config changes only apply to NEW peer chain operations
- Existing VPN sessions need a reconnect or explicit peer chain update
- Verify config loaded: check structured log for firewall config reload
- Host alias DNS resolution: hostnames resolved at rule generation, not config load
Service DNAT not working (connection refused via VPN):
- Check DNAT rule: 'nft list chain inet hexon_ikev2_nat hexon_ikev2_prerouting'
  Expected: iifname "hexon0" tcp dport 8443 dnat ip to 192.168.21.10:443
- Check peer chain service acceptance: 'nft list chain inet hexon_ikev2 <peer_chain>'
  Expected: tcp dport 443 ip daddr 192.168.21.10 accept
- Service IP changed: IP detected at Initialize time only, restart VPN or re-initialize
- Port conflict: ServicePublicPort conflicts with another service on VPN interface
Timeout after DNAT (packet reaches service but no response):
- Verify service acceptance rule exists in peer chain (all policy types get it)
- Check NAT masquerade: return traffic must be NATed back through VPN interface
- Conntrack state: established/related rule must precede service acceptance rule
Wildcard DNS rules not being injected:
- Verify wildcard pattern in ACL host aliases (e.g., "*.example.com")
- Check wildcard limits: wildcard_max_hosts_per_domain (default 100), wildcard_max_hosts_total (default 1000)
- Monitor metrics: firewall.wildcard_limit_hit_total indicates limit reached
- TTL expiry: rules removed after DNS TTL expires (min 5min, max 24h)
- Rules injected synchronously BEFORE DNS response (no race condition)
DNS background refresh not updating rules:
- Verify dns_refresh_enabled = true (default)
- Check refresh cycle metrics: firewall.dns_refresh_cycle_total
- Hostname change detection: firewall.dns_hostname_change_total
- Rate limiting: peer updates limited to 1/sec burst 5 per peer
- DNS module must be enabled and healthy
IPv6 connectivity issues:
- Verify ipv6_enabled = true in [vpn.network]
- Check dual-stack jump rules: 'nft list chain inet hexon_ikev2 hexon_ikev2_forward'
  Should show both IPv4 (@nh,96,32) and IPv6 (@nh,64,128) jump rules per peer
- Verify IPv6 gateway rules: peer chain should include ip6 daddr fd7a:ec0a::... rules
- Test: 'ping6 fd7a:ec0a::6440:cc01' from VPN client
- NAT64 service access: 'curl -6 https://[fd7a:ec0a::6440:cc01]:8443/'
- "No route to host": check XFRM policies include IPv6 selectors
- "Connection timeout": IPv6 jump rule missing in forward chain
nftables errors or operation failures:
- Verify Linux with nftables: 'nft --version'
- Check CAP_NET_ADMIN capability: required for nftables operations
- Thread safety: all ops protected by mutex with deadlock detection
- Connection pool exhaustion: increase nft_pool_size for high concurrency
- Non-Linux: stub returns errors, VPN works without firewall (insecure)
Firewall audit and metrics:
- Audit events: telemetry log at INFO level with action, peer_id, chain_name, rule_count
- ACL metrics: firewall.acl_rules_generated_total, firewall.acl_rule_errors_total
- DNS metrics: firewall.dns_resolution_total, firewall.dns_resolution_duration
- CIDR validation: firewall.cidr_validation_errors_total (labels: source)
- Wildcard metrics: firewall.wildcard_hostnames_total (gauge), dynamic_rules_injected_total
Security
Input validation and security hardening:
Peer ID validation (anti-homoglyph):
Only ASCII printable characters (33-126) allowed. Blocks Unicode homoglyph attacks where lookalike characters (Cyrillic 'e' vs Latin 'e') could spoof identities.
Valid: "user@example.com", "alice_123", "device-01"
Invalid: "useг@example.com" (Cyrillic), spaces, tabs, control characters
Interface name validation:
Strict regex: ^[a-zA-Z0-9_-]+$ with 15-char max (Linux kernel limit). Prevents command injection via crafted interface names.
Custom rule validation:
SourceIP/DestIP: max 45 characters (IPv6 length), validated with net.ParseCIDR()
Comment: max 256 characters
Protocol: must be "tcp", "udp", "icmp", or "all"
Action: must be "accept", "drop", or "reject"
Ports: must be valid 0-65535
Network segmentation (PolicyFull):
RFC1918 and reserved ranges blocked FIRST, then ACL exceptions applied. No group memberships = no ACL rules = internet-only access (fail-safe). ACL rules cannot override internet blocks, only RFC1918 blocks. Invalid config blocks peer chain creation entirely (fail-closed).
Service DNAT security:
Service acceptance rule cannot be removed via custom rules (always present). DNAT does not bypass application-level authentication/authorization. Service IP auto-detected from interface (prevents IP spoofing in config). Interface-based matching: all traffic on VPN interface port goes to service.
DNS refresh security:
Jitter uses crypto/rand (not math/rand) to prevent timing prediction attacks. 5-second DNS resolution timeout prevents DoS via slow DNS responses. NXDOMAIN removes hostname from rules immediately (fail-safe). Transient DNS errors keep last known good IPs (fail-open for availability).
Wildcard DNS DoS prevention:
Per-domain limit (default 100) and total limit (default 1000). Rules injected synchronously before DNS response (prevents race condition). TTL-based expiry ensures eventual cleanup (min 5min, max 24h). Dynamic rules are tagged for deduplication and audit trail.
Thread safety:
All nftables operations are protected against concurrent access. Deadlock detection warns if the same operation attempts to re-acquire a lock. Connection pool is designed for safe concurrent use.
Audit logging:
All ACL changes are logged at INFO with structured fields: action (create/update/remove), peer_id, chain_name, rule_count, timestamp. Events flow through the telemetry module to log files, SIEM, and compliance reporting.
Interpreting tool output:
'firewall rules':
  Normal: Rules listed per group with allow/deny targets and ports
  Empty: No firewall rules configured — all traffic allowed by default
  Action: Check specific user → 'firewall check <username>' for effective permissions
'firewall check <username>':
  Allowed targets: List of host:port patterns the user can access (via group membership)
  No rules match: User has no firewall-granted access — check group membership
  Action: Missing access → verify user groups with 'directory user <username>', then check which groups have rules in 'firewall rules'
Relationships
Module dependencies and interactions:
- vpn.ikev2: Primary consumer. Creates restricted peer chains on connect, upgrades to
full policy after web authentication. Passes LDAP groups for ACL evaluation. Service DNAT enables auth flow over VPN tunnel (captive portal pattern).
- vpn.wireguard: Same firewall API as IKEv2. Different VPNType and table name.
- vpn.openvpn: Same firewall API. VPNType=openvpn.
- Directory: Group membership drives ACL rule matching (case-insensitive,
OR semantics). Groups fetched at UpdatePeerChain time, not cached in firewall.
- sessions: Session revocation or VPN disconnect triggers peer chain removal.
Session upgrade (post-auth) triggers peer chain update to full policy mode.
- dns: Hostname resolution for ACL host aliases (5s timeout). DNS background refresh
uses the Hexon DNS module for DNSSEC, distributed cache (80-95% hit rate), and adaptive resolver selection. The DNS layer enforces ACL permissions (returns REFUSED RCODE 5 on denial). Wildcard DNS queries trigger dynamic rule injection for matching patterns.
- distributed cache: Cluster-wide hostname tracking for wildcard DNS with
automatic TTL expiry. Local counters for efficient limit checks.
- config: Hot-reload of ACL rules, host/port aliases, DNS refresh settings.
Config is read directly on each operation (never cached internally) so changes take effect within one refresh interval.
- telemetry: Structured logging (Info/Debug/Warn/Error) with vpn_type, peer_id,
peer_ip fields. Metrics for ACL operations, DNS resolution, wildcard tracking.- Rate limiting: Connection-level integration for VPN traffic.
- ippool: IPv4-to-IPv6 deterministic mapping (fd7a:ec0a::/48 ULA prefix) for
dual-stack peer chain creation.
Protection
Multi-layered request and protocol protection: WAF, rate limiting, geo/time access, PoW, size limiting, IDS, and password policy
Overview
The protection subsystem provides defense-in-depth security across HTTP, IKEv2/VPN, and process layers. Each module targets a specific threat domain and operates independently.
Subsystems:
waf - Web Application Firewall (Coraza v3 with OWASP CRS). Inspects HTTP requests and responses for SQL injection, XSS, path traversal, command injection, and other application-layer attacks. Supports anomaly scoring and self-contained blocking modes with four OWASP paranoia levels.
ratelimit - Distributed rate limiting with client fingerprinting. Tracks request counts per JA4 TLS fingerprint or IP address using a token bucket algorithm. Automatically bans clients exceeding thresholds. Cluster-wide protection via distributed storage with per-host isolation.
geoaccess - Geo/IP and ASN access control using MaxMind databases. Evaluates client IP against country and ASN allow/deny lists. Supports CDN geo header trust, CIDR bypass rules, and IP lookup caching.
timeaccess - Time-based access control with IANA timezone awareness. Enforces day-of-week and hour-of-day restrictions per country or CIDR range. Supports overnight hour ranges, deny rule overrides, and default fallback windows.
pow - Proof-of-Work challenge-response anti-abuse. SHA-256 based challenges with configurable difficulty, anti-automation honeypot fields, random form field names, and timing validation to prevent bot submissions.
sizelimit - HTTP request body size enforcement. Configurable default limit with per-host/path exceptions using exact, wildcard, or regex matching.
ikev2ids - IKEv2 intrusion detection system for VPN traffic. Protocol validation, signature-based detection, statistical anomaly analysis, and DoS flood prevention. Inline inspection at sub-50 microsecond latency.
password - Password strength validation using the zxcvbn algorithm. Pattern detection, dictionary matching, and entropy analysis rather than simple character rules.
HTTP middleware execution order:
1. ratelimit - Block abusive clients first (cheapest check)
2. sizelimit - Enforce body size limits
3. pow - Proof-of-Work challenge for allowed clients
4. waf - Application-layer attack detection
5. geoaccess - Geographic restrictions
6. timeaccess - Temporal access policy
Relationships
Cross-subsystem interactions:
- Listener: Chains ratelimit, sizelimit, pow, and waf middleware in order
before routing. Geo and time checks also integrated at the listener level.
- VPN: IKEv2 IDS inspects every incoming UDP packet on ports 500 and 4500
before protocol state machine processing.
- Proxy: WAF wraps the reverse proxy handler. Per-mapping overrides allow
disabling rate limiting or size limiting on specific routes.
- Password change: Validates new passwords before LDAP update during
password change and reset flows.
- Configuration: Most subsystems read from [protection] or [service] config.
WAF, ratelimit, geo, and time settings are hot-reloadable.
- Admin CLI: Exposes diagnostics via metrics ratelimit, metrics sizelimit,
metrics waf, metrics ids, metrics pow, geo lookup, geo check, geo timecheck.
Geo/IP and ASN Access Control
MaxMind-based geographic and ASN access restrictions with CDN header support and CIDR bypass
Overview
The geoaccess module provides geographic and network-based access control using MaxMind GeoLite2/GeoIP2 databases. It evaluates client IP addresses against country and ASN allow/deny lists to block or permit requests before they reach application logic.
Core capabilities:
- Country-based allow/deny lists using ISO 3166-1 alpha-2 codes
- ASN-based allow/deny lists for blocking hosting providers and VPN networks
- CIDR-based bypass rules for trusted internal networks
- CDN geo header integration (Cloudflare, AWS CloudFront, Fastly)
- IP lookup caching for high-throughput performance
- Dual operation modes: access control (Check) and informational (Lookup)
- Graceful degradation when MaxMind databases are missing or invalid
Two operation modes:
Check - Validate request against geo/ASN restrictions (returns allowed/blocked)
Lookup - Retrieve geo information without access control (informational only)
Evaluation priority (first match wins):
1. Bypass CIDR check (skip all checks if client IP matches)
2. ASN deny check (block if ASN is in deny list)
3. ASN allow check (block if ASN is NOT in allow list, when allow list is set)
4. Country deny check (block if country is in deny list)
5. Country allow check (block if country is NOT in allow list, when allow list is set)
6. Allow (default - permit if no rules matched)
Database requirements:
- GeoLite2-Country.mmdb (required for country filtering)
- GeoLite2-ASN.mmdb (optional, required only for ASN filtering)
If database files are missing or invalid, the module falls back to an embedded database (if available) or disables itself with an error log. The service continues running without geo restrictions rather than failing completely (fail-open for availability).
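The first-match-wins evaluation priority can be sketched as follows (illustrative only; the parameter names are not the module's internal API):

```python
import ipaddress

def geo_check(ip, country, asn, *, bypass_cidr=(), deny_asn=(), allow_asn=(),
              deny_countries=(), allow_countries=()):
    """Return True if the request is allowed; the first matching rule wins."""
    addr = ipaddress.ip_address(ip)
    if any(addr in ipaddress.ip_network(c) for c in bypass_cidr):
        return True                                  # 1. bypass CIDR skips all checks
    if asn in deny_asn:
        return False                                 # 2. ASN deny list
    if allow_asn and asn not in allow_asn:
        return False                                 # 3. ASN allow list (when set)
    if country in deny_countries:
        return False                                 # 4. country deny list
    if allow_countries and country not in allow_countries:
        return False                                 # 5. country allow list (when set)
    return True                                      # 6. default allow

# A bypass CIDR match wins even over an ASN deny match
assert geo_check("10.1.2.3", "RU", "14061",
                 bypass_cidr=["10.0.0.0/8"], deny_asn=["14061"]) is True
```

An empty allow list leaves that category unrestricted, matching the "empty = allow all" convention in the config reference.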
CDN geo header support: When deployed behind a CDN, the country code can be provided via HTTP header instead of performing a MaxMind database lookup. This is faster and often more accurate since CDNs have extensive IP intelligence databases.
Common CDN headers:
- CF-IPCountry (Cloudflare)
- CloudFront-Viewer-Country (AWS CloudFront)
- Fastly-Client-GeoIP-Country (Fastly)
When CDNCountry is set and valid (2-letter ISO code):
- MaxMind country lookup is skipped entirely
- ASN lookup still occurs if ASN rules are configured (CDNs do not provide ASN)
- The CDN-provided country is used for all country-based checks
Common ASN examples for blocking:
Cloud/Hosting: 14061 (DigitalOcean), 16509 (AWS), 15169 (Google Cloud), 8075 (Azure), 13335 (Cloudflare), 20473 (Vultr), 63949 (Linode)
VPN providers: 55967 (NordVPN), 9009 (M247), 212238 (ExpressVPN)
Config
Configuration in hexon.toml under [service]:
[service]
geo_enabled = true # Enable geo access control
geo_database = "/etc/hexon/GeoLite2-Country.mmdb" # Path to country database
geo_asn_database = "/etc/hexon/GeoLite2-ASN.mmdb" # Path to ASN database (optional)
geo_allow_countries = ["US", "CA", "GB"] # ISO codes to allow (empty = all)
geo_deny_countries = [] # ISO codes to deny
geo_allow_asn = [] # ASN numbers to allow (empty = all)
geo_deny_asn = ["14061", "16509", "15169"] # ASN numbers to deny
geo_bypass_cidr = ["10.0.0.0/8", "100.64.0.0/10"] # CIDRs that skip all checks
geo_deny_code = 403 # HTTP status code for blocked requests
geo_deny_message = "" # Custom deny message (empty = default)

# CDN geo header (requires proxy = true and proxy_cidr set)
proxy = true # Required to trust proxy/CDN headers
proxy_cidr = ["173.245.48.0/20"] # Trusted proxy IP ranges
geo_country_header = "CF-IPCountry" # CDN header containing country code
Configuration notes:
- Country codes must be ISO 3166-1 alpha-2 (e.g., "US", "GB", "DE")
- ASN numbers are strings without the "AS" prefix (e.g., "14061" not "AS14061")
- When both allow and deny lists are set, deny takes precedence (checked first)
- Empty allow list means “allow all” for that category
- CIDR bypass is checked before any country/ASN evaluation
- geo_country_header requires proxy = true and valid proxy_cidr
- Hot-reloadable: all geo settings can be changed without restart
- Database file changes require restart (loaded at startup only)
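Before the geo_country_header value is used, it is validated: whitespace is trimmed, exactly 2 ASCII letters are required, and the code is normalized to uppercase; anything else falls back to a MaxMind lookup. A minimal sketch of that normalization:

```python
def cdn_country(header_value):
    """Normalize a CDN country header; return None to fall back to a MaxMind lookup."""
    code = (header_value or "").strip()          # whitespace trimmed
    if len(code) == 2 and code.isascii() and code.isalpha():
        return code.upper()                      # "us" -> "US"
    return None                                  # invalid -> MaxMind fallback

assert cdn_country(" us ") == "US"
assert cdn_country("USA") is None  # not a 2-letter code
```

The isascii() guard matters: without it, two-letter Unicode lookalikes would pass isalpha() and could slip past country comparisons.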
Troubleshooting
Common symptoms and diagnostic steps:
Legitimate users blocked by geo restrictions:
- Check user's detected country: use 'geo lookup <ip>' in admin CLI
- Verify allow_countries includes the user's country code
- MaxMind accuracy varies by region; consider adding nearby countries
- VPN users may show the VPN exit country, not their actual country
- CDN header may override MaxMind: check geo_country_header setting
- Country code case: codes are normalized to uppercase internally
Users from blocked countries still getting through:
- Check bypass CIDR: user IP may match geo_bypass_cidr
- CDN header spoofing: ensure proxy = true and proxy_cidr is restrictive
- IPv6 addresses: verify MaxMind database covers IPv6 ranges
- Cache hit returning stale allow: cache entries expire, wait for refresh
ASN blocking not working:
- Verify geo_asn_database path is correct and file exists
- ASN database is optional: if missing, ASN checks are silently skipped
- Cloud provider IPs change: MaxMind ASN data may be stale
- Shared hosting: multiple ASNs may serve the same IP range
CDN geo header issues:
- Header not present: CDN may not send header for all requests
- Invalid country code: non-2-letter codes fall back to MaxMind lookup
- proxy = false: CDN headers are ignored when proxy is not enabled
- proxy_cidr mismatch: request not from trusted proxy range
- Header name case: HTTP headers are case-insensitive (handled automatically)
Performance concerns:
- Check cache hit rate: geoaccess.cache metric (hit vs miss)
- High miss rate: increase cache TTL or check for IP diversity
- MaxMind lookup latency: typically sub-millisecond per lookup
- CDN header mode skips MaxMind lookup entirely (faster)
Geo module not loading:
- Missing database file: check error log for "geoaccess" messages
- Invalid mmdb format: re-download from MaxMind
- File permissions: hexon process must have read access to database files
- Module disabled: verify geo_enabled = true in config
Metrics for diagnostics:
- geoaccess.requests_total (status=allowed|blocked, reason=...)
- geoaccess.blocked_by_country (country label)
- geoaccess.blocked_by_asn (asn label)
- geoaccess.cache (result=hit|miss)
- geoaccess.cdn_country_used (country label)
Security
Security considerations and hardening:
CDN header trust model:
CDN geo headers are only trusted when all conditions are met:
- proxy = true is configured (required)
- proxy_cidr defines trusted proxy IP ranges
- Connection originates from within proxy_cidr ranges
Without these safeguards, attackers can spoof CDN headers to bypass geo blocks.
Input validation:
- Country codes must be exactly 2 ASCII letters (a-z, A-Z)
- Codes are normalized to uppercase (e.g., "us" becomes "US")
- Invalid codes (numeric, symbols, unicode) fall back to MaxMind lookup
- Whitespace is trimmed from header values
- ASN numbers validated as numeric strings
Evaluation order security:
Deny lists are always evaluated before allow lists within each category. This ensures that explicitly denied entries cannot be bypassed by being in an allow list. CIDR bypass is checked first to ensure internal networks always have access regardless of geo restrictions.
Fail-open behavior:
If MaxMind databases are missing or corrupt, the module disables itself and allows all traffic. This is intentional for availability but means geo restrictions silently stop working. Monitor the error log for database loading failures.
IP spoofing prevention:
When behind a reverse proxy, the module uses the client IP extracted by the trusted proxy chain (X-Forwarded-For validated against proxy_cidr), not the raw connection IP. Direct connections use the TCP source address.
Rate limiting interaction:
Geo checks happen before rate limiting in the request pipeline. A blocked geo request never reaches the rate limiter, so geo-blocked IPs do not consume rate limit tokens.
Relationships
Module dependencies and interactions:
- Request pipeline: Primary consumer. Geo checks are performed early in the
pipeline before routing, authentication, or application logic. Uses the extracted client IP from trusted proxy headers.
- Rate limiting: Geo checks precede rate limiting. Blocked requests
do not consume rate limit tokens. Both modules share the client IP extraction.
- Proof-of-work: PoW challenges may be served before geo checks depending
on configuration order. Typically geo blocks first, then PoW for allowed regions.
- config: All geo settings are hot-reloadable. Current settings are read on each
request (no stale cache). Database paths are cold config (restart required to reload mmdb files).
- telemetry: Structured logging for blocked requests with country, ASN, reason.
Metrics exported for monitoring dashboards and alerting.
- dns: MaxMind lookups are IP-based (no DNS dependency). However, CDN header
trust depends on proxy_cidr, which may include CDN IP ranges that change.
- Directory: No direct dependency. Geo checks are pre-authentication
and identity-independent. Applied uniformly to all requests.
- sessions: No session dependency. Each request is evaluated independently
against current geo rules (stateless check).
- vpn.ikev2: VPN connections can be geo-checked at the IKE_SA_INIT stage
using the client's source IP before tunnel establishment.
- Admin CLI: Exposes 'geo lookup', 'geo check', and 'geo timecheck' commands
for diagnostics and testing.
IKEv2 Intrusion Detection System
Inline IDS for IKEv2/IPsec with protocol validation, signature detection, anomaly analysis, and DoS prevention
Overview
The ikev2ids module provides a comprehensive intrusion detection system specifically designed for IKEv2/IPsec VPN traffic. It integrates directly into the IKEv2 packet processing pipeline for low-latency inline inspection.
Four layers of threat detection:
- Protocol Validation (RFC 7296 compliance):
  - IKE version checking (must be 2.0)
  - Valid exchange type verification
  - Message length consistency checks
  - Reserved flag validation
- Signature-Based Detection (10 built-in signatures):
  - SIG-001: CVE-2018-5389 INVALID_KE_PAYLOAD DoS
  - SIG-002: Malformed SA Proposals
  - SIG-003: Weak cipher detection (DES, 3DES, MD5)
  - SIG-004: Excessive re-keying DoS
  - SIG-005: Weak DH groups (less than 2048-bit)
  - SIG-006: Oversized messages (buffer overflow attempts)
  - SIG-007: Message ID manipulation (replay attacks)
  - SIG-008: NULL authentication attempts
  - SIG-009: IKEv2 fragmentation attacks
  - SIG-010: DELETE payload floods
- Anomaly Detection (statistical analysis):
  - Packet size anomaly detection using exponential moving averages
  - Connection rate anomaly tracking per client IP
  - Configurable sensitivity threshold (0.0 to 1.0)
- DoS Detection (connection flood prevention):
  - Connection flood detection with configurable threshold per IP per minute
  - Authentication failure flood detection
  - Automatic state cleanup every 5 minutes (30-minute TTL)
Performance characteristics:
- Expected latency: less than 50 microseconds per packet
- Memory overhead: approximately 500 bytes per tracked client
- Automatic state cleanup every 5 minutes with 30-minute TTL
Operational modes:
- block_malicious = true: detected threats cause the packet to be dropped (inline prevention)
- block_malicious = false: threats are logged but packets are allowed (detection only)
Config
Configuration in hexon.toml under [protection.ikev2ids]:
[protection.ikev2ids]
enabled = true              # Enable/disable the IDS module
block_malicious = true      # Block detected threats (false = log only / detection mode)
log_level = "info"          # Logging verbosity for IDS events
dos_threshold = 100         # Maximum connections per minute per client IP
anomaly_sensitivity = 0.95  # Statistical anomaly threshold (0.0-1.0, higher = more sensitive)

Configuration notes:
- enabled = false completely disables packet inspection (zero overhead)
- block_malicious = false is useful for initial deployment to assess false positives
- dos_threshold applies per unique client IP address per minute window
- anomaly_sensitivity of 0.95 means traffic outside 95th percentile is flagged
- Lower anomaly_sensitivity reduces false positives but may miss subtle attacks
- All settings are evaluated at packet inspection time (hot-reloadable)
Recommended configurations by environment:
Standard deployment:
enabled = true
block_malicious = true
dos_threshold = 100
anomaly_sensitivity = 0.95

High-security (stricter, more false positives):
enabled = true
block_malicious = true
dos_threshold = 50
anomaly_sensitivity = 0.99

Initial rollout (detection only):
enabled = true
block_malicious = false
dos_threshold = 200
anomaly_sensitivity = 0.90

Troubleshooting
Common symptoms and diagnostic steps:
Legitimate VPN clients being blocked:
- Set block_malicious = false temporarily to confirm IDS is the cause
- Check metrics: ikev2ids_threats_detected_total with type and severity labels
- Anomaly detection may flag unusual but legitimate traffic patterns
- Lower anomaly_sensitivity (e.g., 0.90) to reduce false positives
- Check if client VPN software sends non-standard IKEv2 extensions
- SIG-003 (weak cipher): client may be offering DES/3DES/MD5 in proposals
- SIG-005 (weak DH): client may propose DH groups below 2048-bit

High false positive rate on anomaly detection:
- Reduce anomaly_sensitivity from 0.95 to 0.90 or 0.85
- Anomaly baselines adapt over time; new deployments have higher false positives
- Exponential moving averages need time to converge to normal patterns
- Consider running in detection-only mode (block_malicious = false) initially

DoS threshold too aggressive:
- Increase dos_threshold if legitimate users trigger connection flood detection
- Mobile users may reconnect frequently due to network changes
- NAT environments: multiple users behind the same IP may exceed the per-IP threshold
- Monitor ikev2ids_threats_detected_total with the type=dos label

IDS not detecting known attacks:
- Verify enabled = true in configuration
- Check that the packet reaches the IDS: inspect ikev2ids_packets_inspected_total
- Signature detection is pattern-based; zero-day attacks need the anomaly layer
- Protocol validation requires well-formed IKE headers to parse
- Severely malformed packets may be dropped before reaching the IDS

Performance impact concerns:
- Monitor the ikev2ids_inspection_duration histogram for latency
- Expected: less than 50 microseconds per packet
- High latency indicates too many tracked clients (memory pressure)
- State cleanup runs every 5 minutes; 30-minute TTL for inactive clients
- If latency exceeds 1ms, check the total tracked client count

IDS metrics not appearing:
- Verify the module is enabled and processing packets
- Check telemetry module health
- Metrics are exported as Prometheus-compatible counters and histograms
- ikev2ids_packets_inspected_total should increment for any VPN traffic

Specific signature troubleshooting:
- SIG-001 (CVE-2018-5389): legitimate if a client sends INVALID_KE_PAYLOAD during normal negotiation retries (rare but possible)
- SIG-004 (re-keying): a high re-key rate may be legitimate under high traffic; adjust the threshold if needed
- SIG-006 (oversized): legitimate if using certificate-based auth with large certificate chains
- SIG-009 (fragmentation): legitimate with large payloads; check if the client uses IKEv2 fragmentation (RFC 7383)
- SIG-010 (DELETE floods): may occur during mass session cleanup events

Security
Security design and considerations:
Fail-open design:
If the IDS module is disabled or encounters an internal error during inspection, packets are allowed through. This prioritizes VPN availability over security enforcement. Monitor module health to ensure continuous protection.

Inline vs passive inspection:
The IDS operates inline in the packet processing pipeline. When block_malicious = true, detected threats cause an immediate packet drop before the IKEv2 state machine processes them. This prevents exploitation but means false positives cause connection failures.

Signature coverage:
The 10 built-in signatures cover known CVEs and common IKEv2 attack patterns. They do not cover application-layer attacks that occur after tunnel establishment. Post-tunnel traffic is handled by the WAF and firewall modules.

State tracking security:
Per-client state (connection counts, anomaly baselines) is stored in memory with automatic 30-minute TTL cleanup. An attacker rotating source IPs can evade per-IP DoS detection. Consider combining with network-level rate limiting for comprehensive protection.

Cipher policy enforcement:
SIG-003 detects weak cipher proposals (DES, 3DES, MD5) but does not enforce cipher policy. The IKEv2 negotiation module handles actual cipher selection. IDS detection provides visibility into clients offering weak ciphers even when the server rejects them.

Memory exhaustion prevention:
State cleanup runs every 5 minutes with a 30-minute TTL for inactive entries. Under extreme conditions (millions of unique IPs), memory usage grows linearly at approximately 500 bytes per tracked client. Monitor memory usage in high-traffic deployments.

Relationships
Module dependencies and interactions:
- vpn.ikev2: Primary consumer. Calls InspectPacket for every incoming UDP packet on ports 500 and 4500 before IKEv2 state machine processing. A block response causes immediate packet drop with no IKE response sent.
- Firewall: Complementary protection. The firewall handles post-authentication network ACLs; the IDS handles pre-authentication protocol threats. No direct dependency between modules.
- Rate limiting: IDS DoS detection is IKEv2-specific (protocol-aware). Rate limiting is generic connection-level. Both can trigger independently. The IDS provides deeper protocol insight; ratelimit provides broader coverage.
- telemetry: All threat detections are logged with structured fields including threat type, severity, signature ID, client IP, and packet metadata. Metrics exported for monitoring and alerting.
- config: Settings are hot-reloadable. Settings are read dynamically at inspection time, so changes take effect immediately without restarting the VPN service.
- Admin CLI: Exposes the 'metrics ids' command for viewing IDS statistics including packets inspected, threats detected, and packets blocked.
- Proof-of-work: No direct interaction. PoW operates at the HTTP layer while the IDS operates at the UDP/IKEv2 layer. Different protocol domains.
- sessions: No direct interaction. The IDS operates before session establishment. Session-level security is handled by the IKEv2 authentication module.

Proof-of-Work Challenge
SHA-256 proof-of-work challenges with anti-automation features for bot detection and abuse prevention
Overview
The PoW module provides SHA-256 proof-of-work challenge-response protection to prevent automated abuse without requiring third-party CAPTCHA services.
Core capabilities:
- SHA-256 challenges with configurable difficulty (leading zero bits)
- Anti-automation: randomized form field names to defeat hardcoded bots
- Honeypot decoy fields that catch bots filling all form fields
- Timing validation to detect pre-computed solutions
- One-time-use challenges with TTL expiration (prevents replay attacks)
- Session-based validation (solve once, access for session duration)
- POST body preservation across the challenge flow
- Distributed challenge storage via cluster storage
Challenge-response flow:
1. Client request arrives without a valid PoW session
2. Middleware intercepts and renders the challenge page inline
3. Client receives: challenge ID, SHA-256 challenge bytes, difficulty
4. Client JavaScript brute-forces a nonce where SHA-256(challenge + nonce) has N leading zero bits
5. Client submits the nonce along with form values
6. Server validates: timing, honeypots, hash correctness, expiry
7. On success: PoW session cookie set, original request proceeds

Difficulty recommendations:
16 bits: ~65K hashes, ~0.1 seconds (light protection)
20 bits: ~1M hashes, ~1 second (default, good balance)
24 bits: ~16M hashes, ~15 seconds (high protection)
28 bits: ~256M hashes, ~4 minutes (extreme, may frustrate users)

Runs third in the HTTP middleware chain (after ratelimit and sizelimit).
Config
Configuration under the [protection] section:
[protection]
pow = true                     # Enable proof-of-work challenges
pow_difficulty = 20            # Leading zero bits required (higher = harder)
pow_difficulty_time = "5m"     # Challenge token TTL (time to solve)
pow_session_ttl = "30m"        # PoW session TTL after successful challenge
pow_cookie_name = "hexon_pow"  # Cookie name for PoW sessions
pow_random_fields = true       # Randomize form field names per challenge
pow_decoy_fields = 5           # Number of honeypot decoy fields
pow_min_render_time = "200ms"  # Minimum time before submission is accepted
pow_body_ttl = "5m"            # TTL for stored encrypted POST bodies
pow_body_max_size = "1MB"      # Maximum POST body size to preserve

Difficulty tuning:
Each additional bit doubles the expected computation time:
16 bits: ~0.1s | 20 bits: ~1s | 24 bits: ~15s | 28 bits: ~4min

Anti-automation settings:
pow_random_fields: Randomized form field names per challenge defeat bots that hardcode field names like "nonce" or "solution".
pow_decoy_fields: Hidden honeypot fields that legitimate users never see. Bots filling all fields are detected and rejected.
pow_min_render_time: Minimum elapsed time between challenge generation and submission. Prevents pre-computed or instant bot responses.

POST body preservation:
When a POST triggers a PoW challenge, the original body is encrypted and stored, then replayed after the challenge is solved.

Hot-reloadable: pow_difficulty, pow_difficulty_time, pow_random_fields, pow_decoy_fields, pow_min_render_time, pow_body_ttl, pow_body_max_size.
Cold (restart required): pow (enable/disable), pow_cookie_name.
Troubleshooting
Common symptoms and diagnostic steps:
Challenge page not appearing:
- Verify [protection] pow = true
- Check if the client already has a valid PoW session cookie
- Check 'metrics pow' for the challenges_issued counter

Users cannot solve the challenge (timeout):
- Difficulty too high: reduce pow_difficulty (20 is the default)
- TTL too short: increase pow_difficulty_time
- Client JavaScript disabled: PoW requires JavaScript execution
- Mobile devices are slower: consider a lower difficulty

Bots bypassing the challenge:
- Enable honeypot decoys: set pow_decoy_fields > 0
- Enable random field names: set pow_random_fields = true
- Increase difficulty: raise pow_difficulty
- Check timing: bots solving faster than pow_min_render_time are rejected

Timing validation rejecting legitimate users:
- pow_min_render_time too high: lower to 200ms (default)
- Clock skew between nodes: check NTP synchronization

Honeypot false positives:
- Browser auto-fill may populate hidden fields on some browsers
- Reduce pow_decoy_fields to 2-3 for fewer false positives

POST body lost after challenge:
- Body exceeds pow_body_max_size: increase the limit or reduce the POST size
- Body TTL expired: increase pow_body_ttl
- Large file uploads: consider disabling PoW for upload routes

Relationships
Module dependencies and interactions:
- Listener: Third middleware in the protection chain (after ratelimit and sizelimit).
- Rate limiting: Runs before PoW, preventing challenge generation resource exhaustion from abusive clients.
- Distributed storage: Challenge records and PoW sessions stored cluster-wide with TTL-based automatic cleanup.
- Configuration: Reads the [protection] section. Most settings hot-reloadable.
- Admin CLI: 'metrics pow' shows challenges issued, solved, and failed.
Rate Limiting
Distributed token bucket rate limiting with client fingerprinting, automatic banning, and per-host isolation
Overview
The ratelimit module provides distributed rate limiting and automatic client banning across the cluster. It protects all HTTP endpoints against request flooding, brute-force attacks, and automated abuse.
Core capabilities:
- Token bucket algorithm with burst support (1.5x capacity for brief spikes)
- Client identification via TLS fingerprint (JA4) or IP address
- Automatic banning when rate limits are exceeded
- Manual ban/unban operations via admin CLI
- Per-host rate limit isolation (independent limits per proxy mapping)
- Per-route custom rate limits (override global setting per proxy mapping)
- Cluster-wide protection via distributed storage
- Atomic per-node statistics (allowed, blocked, banned counts)
Token bucket algorithm:
- Bucket capacity is 1.5x the configured rate limit (allows brief bursts)
- Refill rate equals limit / interval (tokens per second)
- New clients start with a full bucket
- Each request consumes one token
- When the bucket is empty the client is automatically banned
- Banned clients are immediately blocked without consuming resources

Runs first in the HTTP middleware chain (before sizelimit, PoW, and WAF).
Config
Configuration under the [protection] section:
[protection]
rate_limit = "100/1m"            # Requests per interval (e.g., "100/1m", "5000/1h")
rate_limit_type = "fingerprint"  # Client identification: "fingerprint" (JA4) or "ip"
rate_limit_bantime = "5m"        # Ban duration when limit is exceeded

Rate limit format: "{count}/{interval}" where interval uses Go duration suffixes: s (seconds), m (minutes), h (hours).
Examples:
"100/1m" - 100 requests per minute (token bucket capacity: 150)
"5/1m" - 5 requests per minute (strict, for sensitive endpoints)
"5000/1h" - 5000 requests per hour (generous, for API gateways)

Per-route overrides via [[proxy.mapping]]:
disable_rate_limit = false  # Bypass rate limiting for this route
rate_limit = "200/1m"       # Custom rate limit for this route

Per-host isolation:
When proxy routes provide a hostname, rate limits are tracked independently. A client can have separate counters for different applications. Bans are also per-host: being banned on one app does not block other apps.

Fingerprint types:
"fingerprint" (default, recommended): Uses the JA4 TLS fingerprint. Identifies clients by TLS handshake characteristics. Resistant to IP spoofing and NAT traversal.
"ip": Uses the client IP address. Simpler but affected by NAT and shared IPs.

Hot-reloadable: rate_limit, rate_limit_type, rate_limit_bantime.
Troubleshooting
Common symptoms and diagnostic steps:
Legitimate users getting 429 Too Many Requests:
- Check the current rate limit: 'metrics ratelimit' shows cluster-wide stats
- Rate limit too low: add a per-route rate_limit override
- Shared IP (NAT/office): switch rate_limit_type to "fingerprint"
- Token bucket burst is 1.5x the limit; sustained traffic above the base rate drains it
- Temporarily increase rate_limit or set disable_rate_limit on the route

Users banned unexpectedly:
- Check ban status: 'ratelimit stats' shows active bans
- A short rate_limit_bantime causes frequent ban/unban cycling
- Per-host bans: a user may be banned on one app but not others
- Unban manually: 'ratelimit unban <fingerprint>'

Rate limiting not enforcing:
- Verify [protection] rate_limit is not empty (empty = disabled)
- Check if the route has disable_rate_limit = true
- Counters are per-node with eventual consistency; a few extra requests may slip through during cluster propagation

Ban not taking effect across cluster:
- Bans propagate via broadcast; check cluster health
- Verify all nodes can communicate: 'cluster status' and 'ping'
- Ban propagation typically completes within 100ms

JA4 fingerprint issues:
- Some clients produce identical fingerprints (e.g., the same curl version)
- Requires TLS termination at Hexon (not an upstream LB)
- Fall back to the "ip" type if fingerprinting is unreliable

All state is in-memory with TTL:
- A full cluster restart clears all counters and bans
- No persistent state survives a complete cluster outage (by design)

Relationships
Module dependencies and interactions:
- Listener: First middleware in the HTTP protection chain. Runs before sizelimit, PoW, and WAF.
- JA4 fingerprinting: TLS fingerprint extracted during the TLS handshake, available on the request context for rate_limit_type "fingerprint".
- Configuration: Reads the [protection] section. Hot-reloadable settings.
- Distributed storage: Counters and bans stored cluster-wide with TTL. Bans are replicated to all nodes (typically under 100ms).
- Proxy: Per-route overrides via disable_rate_limit and custom rate_limit.
- Admin CLI: 'ratelimit stats', 'ratelimit ban <fp>', 'ratelimit unban <fp>', and 'metrics ratelimit' commands.

Request Size Limiting
HTTP request body size enforcement with per-host/path exceptions and multiple matching strategies
Overview
The sizelimit module prevents abuse by enforcing maximum request body sizes on all HTTP endpoints. It supports a configurable default limit with per-host and per-path exceptions for endpoints that require larger payloads.
Core capabilities:
- Default max request body size with human-readable format (e.g., "10MB")
- Per-host/path exceptions with custom size limits
- Three path matching strategies: exact, wildcard (suffix /*), regex
- Regex validation at init time with graceful skip on invalid patterns
- Enforcement via http.MaxBytesReader (immune to faked Content-Length headers)
- Automatic statistics tracking (allowed vs blocked request counts)
- Routes can opt out individually via DisableSizeLimit flag
Middleware execution order in the request pipeline:
1. Rate limiting - block abusive clients first
2. Size limiting - enforce body size limits
3. Proof-of-Work challenge
4. Session management
5. Handler - process the request

Size format supports: B, KB, MB, GB, TB (case-insensitive). Values are binary-based (1 KB = 1024 bytes, 1 MB = 1048576 bytes).
The module returns HTTP 413 Payload Too Large immediately when a request exceeds the applicable size limit. It wraps the request body reader so the actual bytes read are measured, not the Content-Length header value.
Config
Configuration under the [protection] section in hexon.toml:
[protection]
max_bytes = "10MB"  # Default limit for all endpoints (empty = disabled)

# Per-host/path exceptions (checked in order, first match wins)
[[protection.max_bytes_exceptions]]
host = "upload.example.com"  # Optional: restrict to specific host
path = "/api/upload/*"       # Path pattern (exact, wildcard, or regex)
bytes = "100MB"              # Custom limit for this exception

[[protection.max_bytes_exceptions]]
path = "/bulk/*"  # All hosts, wildcard path
bytes = "500MB"

[[protection.max_bytes_exceptions]]
path = "^/api/v[0-9]+/upload$"  # Regex pattern
regex = true                    # Must be set for regex matching
bytes = "200MB"

Path matching strategies:
1. Exact: path = "/upload" matches only /upload
2. Wildcard: path = "/upload/*" matches /upload/file, /upload/x/y/z
3. Regex: path = "^/pattern$" with regex = true

Exception evaluation:
- Checked in config order (first match wins)
- Host field is optional (empty = match all hosts)
- Invalid regex patterns are logged as WARN and skipped at init time
- Valid exceptions are logged at INFO with match type and human-readable size

Disabling:
- Set max_bytes = "" to disable size limiting entirely
- Individual routes can opt out via DisableSizeLimit: true in RouteConfig

Hot-reloadable: No. Changes require restart. Init logging shows: default limit, exception count, valid/invalid breakdown.
Troubleshooting
Common symptoms and diagnostic steps:
Uploads failing with 413 Payload Too Large:
- Check if the endpoint has an exception configured
- Verify the exception path matches: exact vs wildcard vs regex
- Check exception order: first match wins; reorder if needed
- Verify the host field matches the request Host header (if specified)
- Check size units: "100MB" = 104857600 bytes (binary, not decimal)

Size limit not enforced (large uploads succeeding):
- Verify max_bytes is not empty (empty = module disabled)
- Check if the route has DisableSizeLimit: true
- Verify the size limit middleware is active in the request chain
- Check init logs for "DISABLED via config" or "INVALID config" messages

Regex exceptions not working:
- Check init logs for "Invalid regex in size limit exception - SKIPPED"
- Verify regex = true is set in the exception config
- Test the regex pattern independently for validity
- Common errors: unclosed brackets, unescaped special characters

Exception not matching expected requests:
- Wildcard requires the /* suffix: "/upload/*" not "/upload*"
- Exact match is literal: "/upload" does not match "/upload/"
- Host matching is exact (no wildcard support for hosts)
- Check exception_index in init logs to verify load order

Statistics show unexpected blocked count:
- Check 'metrics sizelimit' for allowed and blocked request counts
- A high blocked count may indicate: limit too low, missing exceptions, or actual abuse attempts
- Check application logs for specific blocked requests

Module init shows INVALID config:
- Verify the size format: must be number + unit (e.g., "10MB")
- Supported units: B, KB, MB, GB, TB (case-insensitive)
- No spaces between number and unit
- Must be a positive value

Security
Security design and enforcement model:
Body size enforcement:
Uses http.MaxBytesReader, which wraps the request body reader at the transport level. This prevents attacks using:
- Faked Content-Length headers (actual bytes read are measured)
- Chunked transfer encoding abuse (reader counts all chunks)
- Slow-drip attacks (reader enforces an absolute byte limit)

Authorization model:
The sizelimit module uses authorization for all operations. The default policy restricts size checking to the TLS listener middleware only. This prevents unauthorized callers from bypassing size restrictions.

Middleware ordering:
Size limiting runs AFTER rate limiting. This ensures that abusive clients are blocked by rate limits before consuming resources on body reading. The order prevents resource exhaustion attacks where an attacker sends many large payloads to overwhelm the size checking logic itself.

Regex safety:
Regex patterns are compiled once at init time. Invalid patterns are rejected with a warning and skipped entirely. This prevents:
- Runtime compilation failures during request handling
- ReDoS attacks via pathological regex patterns in config
- Performance degradation from repeated regex compilation

Relationships
Module dependencies and interactions:
- TLS listener: Primary consumer. The size limit middleware calls CheckRequest for every incoming HTTP request. Only authorized caller.
- Rate limiting: Runs before sizelimit in the middleware chain. Rate limiting blocks abusive clients before size checking begins.
- Proof-of-work: Runs after sizelimit. Proof-of-Work challenges are only issued after the request passes size validation.
- config: Reads the [protection] section at init time for the default limit and exceptions. Not hot-reloadable (restart required for changes).
- telemetry: Structured logging at init (config summary, exception details) and at runtime (blocked requests). Metrics for allowed/blocked counts.
- Admin CLI: Statistics exposed via the 'metrics sizelimit' admin command.

Time-Based Access Control
Time-Based Access Control
Timezone-aware access restrictions with day/hour windows, country matching, and CIDR bypass rules
Overview
The timeaccess module enforces time-based access restrictions on incoming requests. It evaluates whether a request is allowed based on the current day of week, hour of day, client country, and IP address.
Core capabilities:
- Day-of-week filtering (Mon through Sun)
- Hour-of-day filtering in HH:MM-HH:MM format (24-hour clock)
- Overnight hour ranges supported (e.g., "22:00-06:00")
- Multiple time windows per country or CIDR range
- IANA timezone-aware evaluation per window
- CIDR-based bypass rules (skip all time checks)
- Country-based window matching via geo lookup
- Deny rules take precedence over allow rules within each window
- Default fallback window when no country/CIDR match
Evaluation priority (first match wins):
1. Bypass CIDR check: if the client IP matches any bypass CIDR, the request is allowed
2. CIDR-based window match: most specific, checked by IP range
3. Country-based window match: matched via geo lookup country code
4. Default window: fallback using DefaultTimezone, DefaultAllowDays, DefaultAllowHours

Within each window, deny rules override allow rules:
- DenyDays takes precedence over AllowDays
- DenyHours takes precedence over AllowHours
- An empty AllowDays list means all days are allowed

The response includes diagnostic information: which timezone was used, the current day and time in that timezone, what matched (cidr/country/default), and the reason if the request was blocked.
Config
Configuration under the [service] section in hexon.toml:
[service]
time_enabled = true                                 # Enable time-based access control
time_bypass_cidr = ["10.0.0.0/8", "100.64.0.0/10"]  # CIDRs that skip all time checks
time_deny_code = 403                                # HTTP status code for denied requests
time_deny_message = ""                              # Custom denial message (empty = default)

# Default window (used when no country/CIDR window matches)
time_default_timezone = "UTC"                                  # IANA timezone for default window
time_default_allow_days = ["Mon", "Tue", "Wed", "Thu", "Fri"]  # Allowed days
time_default_allow_hours = "08:00-18:00"                       # Allowed hours (HH:MM-HH:MM)

# Country-specific time windows
[[service.time_windows]]
countries = ["US", "CA"]                          # ISO 3166-1 alpha-2 country codes
timezone = "America/New_York"                     # IANA timezone for this window
allow_days = ["Mon", "Tue", "Wed", "Thu", "Fri"]  # Weekdays only
allow_hours = "08:00-18:00"                       # Business hours Eastern

[[service.time_windows]]
countries = ["GB", "DE", "FR"]
timezone = "Europe/London"
allow_days = ["Mon", "Tue", "Wed", "Thu", "Fri"]
allow_hours = "09:00-17:30"  # UK/EU business hours

# CIDR-specific time windows (take precedence over country windows)
[[service.time_windows]]
cidr = ["192.168.100.0/24"]  # Match by IP range
timezone = "UTC"
allow_days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]  # 24/7 access
allow_hours = "00:00-23:59"

# Deny rules (override allow rules within the same window)
[[service.time_windows]]
countries = ["US"]
timezone = "America/New_York"
allow_days = ["Mon", "Tue", "Wed", "Thu", "Fri"]
allow_hours = "08:00-18:00"
deny_days = ["Wed"]         # Block Wednesdays (maintenance)
deny_hours = "12:00-13:00"  # Block lunch hour

Hour range format:
"08:00-18:00" - 8 AM to 6 PM
"22:00-06:00" - 10 PM to 6 AM (overnight, wraps around midnight)
"00:00-23:59" - All day (24/7)
Hot-reloadable: Yes. Window changes apply to new requests immediately.
Troubleshooting
Common symptoms and diagnostic steps:
Users blocked outside expected hours:
- Check the timezone configuration: the IANA timezone string must be valid
- Verify the window that matched: CheckResponse.MatchedBy shows cidr/country/default
- Check CheckResponse.CurrentDay and CurrentTime for the evaluated timezone
- Country code mismatch: verify geo lookup returns the expected country code
- Overnight ranges: "22:00-06:00" is valid and should wrap around midnight

Users not blocked when they should be:
- Check the bypass CIDR list: the client IP may match a bypass range
- CIDR windows take precedence over country windows
- Verify time_enabled = true in config
- Check deny rules: DenyDays/DenyHours must be set to override allow rules
- Empty AllowDays means all days allowed (not no days)

Wrong timezone applied:
- Check window matching order: CIDR first, then country, then default
- Multiple country windows: first match wins
- Verify the IANA timezone string (e.g., "America/New_York" not "EST")
- An invalid timezone falls back to UTC silently

Bypass not working for internal IPs:
- Verify CIDR notation: "10.0.0.0/8" not "10.0.0.0"
- Check that time_bypass_cidr is a list, not a single string
- The client IP must be the actual source IP (check proxy headers)
- IPv6 addresses need proper CIDR notation

Deny rules not taking effect:
- Deny rules only work within a matched window
- deny_days takes precedence over allow_days in the SAME window
- deny_hours takes precedence over allow_hours in the SAME window
- Cannot use deny rules in the default window (use the deny_days/deny_hours fields)

Metrics and diagnostics:
- timeaccess.requests_total{status="allowed|blocked"} for traffic patterns
- timeaccess.windows_checked{matched_by="cidr|country|default"} for match distribution
- CheckResponse includes full diagnostics: Timezone, CurrentDay, CurrentTime, MatchedBy, and Reason (if blocked)
Module dependencies and interactions:
- Geo access: Provides the country code for each client IP via geo lookup. The country code is passed in the CheckRequest.Country field. Without the geo module, only CIDR-based and default windows are evaluated.
- TLS listener: Invokes time access checks as part of the protection middleware chain. Passes the client IP and geo-resolved country.
- config: Reads the [service] section for time windows, bypass CIDRs, default timezone, and deny code. Hot-reloadable for window changes.
- telemetry: Metrics for allowed/blocked counts and window match distribution. Structured logging for blocked requests with reason and timezone context.
- Rate limiting: Complementary protection. Rate limiting handles request volume; timeaccess handles temporal access policy.
- Directory: Indirect relationship. User group membership determines which proxy mappings a user can access; timeaccess adds temporal constraints on top of identity-based access control.

Web Application Firewall
Coraza WAF v3 with embedded OWASP Core Rule Set for HTTP request/response inspection
Overview
The WAF module provides Web Application Firewall protection using Coraza WAF v3 with the embedded OWASP Core Rule Set (CRS). It inspects HTTP requests and responses against security rules to detect and block application-layer attacks.
Core capabilities:
- SQL injection detection and blocking (95% coverage at paranoia level 1)
- Cross-site scripting (XSS) detection (90% coverage)
- Path traversal, command injection, SSRF, LFI/RFI, XXE detection
- Scanner and bot detection (nikto, sqlmap, nmap, etc.)
- Two blocking modes: anomaly scoring (recommended) and self-contained
- Four OWASP paranoia levels (1=basic to 4=maximum security)
- Detection-only mode for safe deployment and tuning
- Per-route WAF bypass via context keys
- Custom rules via TOML configuration or .conf files
- Request body inspection with configurable size limits
- Optional response body inspection (disabled by default for performance)
- User-friendly block pages with correlation ID for incident tracking
- All logging and metrics via telemetry module (no separate WAF log files)
Architecture:
- WAF Engine: Single shared instance initialized once with embedded CRS
- Middleware: HTTP middleware for request/response inspection pipeline
- Per-Route Control: Per-route WAF bypass via configuration
- CRS Rules: Embedded in binary via git submodule (no external dependencies)
- Rules Location: bundled CRS rules directory
WAF inspection pipeline (HTTP middleware):
1. Check if WAF is disabled for the route via per-mapping configuration; bypass if disabled
2. Create a Coraza transaction with a correlation ID
3. Phase 1: Inspect URI, method, protocol, headers, query parameters
4. Check for rule interruption; block if triggered
5. Phase 2: Inspect request body (if enabled and body present)
6. Check for rule interruption; block if triggered
7. Request passed: continue to backend handler
8. Record metrics and log transaction details via telemetry

Important limitation: per-route paranoia levels are NOT supported in Coraza v3. The paranoia level is set globally during WAF initialization and applies to all routes uniformly. Use per-route WAF bypass if certain routes need no protection.
Config
Configuration under [waf] section:
[waf]
enabled = true           # Enable WAF protection
paranoia = 1             # OWASP paranoia level (1-4)
detection_only = false   # true = log only, false = block requests
self_contained = false   # false = anomaly scoring (recommended), true = immediate block
max_body_size = "1MB"    # Maximum request body to inspect
inspect_body = true      # Inspect POST/PUT request bodies
inspect_response = false # Inspect response bodies (performance impact)

# Rule exclusions (for tuning false positives)
disabled_rules = [942100]       # Disable specific OWASP CRS rule IDs
disabled_tags = ["attack-sqli"] # Disable all rules with specific tags

# Custom rules (operator-defined, use IDs 10000+ to avoid CRS conflicts)
[[waf.custom_rule]]
id = 10001                                   # Rule ID (10000+ recommended)
name = "Block Security Scanners"             # Human-readable rule name
severity = "CRITICAL"                        # CRITICAL, WARNING, NOTICE, etc.
phase = 1                                    # 1=headers, 2=body, 3=resp headers, 4=resp body
variable = "REQUEST_HEADERS:User-Agent"      # Variable to inspect
operator = "rx"                              # rx=regex, eq=equals, contains=contains
pattern = "(?i:sqlmap|nikto|nmap)"           # Match pattern
transform = ["lowercase"]                    # Transformations before matching
action = "deny"                              # deny, redirect, log
status = 403                                 # HTTP status code for deny action
message = "Security scanner detected"        # Log message on match
tags = ["hexon-custom", "scanner-detection"] # Rule tags
Paranoia levels control rule sensitivity:
Level 1 (default): Basic protection, minimal false positives
Level 2: Increased security, moderate false positives
Level 3: High security, higher false positives (needs tuning)
Level 4: Maximum security, highest false positives (extensive tuning required)
Blocking modes:
Anomaly scoring (self_contained = false, recommended): Multiple rules contribute to an anomaly score. Blocks only if the total score exceeds the threshold (default: 5). Fewer false positives; industry standard.
Self-contained (self_contained = true): Each matched rule blocks immediately. More false positives but simpler to debug. Good for high-security environments.
Hot-reloadable: disabled_rules, disabled_tags, detection_only, custom rules. Cold (restart required): enabled, paranoia, self_contained, max_body_size.
Troubleshooting
Common symptoms and diagnostic steps:
WAF not loading or initializing:
- Check CRS rules exist in the binary (embedded via git submodule)
- Look for "waf.init" in application logs for initialization errors
- Verify [waf] enabled = true in configuration
- Check for Coraza initialization errors in startup logs
Rules not matching expected attack payloads:
- Enable trace-level logging: [telemetry] level = "trace"
- Check waf.pass and waf.block events in logs for inspection details
- Verify the paranoia level is sufficient for the attack type
- Test with known payloads: curl "http://host/api?id=1' OR '1'='1"
- Check if the rule ID is in the disabled_rules list
False positives blocking legitimate traffic:
- Identify the triggering rule ID from the waf.block log event (rule_id field)
- Temporarily add the rule to the disabled_rules list for immediate relief
- Switch to detection_only = true for non-blocking investigation
- Consider lowering the paranoia level if there are too many false positives
- Use per-route WAF bypass for endpoints that trigger false positives
- For anomaly scoring: check whether multiple low-score rules accumulate
WAF bypass not working for specific routes:
- Verify WAF bypass is configured on the proxy mapping
- Check configuration propagation: per-route WAF bypass must be set in the mapping config
- Look for waf.bypass events in debug logs (event with path field)
- Ensure the WAF middleware wraps the correct handler chain
Performance degradation with WAF enabled:
- Expected overhead: headers-only +100-200us, 1KB body +500us-1ms, 100KB body +5-10ms
- Reduce the paranoia level (fewer rules evaluated)
- Disable body inspection for large upload endpoints (inspect_body = false)
- Lower max_body_size to skip inspection of large payloads
- Disable response inspection if enabled (inspect_response = false)
- Bypass the WAF for high-throughput internal endpoints (metrics, health)
- Check the waf.duration_ms histogram for actual inspection times
Blocked requests missing correlation ID:
- Verify the correlation ID middleware runs before the WAF middleware
- Check the correlation_id field in waf.block log events
- Block pages should display the correlation ID for the user to report
Custom rules not taking effect:
- Verify the rule ID does not conflict with CRS rules (use 10000+)
- Check rule syntax: variable, operator, and pattern must be valid
- Verify the phase is correct for the data being inspected
- Look for rule loading errors in initialization logs
Recommended deployment process:
Week 1: Enable with detection_only = true, paranoia = 1 (monitor logs)
Week 2: Tune false positives with disabled_rules, test attack payloads
Week 3: Switch to detection_only = false (blocking mode)
Week 4+: Gradually increase the paranoia level, repeat the tuning cycle
Security
Security coverage and protection details:
OWASP CRS coverage at paranoia level 1:
SQL Injection: 95% detection rate
Cross-Site Scripting (XSS): 90% detection rate
Path Traversal: 95% detection rate
Command Injection: 85% detection rate
Server-Side Request Forgery (SSRF): 80% detection rate
Local/Remote File Inclusion (LFI/RFI): 90% detection rate
XML External Entity (XXE): 85% detection rate
Protocol Attacks: 90% detection rate
Scanner Detection: 95% detection rate
Bot Detection: 80% detection rate
Higher paranoia levels increase coverage but require tuning to manage false positives. Custom rules provide additional Hexon-specific coverage.
Anomaly scoring provides defense-in-depth: a single indicator may not block, but multiple suspicious indicators in the same request will trigger blocking. This significantly reduces false positives compared to self-contained mode while maintaining strong detection of actual attacks.
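The scoring mechanics can be shown in a few lines. The severity weights below (CRITICAL=5, ERROR=4, WARNING=3, NOTICE=2) and the default inbound threshold of 5 follow the OWASP CRS anomaly-scoring convention; the `shouldBlock` function itself is a sketch, not Hexon's code.

```go
package main

import "fmt"

// CRS-style severity scores used for anomaly accumulation.
var severityScore = map[string]int{
	"CRITICAL": 5, "ERROR": 4, "WARNING": 3, "NOTICE": 2,
}

// shouldBlock sums the scores of all matched rules in one request and
// compares the total against the inbound anomaly threshold.
func shouldBlock(matchedSeverities []string, threshold int) (int, bool) {
	total := 0
	for _, s := range matchedSeverities {
		total += severityScore[s]
	}
	return total, total >= threshold
}

func main() {
	// A single NOTICE-level indicator (score 2) stays under the threshold.
	fmt.Println(shouldBlock([]string{"NOTICE"}, 5)) // 2 false
	// Two WARNING-level indicators in the same request (3+3) block.
	fmt.Println(shouldBlock([]string{"WARNING", "WARNING"}, 5)) // 6 true
	// One CRITICAL match (5) meets the threshold on its own.
	fmt.Println(shouldBlock([]string{"CRITICAL"}, 5)) // 5 true
}
```

This is why anomaly scoring reduces false positives: a lone low-severity match passes, while clusters of suspicious indicators or any high-severity match still block.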
Request body inspection limits:
Bodies exceeding max_body_size are blocked and counted in the waf.body_too_large metric. This prevents memory exhaustion from oversized payloads while ensuring attack payloads in request bodies are inspected up to the configured limit.
Correlation ID tracking:
Every blocked request includes a correlation ID in the block page. Users can report this ID for incident investigation. Correlation IDs link WAF events to upstream request tracing.
Limitations to be aware of:
- HTTP-only protection (does not inspect TCP/UDP/VPN traffic)
- CRS rules embedded at compile time (updates require recompilation)
- Detection-only mode has the same performance overhead as blocking mode
- No separate WAF audit log (all logging via telemetry to stdout)
- Per-route paranoia levels not supported (Coraza v3 limitation)
Relationships
Module dependencies and interactions:
- TLS listener: Provides correlation IDs for request tracking. Correlation ID middleware must run before WAF middleware. Correlation IDs appear in all WAF log events and block pages.
- Configuration system: WAF configuration from the [waf] section. Config changes for disabled_rules and detection_only are hot-reloadable. Paranoia level and enabled state require a restart.
- Metrics subsystem: Exports counters (waf.requests, waf.blocked, waf.passed, waf.bypassed, waf.body_too_large) and histograms (waf.duration_ms). Labels include method, path, blocked, rule_id, action.
- telemetry: Structured logging for all WAF events at appropriate levels. WARN for blocks, TRACE for passes, DEBUG for bypasses. No separate WAF log file; all events flow through telemetry.
- Error page service: Provides user-friendly error/block pages with correlation ID. Block pages are shown to users when requests are denied by WAF rules.
- proxy: WAF middleware wraps the reverse proxy handler chain. Per-route WAF bypass is configured via the proxy mapping context. The WAF inspects proxied requests before they reach backend servers.
- Rate limiting: Complementary protection layer. Rate limiting operates at the connection level, the WAF at the application level. Both modules contribute to the overall request protection pipeline.
- Size limiting: Body size limits complement WAF max_body_size. Size limiting may reject oversized requests before WAF inspection.