infrastructure/caddy-opnsense-blocker

Codex, agent ChatGPT b7943e69db Harden verified bot detection

2026-03-12 16:45:11 +01:00

Configuration reference

The daemon reads its configuration from a single YAML file passed with the -config flag.

Start from ../config.example.yaml.

Top-level structure

server:
storage:
investigation:
opnsense:
profiles:
sources:

`server`

Controls the built-in HTTP server.

listen_address
- default: 127.0.0.1:9080
- TCP listen address for both the UI and the JSON API.
read_timeout
- default: 5s
write_timeout
- default: 10s
shutdown_timeout
- default: 15s

The HTTP server has no built-in authentication or TLS.
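As a minimal sketch, the `server` block below sets every field to its documented default. The nesting under `server:` is taken from the top-level structure; no values beyond the listed defaults are assumed.

```yaml
# All values are the documented defaults; omitting the block should be equivalent.
server:
  listen_address: 127.0.0.1:9080   # UI and JSON API share this address
  read_timeout: 5s
  write_timeout: 10s
  shutdown_timeout: 15s
```

Because the server has no authentication or TLS of its own, keeping the loopback default and exposing it only through a component that adds both (a reverse proxy, for instance) is one reasonable setup.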

`storage`

path
- default: ./data/caddy-opnsense-blocker.db
- SQLite database path.

The parent directory is created automatically if it does not already exist.

`investigation`

Controls bot detection and external IP lookups.

enabled
- default: true
- Enables or disables investigations as a whole.
refresh_after
- default: 24h
- Reserved for future automatic revalidation logic. Current releases do not automatically refresh cached investigations; cached entries are reused until a manual Refresh investigation action is triggered.
timeout
- default: 8s
- Timeout applied to one investigation run.
user_agent
- default: caddy-opnsense-blocker/0.2
- User-Agent sent to HTTP-based investigation providers.
spamhaus_enabled
- default: true
- Enables Spamhaus DNSBL lookups for non-bot investigations.
background_workers
- default: 2
- Number of background workers that fetch missing investigations.
background_poll_interval
- default: 30s
- Delay between background queue refill passes.
background_lookback
- default: 0s
- If 0s, the scheduler can pick any known IP missing cached intelligence.
- If greater than zero, only IPs seen within that lookback window are queued.
background_batch_size
- default: 256
- Maximum number of IPs to queue per scheduler pass.
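The fragment below is an illustrative sketch of an `investigation` block that restricts background work to recently seen IPs. The non-default values (`spamhaus_enabled: false`, `background_lookback: 72h`, `background_batch_size: 64`) are arbitrary examples, not recommendations from the project.

```yaml
investigation:
  enabled: true
  refresh_after: 24h                 # reserved; current releases do not auto-refresh
  timeout: 8s                        # budget for one investigation run
  user_agent: caddy-opnsense-blocker/0.2
  spamhaus_enabled: false            # example: skip Spamhaus DNSBL lookups
  background_workers: 2
  background_poll_interval: 30s
  background_lookback: 72h           # example: only queue IPs seen in the last 72 hours
  background_batch_size: 64          # example: at most 64 IPs queued per scheduler pass
```

Leaving `background_lookback` at `0s` instead lets the scheduler pick any known IP that is missing cached intelligence.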

Built-in investigation sources

Current releases can collect:

- verified bot matches based on published ranges and reverse DNS logic
- probable bot hints based on the observed User-Agent
- reverse DNS and forward-confirmed reverse DNS
- RDAP registration details such as network name, organization, country, prefix, and abuse contact
- Spamhaus status (listed or not listed)

Built-in verified bot providers currently cover Google, Bing, Apple, Meta, DuckDuckGo, OpenAI, Perplexity, and Yandex.

When a provider publishes official crawler ranges, the daemon treats those ranges as the source of truth. For provider families that expose several distinct crawlers, it can also require a matching User-Agent token.

`opnsense`

Controls the optional firewall backend.

enabled
- default: false
- When false, the daemon stays in review-only mode and does not call OPNsense.
base_url
- required when enabled: true
- Example: https://router.example.test
api_key
- optional if api_key_file is set
api_secret
- optional if api_secret_file is set
api_key_file
- recommended
- Path to a file containing the OPNsense API key.
api_secret_file
- recommended
- Path to a file containing the OPNsense API secret.
timeout
- default: 8s
insecure_skip_verify
- default: false
- Skips TLS certificate verification for OPNsense API calls. Only use this for development or tightly controlled environments.
ensure_alias
- default: true
- If the target alias does not exist, the daemon will try to create it automatically.

`opnsense.alias`

name
- required when OPNsense is enabled
type
- default: host
description
- default: Managed by caddy-opnsense-blocker

`opnsense.api_paths`

Advanced option for environments where the default OPNsense API endpoints differ.

Defaults:

alias_get_uuid: /api/firewall/alias/get_alias_u_u_i_d/{alias}
alias_add_item: /api/firewall/alias/add_item
alias_set_item: /api/firewall/alias/set_item/{uuid}
alias_reconfigure: /api/firewall/alias/reconfigure
alias_util_list: /api/firewall/alias_util/list/{alias}
alias_util_add: /api/firewall/alias_util/add/{alias}
alias_util_delete: /api/firewall/alias_util/delete/{alias}

`profiles`

profiles is a mapping. Each source references one profile by name.

Example:

profiles:
  public-web:
    auto_block: true
    suspicious_path_prefixes:
      - /wp-admin

Supported fields per profile:

auto_block
- When true, suspicious matches immediately become blocked decisions.
- When false, suspicious matches become review decisions.
min_status
- default: 400
max_status
- default: 599
- Only events within this inclusive status range are evaluated.
block_unexpected_posts
- When true, POST requests are suspicious unless their normalized path is listed in allowed_post_paths.
block_php_paths
- When true, paths ending in .php are suspicious.
allowed_post_paths
- Exact normalized paths that remain allowed for POST requests.
suspicious_path_prefixes
- Prefixes matched against the normalized request path.
- / is rejected because it would be too broad.
excluded_cidrs
- CIDRs always allowed for this profile.
known_agents
- Explicit allow or deny rules matched against User-Agent prefixes, CIDR ranges, or both.
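To show how these fields combine, here is a sketch of a review-first profile. The profile name reuses the earlier example; the POST path, the extra prefix, and the CIDR (a documentation range) are hypothetical.

```yaml
profiles:
  public-web:
    auto_block: false             # suspicious matches become review decisions
    min_status: 400
    max_status: 599               # only events with status 400..599 are evaluated
    block_unexpected_posts: true
    allowed_post_paths:
      - /api/login                # hypothetical; exact normalized path
    block_php_paths: true         # paths ending in .php are suspicious
    suspicious_path_prefixes:
      - /wp-admin
      - /.env                     # hypothetical additional prefix
    excluded_cidrs:
      - 192.0.2.0/24              # documentation range standing in for a trusted network
```

Switching `auto_block` to `true` would turn the same matches into blocked decisions instead of review decisions.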

`profiles.<name>.known_agents[]`

name
- Human-readable rule name.
decision
- Required.
- Must be allow or deny.
user_agent_prefixes
- Optional if cidrs is present.
cidrs
- Optional if user_agent_prefixes is present.

At least one of user_agent_prefixes or cidrs must be defined.
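A sketch of two `known_agents` rules follows. Rule names, User-Agent prefixes, and the CIDR are invented for illustration. How the daemon combines `user_agent_prefixes` and `cidrs` when a rule defines both is not spelled out on this page, so the first rule should be read only as a syntax example.

```yaml
profiles:
  public-web:
    known_agents:
      - name: uptime-monitor          # hypothetical rule
        decision: allow
        user_agent_prefixes:
          - ExampleMonitor/           # hypothetical User-Agent prefix
        cidrs:
          - 198.51.100.0/24           # documentation range
      - name: abusive-scraper         # hypothetical rule
        decision: deny
        user_agent_prefixes:
          - BadScraper/               # cidrs omitted; one of the two lists is enough
```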

`sources`

sources is a list of monitored log files.

Each item supports:

name
- required and unique
path
- required and unique
profile
- required
- Must reference an existing profile name.
initial_position
- default: end
- Accepted values: beginning, end
- end means “follow only new lines on first start”.
poll_interval
- default: 1s
batch_size
- default: 256
- Maximum number of lines read per poll.
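A minimal sketch of a single `sources` entry with the optional fields written out at their defaults. The log path is a hypothetical location, and the profile name assumes a `public-web` profile is defined under `profiles`.

```yaml
sources:
  - name: public-web
    path: /var/log/caddy/public-web.log   # hypothetical log file location
    profile: public-web                   # must match a key under profiles
    initial_position: end                 # follow only new lines on first start
    poll_interval: 1s
    batch_size: 256                       # maximum lines read per poll
```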

Validation rules

The configuration loader rejects:

- an empty profiles map
- an empty sources list
- invalid server.listen_address
- duplicate source names
- duplicate source paths
- sources pointing to unknown profiles
- invalid initial_position values
- invalid status ranges
- overly broad suspicious prefixes such as /
- malformed CIDRs
- invalid known-agent decisions
- missing OPNsense credentials when opnsense.enabled: true
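For contrast, the deliberately invalid fragment below is annotated with the rule each line is expected to trip. It is an illustration rather than loader output: names and paths are hypothetical, and treating `min_status` above `max_status` as an "invalid status range" is an assumption.

```yaml
# Deliberately invalid configuration; do not use as a template.
profiles:
  public-web:
    min_status: 500
    max_status: 400               # assumed invalid status range (min above max)
    suspicious_path_prefixes:
      - /                         # overly broad prefix
    excluded_cidrs:
      - 10.0.0.0/33               # malformed CIDR (prefix length above 32)
    known_agents:
      - name: example-rule
        decision: block           # invalid decision; must be allow or deny
        cidrs:
          - 192.0.2.0/24

sources:
  - name: site-a
    path: /var/log/caddy/site-a.log
    profile: public-web
    initial_position: start       # invalid; accepted values are beginning, end
  - name: site-a                  # duplicate source name
    path: /var/log/caddy/site-a.log   # duplicate source path
    profile: missing-profile      # references an unknown profile
```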

Design note: one source, one profile

Each monitored log path is bound to exactly one profile. This makes it easy to monitor, for example:

- one public web vhost with aggressive auto-blocking
- one Gitea vhost with a more conservative review-first profile
- one internal service with no auto-blocking at all

All of them can still share the same OPNsense alias backend because OPNsense configuration is global.
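The three-vhost scenario could be laid out roughly as follows. This is a sketch: profile names, source names, and log paths are hypothetical, and the rule sets are kept deliberately small.

```yaml
profiles:
  public-web:
    auto_block: true              # aggressive: suspicious matches are blocked
    block_php_paths: true
    suspicious_path_prefixes:
      - /wp-admin
  gitea:
    auto_block: false             # conservative: suspicious matches go to review
    suspicious_path_prefixes:
      - /wp-admin
  internal:
    auto_block: false             # no auto-blocking at all

sources:
  - name: public-web
    path: /var/log/caddy/public-web.log   # hypothetical path
    profile: public-web
  - name: gitea
    path: /var/log/caddy/gitea.log        # hypothetical path
    profile: gitea
  - name: internal
    path: /var/log/caddy/internal.log     # hypothetical path
    profile: internal
```

Each source has a unique name and path and points at exactly one profile, while a single top-level `opnsense` block would serve all three.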
