# caddy-opnsense-blocker

`caddy-opnsense-blocker` is a local-first daemon that follows one or more Caddy access log files in the default JSON format and applies per-source heuristics to every request. It stores events and investigations in SQLite, exposes a lightweight review UI, and can optionally synchronize manual or automatic block decisions to an OPNsense alias.

## Highlights

- Real-time ingestion of multiple Caddy JSON access log files.
- One heuristic profile per log source, so different applications can have different rules while sharing the same OPNsense destination alias.
- Persistent SQLite state for events, IP states, investigations, decisions, backend actions, and source offsets.
- Lightweight web UI with a Pi-hole-style dashboard, source-colored activity charts, split bot/non-bot leaderboards, a paginated requests log with collapsible filters and clickable column sorting, IP detail pages, decision history, and full request history per address.
- The requests log ships with vendored Tabulator assets served locally by the daemon, so the UI stays self-contained and does not depend on a CDN.
- Background investigation workers that fill in missing cached intelligence without slowing down page loads.
- Manual `Block`, `Unblock`, `Clear override`, and `Refresh investigation` actions from the UI or the HTTP API.
- Optional OPNsense integration; the daemon also works in review-only mode.
- Pure-Go build and first-class Nix and NixOS packaging.

## Documentation

- Installation guide: [`docs/install.md`](docs/install.md)
- Configuration reference: [`docs/configuration.md`](docs/configuration.md)
- HTTP API reference: [`docs/api.md`](docs/api.md)
- Example Caddy configuration: [`examples/Caddyfile`](examples/Caddyfile)
- Example systemd unit: [`examples/caddy-opnsense-blocker.service`](examples/caddy-opnsense-blocker.service)

## Requirements

- Linux.
- Caddy access logs written to files in the default JSON format.
- Read access to every configured log file.
- A writable state directory for the SQLite database.
- Outbound DNS and HTTPS access if IP investigation is enabled.
- OPNsense only if you want the daemon to push block or unblock actions to a firewall alias.

## Security model

- The built-in web UI and HTTP API do not provide authentication or TLS.
- The default listen address is `127.0.0.1:9080`; keep it on loopback unless another trusted layer protects access.
- OPNsense credentials should be supplied through files, not inline secrets committed to source control.
- Raw Caddy JSON log entries are stored in SQLite for inspection and auditing; plan storage and retention accordingly.

## How it works

1. A dedicated follower polls each configured log file and keeps a persistent inode plus offset checkpoint.
2. New Caddy JSON lines are parsed into normalized events.
3. Each source is evaluated against exactly one heuristic profile.
4. Events, decisions, and IP state are written to SQLite.
5. Manual overrides can force an IP to `blocked` or `allowed` regardless of later events.
6. If OPNsense is enabled, block and unblock decisions can be applied to one target alias.
7. Missing IP intelligence is fetched in the background and cached for later UI and API reads.

## State model

- `observed`: traffic was recorded, but the current rule set did not produce a suspicious decision.
- `review`: the rule set matched suspicious behavior, but the configured profile did not auto-block.
- `blocked`: the IP is currently blocked, either automatically or through a manual override.
- `allowed`: the IP is explicitly allowed, typically because of a manual override or an allow rule.

`Clear override` removes only the local manual override. It does not directly add or remove the IP on OPNsense.
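
The interaction between computed states and manual overrides can be pictured with a small sketch. The `State` type and the `resolve` function are hypothetical names chosen for illustration, and the sketch assumes that an absent override is represented by an empty value; the daemon's real data model may differ.

```go
package main

import "fmt"

// State mirrors the four states listed above.
type State string

const (
	Observed State = "observed"
	Review   State = "review"
	Blocked  State = "blocked"
	Allowed  State = "allowed"
)

// resolve returns the effective state for an IP. A manual override, when
// present, wins over whatever the heuristics computed. An empty override
// stands for "no override" (an assumption of this sketch).
func resolve(computed, override State) State {
	if override == Blocked || override == Allowed {
		return override
	}
	return computed
}

func main() {
	// The heuristics say "review", but an operator blocked the IP manually.
	fmt.Println(resolve(Review, Blocked))

	// After "Clear override", only the computed state remains.
	fmt.Println(resolve(Review, ""))
}
```

In this model, clearing an override changes only the local input to `resolve`, which is consistent with the note above that it does not touch the OPNsense alias by itself.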

## Investigation model

- IP investigations are cached in SQLite.
- The background worker only fills in missing investigations; it does not continuously re-check cached intelligence.
- Opening an IP details page reuses the cached investigation.
- `Refresh investigation` is the explicit action that forces a new lookup.
- Verified bot detection currently uses built-in provider logic for Google, Bing, Apple, Meta, DuckDuckGo, OpenAI, Perplexity, and Yandex.
- When an official crawler publishes IP ranges, the daemon prefers those ranges and can combine them with User-Agent verification when the provider documents distinct bot user agents.
- When an address is not identified as a verified bot, the daemon can collect reverse DNS, forward-confirmed reverse DNS, RDAP registration details, and Spamhaus DNSBL status.
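
Forward-confirmed reverse DNS, mentioned in the last item, is easy to get subtly wrong, so here is a minimal sketch of the technique. It is an illustration under assumptions, not the daemon's implementation: the `resolver` type and `forwardConfirmed` function are hypothetical, and the lookups are injected so the example runs offline. For real lookups the two fields could be set to the standard library's `net.LookupAddr` and `net.LookupHost`.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// resolver holds the two lookups needed for the check (hypothetical type).
type resolver struct {
	lookupAddr func(ip string) ([]string, error)   // IP -> PTR names
	lookupHost func(host string) ([]string, error) // name -> addresses
}

// forwardConfirmed reports the first PTR name of ip that resolves back to
// the same address, which is what "forward-confirmed" means.
func forwardConfirmed(r resolver, ip string) (string, bool) {
	want := net.ParseIP(ip)
	if want == nil {
		return "", false
	}
	names, err := r.lookupAddr(ip)
	if err != nil {
		return "", false
	}
	for _, name := range names {
		addrs, err := r.lookupHost(name)
		if err != nil {
			continue
		}
		for _, a := range addrs {
			if got := net.ParseIP(a); got != nil && got.Equal(want) {
				return strings.TrimSuffix(name, "."), true
			}
		}
	}
	return "", false
}

func main() {
	// Stub data from documentation ranges so the sketch needs no network.
	stub := resolver{
		lookupAddr: func(ip string) ([]string, error) {
			return []string{"crawler.example.com."}, nil
		},
		lookupHost: func(host string) ([]string, error) {
			return []string{"192.0.2.10"}, nil
		},
	}
	name, ok := forwardConfirmed(stub, "192.0.2.10")
	fmt.Println(name, ok)
}
```

A PTR record alone is controlled by whoever owns the address block, so the forward lookup is the step that ties the claimed name back to the address.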

## Caddy log requirements

The daemon expects Caddy access log entries in the default JSON structure. In practice, these fields must remain present:

- `ts`
- `status`
- `request.remote_ip`
- `request.client_ip` when available
- `request.host`
- `request.method`
- `request.uri`
- `request.headers.User-Agent`

The parser prefers `request.client_ip` and falls back to `request.remote_ip`. If Caddy itself sits behind another proxy or load balancer, configure Caddy so that `request.client_ip` reflects the real client address before feeding those logs to the blocker.
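
To make the field list and the fallback rule concrete, here is a minimal sketch of decoding one log line. The `accessEntry` type and its helpers are hypothetical names, not the daemon's parser, and the sample line is invented for illustration; only the JSON field names come from the list above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// accessEntry covers only the fields listed above (hypothetical type).
type accessEntry struct {
	TS      float64 `json:"ts"` // Unix time in seconds
	Status  int     `json:"status"`
	Request struct {
		RemoteIP string              `json:"remote_ip"`
		ClientIP string              `json:"client_ip"`
		Host     string              `json:"host"`
		Method   string              `json:"method"`
		URI      string              `json:"uri"`
		Headers  map[string][]string `json:"headers"`
	} `json:"request"`
}

// parseLine decodes a single JSON log line.
func parseLine(line []byte) (accessEntry, error) {
	var e accessEntry
	err := json.Unmarshal(line, &e)
	return e, err
}

// clientIP prefers request.client_ip and falls back to request.remote_ip.
func (e accessEntry) clientIP() string {
	if e.Request.ClientIP != "" {
		return e.Request.ClientIP
	}
	return e.Request.RemoteIP
}

// userAgent returns the first User-Agent header value, if any.
func (e accessEntry) userAgent() string {
	if ua := e.Request.Headers["User-Agent"]; len(ua) > 0 {
		return ua[0]
	}
	return ""
}

func main() {
	// Invented sample line; addresses come from documentation ranges.
	line := []byte(`{"ts":1700000000.5,"status":404,"request":{` +
		`"remote_ip":"198.51.100.1","client_ip":"203.0.113.7",` +
		`"host":"example.com","method":"GET","uri":"/wp-login.php",` +
		`"headers":{"User-Agent":["curl/8.0"]}}}`)

	e, err := parseLine(line)
	if err != nil {
		panic(err)
	}
	fmt.Println(e.clientIP(), e.Request.Method, e.Request.URI, e.userAgent())
}
```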

Use one log file per logical source. Different sources can share the same OPNsense alias while using different heuristic profiles.

## Quick start

For a local test run:

```bash
cp config.example.yaml config.yaml
CGO_ENABLED=0 go run ./cmd/caddy-opnsense-blocker -config ./config.yaml
```

Then open the configured address, for example `http://127.0.0.1:9080`.

For production deployment instructions, see [`docs/install.md`](docs/install.md).

## Nix and NixOS

The repository ships with:

- `package.nix`: reusable package definition
- `default.nix`: convenience entry point for `nix-build`
- `module.nix`: reusable NixOS module

Build the package directly from the repository root:

```bash
nix-build
```

Detailed NixOS installation examples are in [`docs/install.md`](docs/install.md).

## HTTP API

The UI is backed by a small JSON API. The main endpoints are:

- `GET /healthz`
- `GET /api/overview?hours=24`
- `GET /api/events`
- `GET /api/ips`
- `GET /api/recent-ips?hours=24`
- `GET /api/ips/{ip}`
- `POST /api/ips/{ip}/investigate`
- `POST /api/ips/{ip}/block`
- `POST /api/ips/{ip}/unblock`
- `POST /api/ips/{ip}/clear-override`

The legacy `POST /api/ips/{ip}/reset` route is still accepted as a backwards-compatible alias for `clear-override`.
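
As a rough sketch of scripting against these routes, the following Go snippet builds the URL for a per-IP action and issues the `POST`. The helper names `actionURL` and `postAction` are hypothetical, and the sketch sends no request body, which is an assumption: the actual payloads and response models are defined in `docs/api.md`. A stand-in test server replaces the daemon so the example runs on its own.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
	"strings"
)

// actionURL builds the URL for one of the per-IP POST routes listed above.
func actionURL(base, ip, action string) string {
	return strings.TrimRight(base, "/") + "/api/ips/" + url.PathEscape(ip) + "/" + action
}

// postAction sends the POST without a body (an assumption of this sketch)
// and returns the HTTP status code.
func postAction(c *http.Client, base, ip, action string) (int, error) {
	resp, err := c.Post(actionURL(base, ip, action), "application/json", nil)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

func main() {
	// Stand-in server so the sketch runs without a daemon; it only logs
	// the route it received and answers 200.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Println(r.Method, r.URL.Path)
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	code, err := postAction(srv.Client(), srv.URL, "203.0.113.7", "block")
	fmt.Println(code, err)
}
```

Against a real instance, `base` would be the configured listen address, for example `http://127.0.0.1:9080`.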

The web UI itself exposes two main pages:

- `GET /` for the dashboard
- `GET /requests` for the paginated requests log

The legacy `GET /queries` route redirects permanently to `GET /requests`.

The full API reference, including payloads and response models, lives in [`docs/api.md`](docs/api.md).

## Configuration

See [`config.example.yaml`](config.example.yaml) for a ready-to-edit example and [`docs/configuration.md`](docs/configuration.md) for a field-by-field reference.

Key ideas:

- each `source` points to one log file
- each `source` references one `profile`
- multiple sources can share the same global OPNsense backend configuration
- `initial_position: end` means “start following new lines only” on first boot

## Development

Run the test suite:

```bash
CGO_ENABLED=0 go test ./...
```

Build the daemon:

```bash
CGO_ENABLED=0 go build ./cmd/caddy-opnsense-blocker
```

Setting `CGO_ENABLED=0` lets the build succeed on systems without a C toolchain, because the application depends only on pure-Go packages.

## Scope and roadmap

This first public version intentionally concentrates on ingestion, persistence, investigation, UI, and OPNsense integration.
The current decision engine is deliberately simple and deterministic. It covers:

- suspicious path prefixes
- unexpected `POST` requests
- `.php` path detection
- explicit known-agent allow and deny rules
- excluded CIDR ranges
- manual overrides

Planned improvements include richer decision strategies, more investigation providers, additional blocking backends, and alternative ingestion transports beyond file polling.

## License

This project is licensed under the MIT License. See [`LICENSE`](LICENSE).