Blockyard provides four observability mechanisms: structured logging, Prometheus metrics, OpenTelemetry tracing, and an append-only audit log.
Logging
Blockyard writes structured JSON logs to stderr. Control verbosity with
the log_level setting:

    [server]
    log_level = "info" # trace, debug, info, warn, error

Or via environment variable:

    BLOCKYARD_SERVER_LOG_LEVEL=debug

Log levels
| Level | Use case |
|---|---|
| trace | Fine-grained diagnostics (WebSocket frames, load-balancer decisions). Noisy. |
| debug | Subsystem internals (health checks, session lifecycle, container operations) |
| info | Normal operations (startup, requests, deploys). Default. |
| warn | Degraded conditions (failed health checks, capacity limits) |
| error | Failures requiring attention (container crashes, build failures) |
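
The levels run from most to least verbose. As a rough illustration of how a threshold such as log_level behaves, the sketch below emits a message only when its level is at or above the configured one. The name should_emit is hypothetical, and this is not Blockyard's own code.

```python
# Levels ordered from most to least verbose, matching the table above.
LEVELS = ["trace", "debug", "info", "warn", "error"]


def should_emit(configured: str, message_level: str) -> bool:
    """Return True if a message at message_level passes the configured threshold."""
    return LEVELS.index(message_level) >= LEVELS.index(configured)


print(should_emit("info", "debug"))  # False
print(should_emit("info", "warn"))   # True
```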
HTTP request logging
All HTTP requests are logged automatically:
- Health probes (/healthz, /readyz) are logged at debug level to avoid noise in production
- Other requests are logged at info (2xx/3xx), warn (4xx), or error (5xx) based on the response status code
Each log entry includes method, path, status, and duration_ms.
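
The rules above can be sketched as a small function. This is an illustration, not Blockyard's implementation: the name request_log_level is hypothetical, and treating every status below 400 as info (including 1xx) is an assumption, since only the 2xx/3xx range is documented.

```python
# Probe paths that are logged at debug level, as described above.
HEALTH_PROBES = {"/healthz", "/readyz"}


def request_log_level(path: str, status: int) -> str:
    """Pick a log level for a finished HTTP request."""
    if path in HEALTH_PROBES:
        return "debug"
    if status >= 500:
        return "error"
    if status >= 400:
        return "warn"
    # Assumption: anything below 400 is logged at info.
    return "info"


print(request_log_level("/readyz", 200))      # debug
print(request_log_level("/api/v1/apps", 404))  # warn
```

The path /api/v1/apps is only a placeholder for a non-probe request.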
Management listener
By default, operational endpoints (/healthz, /readyz, /metrics) are
served on the main listener alongside the application proxy and API. This
means containers running untrusted Shiny app code can reach these endpoints.
For production deployments, configure a separate management listener bound to a loopback address:

    [server]
    bind = "0.0.0.0:3838"
    management_bind = "127.0.0.1:9100"

When management_bind is set:

- /healthz, /readyz, and /metrics move to the management listener and are removed from the main listener
- The management listener requires no authentication; access is controlled by the network binding (loopback = host-only)
- /readyz always returns full per-component check details
- Prometheus can scrape /metrics without a bearer token
- Container bridge networks cannot reach 127.0.0.1, so untrusted workloads cannot access operational data
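
Because the management listener relies on the network binding rather than authentication, it can be worth verifying that a configured value really is a loopback address before deploying. The following is a minimal sketch using Python's standard ipaddress module. The helper name is_loopback_bind is hypothetical, and accepting a bracketed IPv6 address or the hostname localhost is an assumption about what the bind setting allows.

```python
import ipaddress


def is_loopback_bind(bind: str) -> bool:
    """Return True if a host:port bind string points at a loopback address."""
    host, _, _port = bind.rpartition(":")
    host = host.strip("[]")  # allow bracketed IPv6 such as [::1]:9100
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal; only the conventional loopback hostname is accepted.
        return host == "localhost"


print(is_loopback_bind("127.0.0.1:9100"))  # True
print(is_loopback_bind("0.0.0.0:3838"))    # False
```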
When AppRole auth is used (openbao.role_id), /readyz also reports a
vault_token check that reflects whether the token renewal goroutine is
healthy. A stale or expired token degrades readiness, signaling the
operator to re-bootstrap with a fresh secret_id.
Point your health checks and Prometheus scraper at the management port:

    # prometheus.yml
    scrape_configs:
      - job_name: blockyard
        static_configs:
          - targets: ['localhost:9100']

On shutdown, the management listener stops first (health probes fail, signaling load balancers to drain traffic), then the main listener is shut down gracefully.
Prometheus metrics
Enable the /metrics endpoint in your configuration:

    [telemetry]
    metrics_enabled = true

When served on the main listener (no management_bind), the endpoint
requires authentication (bearer token or session cookie). When served on
the management listener, no authentication is required.
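
As an illustration of the two cases, the sketch below builds a /metrics request with or without a bearer token. It assumes the token is sent in the standard `Authorization: Bearer` header. The helper name metrics_request and the token value are hypothetical, and no request is actually sent.

```python
import urllib.request
from typing import Optional


def metrics_request(base_url: str, token: Optional[str] = None) -> urllib.request.Request:
    """Build a GET request for /metrics, adding a bearer token when one is given."""
    req = urllib.request.Request(base_url.rstrip("/") + "/metrics")
    if token is not None:
        # Assumption: the token is presented as a standard Bearer credential.
        req.add_header("Authorization", f"Bearer {token}")
    return req


# Main listener: authentication required ("example-token" is a placeholder).
main = metrics_request("http://localhost:3838", token="example-token")
# Management listener: no authentication required.
mgmt = metrics_request("http://127.0.0.1:9100")
print(main.full_url, main.has_header("Authorization"))
print(mgmt.full_url, mgmt.has_header("Authorization"))
```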
Available metrics
Gauges:
| Metric | Description |
|---|---|
| blockyard_workers_active | Number of currently running worker containers |
| blockyard_sessions_active | Number of active proxy sessions |
Counters:
| Metric | Description |
|---|---|
| blockyard_workers_spawned_total | Total workers spawned |
| blockyard_workers_stopped_total | Total workers stopped |
| blockyard_bundles_uploaded_total | Total bundles uploaded |
| blockyard_bundle_restores_succeeded_total | Successful dependency restores |
| blockyard_bundle_restores_failed_total | Failed dependency restores |
| blockyard_proxy_requests_total | Total requests through the reverse proxy |
| blockyard_health_checks_failed_total | Total failed worker health checks |
| blockyard_audit_entries_dropped_total | Audit log entries dropped due to buffer overflow |
Histograms:
| Metric | Description |
|---|---|
| blockyard_cold_start_seconds | Time from worker spawn to healthy (buckets: 0.5s to 64s) |
| blockyard_proxy_request_seconds | Proxy request duration, excluding cold start |
| blockyard_build_seconds | Bundle dependency restore duration (buckets: 5s to 640s) |
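
The table gives only the first and last bucket boundaries. Both ranges are consistent with eight buckets that double at each step (0.5 × 2⁷ = 64 and 5 × 2⁷ = 640), but that layout is an assumption, not something this page states. Under that assumption, the boundaries would be:

```python
from typing import List


def exponential_buckets(start: float, factor: float, count: int) -> List[float]:
    """Return count bucket upper bounds, each factor times the previous one."""
    return [start * factor**i for i in range(count)]


# Assumed layout: 8 buckets doubling from the documented lower bound.
print(exponential_buckets(0.5, 2, 8))  # [0.5, 1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0]
print(exponential_buckets(5.0, 2, 8))  # [5.0, 10.0, 20.0, 40.0, 80.0, 160.0, 320.0, 640.0]
```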
OpenTelemetry tracing
Send distributed traces to an OpenTelemetry collector:

    [telemetry]
    otlp_endpoint = "http://otel-collector:4317"

The service name is blockyard. Spans include http.method, http.route,
and http.status_code attributes. Endpoints using http://, localhost,
or 127.0.0.1 connect without TLS; all others use TLS.
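
The TLS rule can be read as a small decision function. The sketch below is one literal interpretation of the sentence above, not Blockyard's actual logic: the name uses_tls is hypothetical, host matching is exact, and IPv6 literals are not handled.

```python
def uses_tls(endpoint: str) -> bool:
    """Return True if the endpoint would be contacted over TLS under the stated rule."""
    if endpoint.startswith("http://"):
        return False
    # Drop any scheme, then isolate the host part before the port or path.
    rest = endpoint.split("://", 1)[-1]
    host = rest.split("/", 1)[0].rsplit(":", 1)[0]
    return host not in ("localhost", "127.0.0.1")


print(uses_tls("http://otel-collector:4317"))          # False
print(uses_tls("localhost:4317"))                      # False
print(uses_tls("https://collector.example.com:4317"))  # True
```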
Security headers
All HTTP responses include the following security headers:
| Header | Value |
|---|---|
| X-Content-Type-Options | nosniff |
| X-Frame-Options | DENY |
| Referrer-Policy | strict-origin-when-cross-origin |
| Strict-Transport-Security | max-age=63072000; includeSubDomains (HTTPS only) |
API endpoints additionally set Content-Security-Policy: default-src 'none'; frame-ancestors 'none' and Cache-Control: no-store.
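
A deployment check might compare a captured response against the table above. The sketch below works on a plain dictionary of headers, so it makes no assumption about the HTTP client in use. The helper name missing_security_headers is hypothetical, and Strict-Transport-Security is left out because it is only sent over HTTPS.

```python
from typing import Dict, List

# Headers expected on every response, taken from the table above.
EXPECTED = {
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "DENY",
    "Referrer-Policy": "strict-origin-when-cross-origin",
}


def missing_security_headers(headers: Dict[str, str]) -> List[str]:
    """Return the names of expected headers that are absent or have a different value."""
    # Header names are case-insensitive, so compare on lowercased keys.
    lowered = {name.lower(): value for name, value in headers.items()}
    return [
        name for name, value in EXPECTED.items()
        if lowered.get(name.lower()) != value
    ]


sample = {"x-content-type-options": "nosniff", "X-Frame-Options": "SAMEORIGIN"}
print(missing_security_headers(sample))  # ['X-Frame-Options', 'Referrer-Policy']
```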
Audit logging
Enable append-only audit logging to a JSONL file:

    [audit]
    path = "/data/audit/blockyard.jsonl"

Each line is a JSON object with the following fields:
| Field | Description |
|---|---|
| ts | Timestamp (RFC 3339 with nanoseconds) |
| action | Event type (see below) |
| actor | OIDC sub of the user who triggered the action |
| target | Resource ID (app ID, user sub, etc.), if applicable |
| detail | Additional context (map of key-value pairs), if applicable |
| source_ip | Caller’s IP address, if applicable |
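
Since each line is a standalone JSON object, the file can be read one entry at a time. The following is a minimal sketch of a parser for the fields listed above. The AuditEntry class and parse_audit_line function are hypothetical names, and the sample line is invented for illustration, not real Blockyard output. The timestamp is kept as a string, and every field other than ts and action is treated as optional as a defensive assumption.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AuditEntry:
    ts: str                          # RFC 3339 timestamp, kept as text
    action: str
    actor: Optional[str] = None
    target: Optional[str] = None
    detail: dict = field(default_factory=dict)
    source_ip: Optional[str] = None


def parse_audit_line(line: str) -> AuditEntry:
    """Parse one JSONL line into an AuditEntry, tolerating absent optional fields."""
    raw = json.loads(line)
    return AuditEntry(
        ts=raw["ts"],
        action=raw["action"],
        actor=raw.get("actor"),
        target=raw.get("target"),
        detail=raw.get("detail") or {},
        source_ip=raw.get("source_ip"),
    )


# Invented sample line; all field values are placeholders.
sample = '{"ts": "2024-01-01T12:00:00.123456789Z", "action": "app.create", "actor": "user-123", "target": "app-42"}'
entry = parse_audit_line(sample)
print(entry.action, entry.target, entry.source_ip)  # app.create app-42 None
```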
Audit actions
| Action | Trigger |
|---|---|
| app.create | App created |
| app.update | App settings changed |
| app.delete | App deleted |
| app.start | Worker started for an app |
| app.stop | App workers stopped |
| app.rollback | App rolled back to a previous bundle |
| app.restore | Soft-deleted app restored |
| app.rename | App renamed (old name becomes an alias) |
| bundle.upload | Bundle uploaded |
| bundle.restore.success | Dependency restore completed |
| bundle.restore.fail | Dependency restore failed |
| access.grant | Per-app access granted to a user |
| access.revoke | Per-app access revoked |
| credential.enroll | User enrolled a credential in OpenBao |
| user.login | User logged in via OIDC |
| user.logout | User logged out |
| user.update | User role or active status changed |
| token.create | Personal Access Token created |
| token.revoke | Single PAT revoked |
| token.revoke_all | All PATs revoked for a user |
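
Action names share a dotted prefix (app, bundle, access, credential, user, token), which makes it easy to summarize a log by area. The sketch below tallies entries by that prefix. The function name count_by_namespace is hypothetical and the input lines are invented samples.

```python
import json
from collections import Counter
from typing import Iterable


def count_by_namespace(lines: Iterable[str]) -> Counter:
    """Count audit entries by the part of the action name before the first dot."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines, e.g. a trailing newline at end of file
        action = json.loads(line)["action"]
        counts[action.split(".", 1)[0]] += 1
    return counts


# Invented sample entries reduced to the one field this function reads.
sample_lines = [
    '{"action": "app.create"}',
    '{"action": "bundle.restore.success"}',
    '{"action": "app.delete"}',
    "",
]
print(count_by_namespace(sample_lines))  # Counter({'app': 2, 'bundle': 1})
```

To run it against a real file, pass an open file handle for the configured audit path in place of sample_lines.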
Buffering
Audit entries are buffered in memory (up to 1000 entries) and flushed to
disk by a background writer. If the buffer is full, new entries wait up
to 500 ms for space before being dropped. Dropped entries increment the
blockyard_audit_entries_dropped_total metric. Under normal load,
entries are written within milliseconds.
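
The drop behavior described above resembles a bounded queue whose producers give up after a timeout. The sketch below models that pattern with Python's standard queue module. It is an analogy under stated assumptions, not Blockyard's writer: the AuditBuffer class is hypothetical, the dropped attribute stands in for the blockyard_audit_entries_dropped_total counter, and the plain integer counter is only safe with a single producer.

```python
import queue
from typing import Any, List


class AuditBuffer:
    """Bounded in-memory buffer that drops entries when full for too long."""

    def __init__(self, capacity: int = 1000, wait_seconds: float = 0.5) -> None:
        self._queue: "queue.Queue[Any]" = queue.Queue(maxsize=capacity)
        self._wait = wait_seconds
        self.dropped = 0  # stand-in for the dropped-entries metric

    def submit(self, entry: Any) -> bool:
        """Enqueue an entry, waiting up to wait_seconds for space; return False if dropped."""
        try:
            self._queue.put(entry, timeout=self._wait)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def drain(self) -> List[Any]:
        """Remove and return everything currently buffered (the background writer's job)."""
        items: List[Any] = []
        while True:
            try:
                items.append(self._queue.get_nowait())
            except queue.Empty:
                return items


# Tiny capacity and timeout so the overflow path is easy to see.
buf = AuditBuffer(capacity=2, wait_seconds=0.01)
results = [buf.submit({"action": "user.login"}) for _ in range(3)]
print(results, buf.dropped)  # [True, True, False] 1
```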