Configuration Hot-Reload

Overview

The Wealth trading bot supports hot-reload for configuration changes, allowing you to update trading parameters without restarting the bot. The system automatically detects changes to config.toml, validates them, and applies safe changes immediately while rejecting unsafe changes that require a full restart.

Features

  • Automatic Detection: File system monitoring picks up config.toml changes after a ~500ms debounce
  • Validation: New configuration is validated before applying
  • Safe/Unsafe Classification: Changes are classified as safe (hot-reloadable) or unsafe (requires restart)
  • Atomic Updates: All safe changes apply atomically with automatic rollback on failure
  • Observability: Reload events are tracked via OpenTelemetry metrics and structured logs
  • Zero Disruption: Active trades continue unaffected during reload
  • Transaction Semantics: Rollback on validation failure ensures config consistency

Safe vs Unsafe Changes

Safe Changes (Hot-Reloadable) ✅

These changes can be applied at runtime without restarting the bot:

Trading Parameters ([trading] section):

min_funding_spread = 0.0005          # Minimum funding rate spread
min_expected_value = 0.0006          # Minimum EV threshold
staleness_penalty = 0.0003           # Penalty for stale data
max_concurrent_positions = 5         # Max number of positions
position_size_percent = 0.30         # Position size as % of balance
max_position_usd = 10000            # Maximum position size (USD)
max_notional_per_symbol = 10000     # Max notional per symbol
use_kelly_criterion = true           # Enable Kelly Criterion sizing
kelly_fraction = 0.25                # Kelly criterion fraction
max_exchange_utilization = 0.50      # Max balance utilization per exchange
target_profit_percent = 0.05         # Target profit %
update_interval_secs = 60            # Position update interval
max_hedge_attempts = 5               # Max hedging retry attempts
hedge_tolerance_percent = 0.001      # Hedge tolerance

Risk Management ([risk] section):

max_slippage_bps = 50               # Maximum slippage (basis points)
min_slippage_bps = 10               # Minimum slippage
slippage_volatility_multiplier = 1.5 # Volatility multiplier
market_order_fallback_enabled = true # Enable market order fallback
limit_order_timeout_secs = 5        # Limit order timeout
trailing_stops_enabled = true       # Enable trailing stops
trailing_stop_activation = 0.03     # Trailing stop activation threshold
trailing_stop_distance = 0.40       # Trailing stop distance
trailing_stop_min_lock = 0.02       # Minimum profit lock

# Fee estimates (nested under [risk.fees])
[risk.fees]
estimated_slippage = 0.001

Observability ([observability] section):

metrics_port = 9090                 # Metrics server port (accepted on reload, but the server keeps its original port until restart)
otlp_endpoint = "http://localhost:4317"  # OpenTelemetry endpoint
service_name = "wealth-bot"         # Service name for telemetry
environment = "production"          # Environment label

Licensing ([licensing] section):

license_key = "..."                 # License key (validated on next check)
account_id = "..."                  # Account ID

Unsafe Changes (Require Restart) ⚠️

These changes require a full bot restart to apply safely:

Execution Mode ([execution] section):

mode = "live"                       # paper → live or live → paper

Reason: Switching between paper and live trading requires reinitializing execution clients and ensuring clean state.

Trading Instruments ([[instruments]] sections):

[[instruments]]
exchange = "binance"
symbol = "BTCUSDT"
base_asset = "BTC"
quote_asset = "USDT"

# Adding or removing instruments requires restart

Reason: Adding/removing instruments requires WebSocket reconnection and market data stream initialization.

Leverage Settings ([leverage] section):

default = 3                         # Default leverage

[leverage.overrides]
"binance:BTCUSDT" = 5              # Per-symbol overrides

Reason: Leverage changes require exchange API calls to update margin mode and may affect position calculations.

Resilience Settings ([execution.resilience] section):

[execution.resilience]
circuit_breaker_failures = 5
circuit_breaker_timeout_secs = 60
max_retries = 3
retry_initial_backoff_ms = 100
websocket_max_reconnects = 100

Reason: Changing resilience settings affects circuit breaker and retry state machines that require clean initialization.
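
The safe/unsafe split described above can be sketched as a classification over detected changes. This is a minimal illustration and not the bot's actual code: the variant names MinFundingSpreadChanged, MaxPositionChanged, KellyFractionChanged and ExecutionModeChanged are taken from the log examples in this document, while InstrumentsChanged, LeverageChanged, requires_restart and classify are hypothetical names introduced here.

```rust
/// Execution mode as it appears in the log examples (`Paper`, `Live`).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ExecutionMode {
    Paper,
    Live,
}

/// One detected difference between the running config and the edited file.
/// `InstrumentsChanged` and `LeverageChanged` are hypothetical variants.
#[derive(Debug, Clone, PartialEq)]
pub enum ConfigChange {
    MinFundingSpreadChanged { old: f64, new: f64 },
    MaxPositionChanged { old: u64, new: u64 },
    KellyFractionChanged { old: f64, new: f64 },
    ExecutionModeChanged { old: ExecutionMode, new: ExecutionMode },
    InstrumentsChanged,
    LeverageChanged,
}

impl ConfigChange {
    /// True for changes that cannot be applied at runtime.
    pub fn requires_restart(&self) -> bool {
        matches!(
            self,
            ConfigChange::ExecutionModeChanged { .. }
                | ConfigChange::InstrumentsChanged
                | ConfigChange::LeverageChanged
        )
    }
}

/// Splits a list of changes into `(safe, restart_required)`.
pub fn classify(changes: Vec<ConfigChange>) -> (Vec<ConfigChange>, Vec<ConfigChange>) {
    let (restart_required, safe): (Vec<_>, Vec<_>) =
        changes.into_iter().partition(|c| c.requires_restart());
    (safe, restart_required)
}
```

A reload handler built this way would apply the first list and report the second as requiring a restart.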

Usage

Editing Configuration

  1. Edit config.toml with your preferred editor:

    vim config.toml
    # or
    nano config.toml
    
  2. Save the file. The bot detects the change automatically after the ~500ms debounce

  3. Check logs for reload status:

    INFO wealth::config::watcher: Configuration file changed, reloading... config_path="config.toml"
    INFO wealth::config::reload_handler: Processing config reload safe_changes=3 unsafe_changes=0 requires_restart=false
    INFO wealth::config::reload_handler: Configuration reloaded successfully changes_applied=3 duration_ms=12
    

Safe Change Example

Before (config.toml):

[trading]
min_funding_spread = 0.0004
max_position_usd = 10000

After (edit and save):

[trading]
min_funding_spread = 0.0006    # Increased spread threshold
max_position_usd = 15000       # Increased position limit

Result:

  • Config file change detected within ~500ms (debounce)
  • Validation and application: ~10-50ms
  • Strategy effect: On next evaluation cycle (typically within 60s)
  • No restart required
  • Metrics recorded: wealth_config_reloads_total{status="success"}

Effect Timing: Configuration changes are applied in two phases:

  1. Shared Config Update (~500ms): Config validated and updated after debounce
  2. Strategy Effect (0-60s): Strategy reads config on next evaluation cycle
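
The two phases above can be pictured as a shared value that the reload handler replaces and the strategy snapshots at the start of each cycle. The sketch below assumes an Arc<RwLock<...>> holder; the document does not say which primitive the bot uses, and TradingConfig, publish and evaluate_cycle are hypothetical names.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical subset of the [trading] section.
#[derive(Debug, Clone, PartialEq)]
pub struct TradingConfig {
    pub min_funding_spread: f64,
    pub max_position_usd: f64,
}

/// Assumed sharing primitive; the real bot may use something else.
pub type SharedConfig = Arc<RwLock<TradingConfig>>;

/// Phase 1: the reload handler replaces the shared value after validation.
pub fn publish(shared: &SharedConfig, new_config: TradingConfig) {
    *shared.write().expect("config lock poisoned") = new_config;
}

/// Phase 2: the strategy copies the config once at the start of a cycle,
/// so a reload never changes values in the middle of an evaluation.
pub fn evaluate_cycle(shared: &SharedConfig, observed_spread: f64) -> bool {
    let snapshot = shared.read().expect("config lock poisoned").clone();
    observed_spread >= snapshot.min_funding_spread
}
```

With this shape, a spread of 0.0005 passes under min_funding_spread = 0.0004 and is rejected on the first cycle after 0.0006 is published.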

Unsafe Change Example

Before:

[execution]
mode = "paper"

After (edit and save):

[execution]
mode = "live"    # Attempting to switch to live trading

Result:

WARN wealth::config::reload_handler: Configuration contains unsafe changes - restart required unsafe_changes=[ExecutionModeChanged { old: Paper, new: Live }]
  • Changes NOT applied
  • Current config remains unchanged
  • Bot continues running with old config
  • Manual restart required to apply changes

Monitoring & Observability

Metrics

The following OpenTelemetry metrics track reload operations:

wealth_config_reloads_total (Counter)

  • Labels: status = success, restart_required, no_changes, error
  • Tracks total reload attempts and their outcomes

wealth_config_reload_duration_seconds (Histogram)

  • Tracks time taken to process reload (validation + application)
  • Typical values: 0.010-0.050 (10-50ms) for safe changes, since the metric is recorded in seconds

wealth_config_reload_errors_total (Counter)

  • Labels: error_type = load_failed, validation_failed, apply_failed
  • Tracks reload failures by error type
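
To make the status label set concrete, the sketch below maps each reload outcome to the label values listed above and counts them in memory. It is a stand-in for illustration only: it does not use the OpenTelemetry API, and ReloadStatus and ReloadCounter are hypothetical names.

```rust
use std::collections::HashMap;

/// Reload outcomes, one per `status` label value listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum ReloadStatus {
    Success,
    RestartRequired,
    NoChanges,
    Error,
}

impl ReloadStatus {
    /// Label value as it would appear on wealth_config_reloads_total.
    pub fn as_label(self) -> &'static str {
        match self {
            ReloadStatus::Success => "success",
            ReloadStatus::RestartRequired => "restart_required",
            ReloadStatus::NoChanges => "no_changes",
            ReloadStatus::Error => "error",
        }
    }
}

/// In-memory stand-in for a labelled counter (not a real metrics client).
#[derive(Default)]
pub struct ReloadCounter {
    counts: HashMap<&'static str, u64>,
}

impl ReloadCounter {
    /// Increments the count for the given outcome.
    pub fn record(&mut self, status: ReloadStatus) {
        *self.counts.entry(status.as_label()).or_insert(0) += 1;
    }

    /// Returns the current count for the given outcome (0 if never recorded).
    pub fn get(&self, status: ReloadStatus) -> u64 {
        self.counts.get(status.as_label()).copied().unwrap_or(0)
    }
}
```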

Logs

Hot-reload events generate structured logs with full context:

File Change Detected:

INFO wealth::config::watcher: Configuration file changed, reloading...
  config_path="config.toml"

Reload Processing:

INFO wealth::config::reload_handler: Processing config reload
  safe_changes=3
  unsafe_changes=0
  requires_restart=false

Success:

INFO wealth::config::reload_handler: Configuration reloaded successfully
  changes_applied=3
  duration_ms=12
  changes=["MinFundingSpreadChanged { old: 0.0004, new: 0.0006 }", "MaxPositionChanged { old: 10000, new: 15000 }", "KellyFractionChanged { old: 0.25, new: 0.30 }"]

Restart Required:

WARN wealth::config::reload_handler: Configuration contains unsafe changes - restart required
  unsafe_changes=["ExecutionModeChanged { old: Paper, new: Live }"]

Validation Failure:

ERROR wealth::config::watcher: Failed to load new config, keeping current config
  error="TOML parse error: invalid value for field 'min_funding_spread'"

Grafana Dashboards

Query examples for monitoring hot-reload in Grafana:

Reload Success Rate:

sum(rate(wealth_config_reloads_total{status="success"}[5m]))
/ sum(rate(wealth_config_reloads_total[5m]))

Reload Latency (p95):

histogram_quantile(0.95,
  rate(wealth_config_reload_duration_seconds_bucket[5m])
)

Rejected Reloads (requires restart):

increase(wealth_config_reloads_total{status="restart_required"}[1h])

Best Practices

1. Test Changes in Paper Mode First

# Start in paper mode
export WEALTH__EXECUTION__MODE=paper
wealth run

# Edit config.toml with safe changes
vim config.toml

# Verify logs show successful reload
# Then test with live trading

2. Make Incremental Changes

# Good: Change one parameter at a time
[trading]
min_funding_spread = 0.0005  # Changed from 0.0004

# Avoid: Changing many parameters simultaneously
# (harder to debug if something goes wrong)

3. Monitor Metrics After Reload

# Check metrics endpoint
curl http://localhost:9090/metrics | grep config_reload

# Verify success
wealth_config_reloads_total{status="success"} 1

4. Keep Backup Configurations

# Before making changes
cp config.toml config.toml.backup

# If reload fails, restore quickly
mv config.toml.backup config.toml

5. Use Environment Variables for Sensitive Changes

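# Set the execution mode from the environment instead of editing it in config.toml
# (API keys belong in the encrypted credentials file, not in config.toml)
export WEALTH__EXECUTION__MODE=paper

# Edit config.toml safely
# Restart manually when ready for live trading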

Troubleshooting

Reload Not Detected

Symptom: File changes don't trigger reload

Possible Causes:

  • Editor uses atomic write (creates temp file, then renames)
  • File watcher not initialized (check logs for "File watcher initialized")
  • Insufficient debounce time (rapid successive writes)

Solution:

  • Wait 1-2 seconds after saving
  • Check logs for file watcher errors
  • Verify file permissions on config.toml

Invalid Configuration

Symptom: Reload rejected with validation error

Log:

ERROR wealth::config::watcher: Failed to load new config, keeping current config
  error="missing field `min_funding_spread`"

Solution:

  • Fix validation error in config.toml
  • Refer to config.example.toml for correct format
  • To validate manually, restart the bot and check the startup logs for config errors

Restart Required Warning

Symptom: Changes not applied, logs show "restart required"

Log:

WARN wealth::config::reload_handler: Configuration contains unsafe changes - restart required
  unsafe_changes=["ExecutionModeChanged { old: Paper, new: Live }"]

Solution:

  • Review Unsafe Changes section above
  • Restart bot to apply unsafe changes:
    # Stop bot (Ctrl+C)
    # Edit config.toml with unsafe changes
    wealth run
    

Unexpected Behavior After Reload

Symptom: Bot behavior changed but not as expected

Possible Causes:

  • Multiple safe changes applied simultaneously
  • Values cached by the strategy until its next evaluation cycle (explicit strategy notification is a planned enhancement)
  • Timing of reload during trade execution

Solution:

  • Check logs for exact changes applied
  • Verify current config values via metrics/logs
  • Consider restarting for clean state

Implementation Details

Architecture

The hot-reload system uses a shared configuration architecture:

Data Flow:

1. User edits config.toml
   ↓
2. File system detects change
   ↓
3. Wait 500ms (debounce rapid edits)
   ↓
4. Load and validate new config
   ↓
5. Classify changes:
   ├─→ Safe changes → Apply immediately
   └─→ Unsafe changes → Reject (restart required)
   ↓
6. Strategy uses new config on next cycle (0-60s)
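
Step 3 of the flow above (the 500ms debounce) can be sketched as a small state machine that fires only after the file has been quiet for the full window. This is an assumption about how the debounce might work, not the bot's actual watcher; Debouncer, on_event and should_reload are hypothetical names, and timestamps are passed in explicitly so the logic can be exercised without sleeping.

```rust
use std::time::{Duration, Instant};

/// Collapses a burst of file-change events into a single reload trigger.
pub struct Debouncer {
    quiet_period: Duration,
    last_event: Option<Instant>,
}

impl Debouncer {
    pub fn new(quiet_period: Duration) -> Self {
        Self { quiet_period, last_event: None }
    }

    /// Records a file-change event observed at `now`.
    /// A later event restarts the quiet period.
    pub fn on_event(&mut self, now: Instant) {
        self.last_event = Some(now);
    }

    /// Returns true exactly once per burst, when at least `quiet_period`
    /// has passed since the most recent event.
    pub fn should_reload(&mut self, now: Instant) -> bool {
        if let Some(last) = self.last_event {
            if now.saturating_duration_since(last) >= self.quiet_period {
                self.last_event = None;
                return true;
            }
        }
        false
    }
}
```

Under this design, two saves 200ms apart produce one reload, triggered 500ms after the second save.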

Key Features

  1. Atomic Updates: Config changes apply all-or-nothing with automatic rollback on failure
  2. Zero Downtime: Active trades continue unaffected during reload
  3. Background Integration: Hot-reload watcher starts automatically with other tasks
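
One way to get the all-or-nothing behaviour described above is to validate the complete candidate config first and swap it in only if every check passes, so a failure leaves the running config untouched. The sketch below assumes that approach. TradingParams, validate and apply_reload are hypothetical names, and the validation rules shown are illustrative, not the bot's real rules.

```rust
use std::sync::RwLock;

/// Hypothetical subset of the [trading] section.
#[derive(Debug, Clone, PartialEq)]
pub struct TradingParams {
    pub min_funding_spread: f64,
    pub kelly_fraction: f64,
    pub max_concurrent_positions: u32,
}

/// Illustrative rules only; the real validation is not documented here.
pub fn validate(cfg: &TradingParams) -> Result<(), String> {
    if cfg.min_funding_spread <= 0.0 {
        return Err("min_funding_spread must be positive".to_string());
    }
    if !(0.0..=1.0).contains(&cfg.kelly_fraction) {
        return Err("kelly_fraction must be between 0 and 1".to_string());
    }
    if cfg.max_concurrent_positions == 0 {
        return Err("max_concurrent_positions must be at least 1".to_string());
    }
    Ok(())
}

/// Validates the whole candidate, then replaces the current config in one
/// assignment. On error the current config is never modified.
pub fn apply_reload(
    current: &RwLock<TradingParams>,
    candidate: TradingParams,
) -> Result<(), String> {
    validate(&candidate)?;
    let mut guard = current
        .write()
        .map_err(|_| "config lock poisoned".to_string())?;
    *guard = candidate;
    Ok(())
}
```

Because nothing is written before validation succeeds, "rollback" here amounts to never touching the old value: a candidate with one bad field is rejected as a whole, including its valid fields.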

Performance Impact

Minimal impact on trading performance:

  • Reload processing: 10-50ms for safe changes
  • No interruption to active trades
  • No additional latency for order execution

Limitations

Current Limitations (v0.33.0)

  1. Strategy Effect Delay: Strategy reads config on each evaluation cycle

    • Safe changes update shared config immediately (~500ms)
    • Strategy effect occurs on next evaluation (0-60s delay)
    • This is by design - avoids mid-evaluation config changes
  2. Metrics Server Port: Changing metrics_port requires restart

    • Port binding happens once at startup
    • Change is logged but doesn't take effect until restart
  3. Credentials: API credentials are not hot-reloadable

    • Loaded once at startup from encrypted file
    • Require full restart to change

Future Enhancements

  • Explicit Strategy Notification: Call strategy.apply_config_change() for immediate effect
  • Per-Component Reload: Selective component reinitialization for some unsafe changes
  • Configuration History: Track config change history for audit trail
  • Dynamic Instrument Updates: Add/remove instruments without restart

Security Considerations

  • File Permissions: Ensure config.toml has appropriate permissions (600 recommended)
  • Validation: All configs are validated before applying (invalid configs rejected)
  • Credentials: API keys should be in encrypted credentials.encrypted.json, not config.toml
  • Audit Trail: All reload events logged with timestamps and change details
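
The file-permission recommendation above can be checked programmatically. The sketch below is Unix-only and uses the standard library's PermissionsExt; is_owner_only and check_config_permissions are hypothetical helpers, not part of the bot.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::PermissionsExt;
use std::path::Path;

/// True when no group or other permission bits are set (e.g. 600 or 400).
/// File-type bits in the upper part of the mode are ignored.
pub fn is_owner_only(mode: u32) -> bool {
    (mode & 0o077) == 0
}

/// Reads the mode of `path` and reports whether it is restricted to the owner.
pub fn check_config_permissions(path: &Path) -> io::Result<bool> {
    let mode = fs::metadata(path)?.permissions().mode();
    Ok(is_owner_only(mode))
}
```

A startup check could call check_config_permissions(Path::new("config.toml")) and log a warning when it returns false.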

See Also