Grafana Cloud Setup Guide

Quick 15-Minute Setup for Fully Managed Observability

Overview

This guide shows you how to migrate from self-hosted monitoring (Docker Compose) to Grafana Cloud, a fully managed observability platform. New accounts start with a 14-day trial and can continue on the free tier afterwards.

Benefits:

  • Zero infrastructure - No Prometheus, Grafana, Loki, or Tempo containers to manage
  • Automatic scaling - Handles any volume of metrics/logs/traces
  • High availability - 99.9% uptime SLA
  • Global CDN - Fast dashboard loading worldwide
  • Built-in alerting - Email, Slack, PagerDuty integrations
  • Long-term retention - Metrics for 13 months (free tier)


Quick Start (15 Minutes)

TL;DR - Fast track to Grafana Cloud:

  1. Create account → https://grafana.com/auth/sign-up/create-user
  2. Get OTLP endpoint → Your stack → Send Data → OpenTelemetry → Copy endpoint URL
  3. Configure collector (credentials already in otel-collector-grafana-cloud.yaml):
    # Verify endpoint in otel-collector-grafana-cloud.yaml matches your region
    # File already configured with prod-eu-central-0 credentials
    
  4. Deploy:
    docker compose up -d
    
    # Update .env to point bot to local collector
    OTLP_ENDPOINT=http://localhost:4317
    
    # Run bot
    export CREDENTIALS_PASSPHRASE="your_passphrase"
    wealth run
    
  5. View → https://YOUR-STACK.grafana.net → Explore → Query: wealth_funding_rate

Important: Do NOT send telemetry directly from the bot to Grafana Cloud's managed OTLP gateway. Always use a local OTLP Collector as a proxy (see Architecture and Troubleshooting for why).

For detailed instructions, continue reading below.


Prerequisites

  • ✅ Wealth trading bot installed (see Getting Started)
  • ✅ Docker and Docker Compose installed
  • ✅ Credit card (for Grafana Cloud trial - free tier available after)

Step 1: Create Grafana Cloud Account

1.1 Sign Up

  1. Go to https://grafana.com/auth/sign-up/create-user
  2. Click "Start for free"
  3. Fill in your details:
    • Email address
    • Company name (can use "Personal" or your name)
    • Password
  4. Verify your email

1.2 Create a Stack

After email verification:

  1. You'll be prompted to create a stack (your isolated Grafana Cloud instance)
  2. Choose:
    • Stack name: wealth-trading-bot (or any name you prefer)
    • Region: Choose closest to you (e.g., us-east-1, eu-west-1)
    • Plan: Start with Free Trial (14 days, then free tier)
  3. Click "Launch Stack"

1.3 Access Your Stack

Your stack will be created with a URL like:

https://YOUR-STACK-NAME.grafana.net

Save this URL - you'll need it later.


Step 2: Get Your Cloud Credentials

Note: The repository includes pre-configured credentials for prod-eu-central-0 region in otel-collector-grafana-cloud.yaml. If you're using a different region or want to use your own credentials, follow these steps.

2.1 Find Your OTLP Endpoint

  1. In Grafana Cloud UI, go to your stack homepage
  2. Click "Send Data" or "Connections"
  3. Search for "OpenTelemetry" or "OTLP"
  4. You'll see an endpoint like:
    • https://otlp-gateway-YOUR-REGION.grafana.net/otlp

Example regions:

  • prod-eu-central-0 (Europe - Frankfurt)
  • prod-us-east-0 (US East - Virginia)
  • prod-us-central-0 (US Central)
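
The endpoint pattern above can be sketched as a small shell helper. This is only an illustration: `otlp_endpoint_for_region` is a hypothetical name, and it assumes your stack follows the `otlp-gateway-<region>` pattern shown in the UI. The URL copied from your own stack remains the authoritative value.

```shell
# Hypothetical helper: build the OTLP gateway URL from a region slug.
# Assumes the https://otlp-gateway-<region>.grafana.net/otlp pattern shown above.
otlp_endpoint_for_region() {
  region="$1"
  if [ -z "$region" ]; then
    echo "usage: otlp_endpoint_for_region <region>" >&2
    return 1
  fi
  printf 'https://otlp-gateway-%s.grafana.net/otlp\n' "$region"
}
```

For example, `otlp_endpoint_for_region prod-eu-central-0` prints the Frankfurt gateway URL, which you can compare against the endpoint in otel-collector-grafana-cloud.yaml.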

2.2 Create an Access Token

  1. Go to "Administration" → "Access Policies"
  2. Click "Create access policy"
  3. Set:
    • Name: wealth-bot-telemetry
    • Realms: Your stack
    • Scopes:
      • metrics:write
      • logs:write
      • traces:write
  4. Click "Create"
  5. Click "Add token" → Copy the token (starts with glc_)

2.3 Get Your Instance ID

Your instance ID (username) is visible in the OTLP configuration:

  • Usually a 6-7 digit number (e.g., 1446931)
  • Found in: "Send Data" → "OpenTelemetry" → Look for "Instance ID" or "Username"
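
Before pasting the two values into the collector config, a quick format check can catch copy-paste mistakes. The sketch below is an assumption-laden pre-flight check, not an official tool: `check_cloud_credentials` is a hypothetical name, and it only tests the shapes described above (a numeric instance ID and a token starting with `glc_`). It never contacts Grafana Cloud.

```shell
# Hypothetical pre-flight check for the values collected in Step 2.
# Assumes the instance ID is purely numeric and the token starts with "glc_".
check_cloud_credentials() {
  instance_id="$1"
  token="$2"
  case "$instance_id" in
    ''|*[!0-9]*) echo "instance ID must be numeric" >&2; return 1 ;;
  esac
  case "$token" in
    glc_?*) ;;
    *) echo "token must start with glc_" >&2; return 1 ;;
  esac
  echo "credentials look well-formed"
}
```

Usage: `check_cloud_credentials 1446931 "$TOKEN"`, where `TOKEN` is a shell variable you set yourself for the check.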

2.4 Update Collector Configuration

Edit otel-collector-grafana-cloud.yaml:

exporters:
  otlphttp/grafana_cloud:
    endpoint: "https://otlp-gateway-YOUR-REGION.grafana.net/otlp"  # Update region
    auth:
      authenticator: basicauth/grafana_cloud

extensions:
  basicauth/grafana_cloud:
    client_auth:
      username: "YOUR_INSTANCE_ID"      # Update with your instance ID
      password: "YOUR_API_TOKEN"         # Update with your token (glc_...)

Security Note: These credentials live in the collector config file (otel-collector-grafana-cloud.yaml), not in .env, because only the OTLP Collector container uses them, not the bot itself.
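
When debugging authentication, it can help to know what the basicauth extension puts on the wire. HTTP Basic authentication base64-encodes `username:password` into an `Authorization` header. The sketch below reproduces that encoding locally; `basic_auth_header` is a hypothetical helper name, and the sample values in the usage note are placeholders.

```shell
# Hypothetical helper: print the HTTP Basic auth header for a username/password pair.
basic_auth_header() {
  # tr removes the line wrapping that some base64 implementations add
  encoded=$(printf '%s:%s' "$1" "$2" | base64 | tr -d '\n')
  printf 'Authorization: Basic %s\n' "$encoded"
}
```

Calling `basic_auth_header YOUR_INSTANCE_ID YOUR_API_TOKEN` prints the header the collector is expected to send. Treat the output as a secret, since base64 is an encoding, not encryption.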


Step 3: Configure Bot to Use Local Collector

The bot should send telemetry to the local OTLP Collector, not directly to Grafana Cloud.

3.1 Update Environment Variables

In your .env file:

# Point bot to LOCAL collector (NOT Grafana Cloud directly)
OTLP_ENDPOINT=http://localhost:4317

# Optional: Set service metadata
ENVIRONMENT=production
SERVICE_NAME=wealth-bot

Critical: Do NOT set OTLP_ENDPOINT to Grafana Cloud's managed gateway URL. Direct connections from the Rust SDK have compatibility issues. See Troubleshooting for details.
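
A small guard can confirm that the setting above really points at the local collector. This is a minimal sketch: `check_local_endpoint` is a hypothetical name, and it assumes the value in .env is written unquoted, as in the example above.

```shell
# Hypothetical check: confirm OTLP_ENDPOINT in an env file targets a local collector.
check_local_endpoint() {
  env_file="${1:-.env}"
  # Use the last uncommented OTLP_ENDPOINT assignment in the file
  endpoint=$(grep -E '^OTLP_ENDPOINT=' "$env_file" 2>/dev/null | tail -n 1 | cut -d= -f2-)
  case "$endpoint" in
    http://localhost:*|http://127.0.0.1:*) echo "ok: $endpoint" ;;
    '') echo "OTLP_ENDPOINT is not set in $env_file" >&2; return 1 ;;
    *) echo "OTLP_ENDPOINT should point to the local collector, got: $endpoint" >&2; return 1 ;;
  esac
}
```

Run `check_local_endpoint` from the repository root before starting the bot; a non-zero exit status means the endpoint is missing or points somewhere other than localhost.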


3.2 Collector Configuration Overview

The repository includes a pre-configured OTLP Collector at otel-collector-grafana-cloud.yaml. This collector:

  • Receives OTLP data from the bot (gRPC on port 4317, HTTP on 4318)
  • Processes telemetry (batching, resource detection, filtering)
  • Exports to Grafana Cloud via OTLP/HTTP endpoint
  • Generates additional metrics from traces (span metrics, service graphs)

Key Features:

  • Uses grafanacloud connector for automatic span metrics generation
  • Drops unnecessary resource attributes to reduce cardinality
  • Adds deployment environment and service version to metrics
  • Health check endpoint on port 13133

Configuration is already complete - you only need to update credentials if using a different region (see Step 2).


Step 4: Start the OTLP Collector

4.1 Start the Collector

The Docker Compose configuration is already in your repository at compose.yml (Grafana Cloud is now the default).

# Start the OTLP Collector
docker compose up -d

# Verify it's healthy
curl http://localhost:13133
# Expected: {"status":"Server available","upSince":"..."}

# Check logs
docker compose logs otel-collector
# Look for: "Everything is ready. Begin running and processing data."

What this deploys:

  • Single container: OpenTelemetry Collector (otel/opentelemetry-collector-contrib:latest)
  • Ports exposed:
    • 4317 - OTLP gRPC (bot connects here)
    • 4318 - OTLP HTTP (alternative)
    • 13133 - Health check endpoint
  • Credentials: Read from otel-collector-grafana-cloud.yaml (hardcoded, not from .env)
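
A single health check issued right after `docker compose up -d` may fail simply because the container is still starting. One way to handle that is to poll the health endpoint a few times. The helper below is a generic sketch; `retry_until_ok` is a hypothetical name, and the attempt count and delay in the usage note are arbitrary choices.

```shell
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# Usage: retry_until_ok <attempts> <delay_seconds> <command...>
retry_until_ok() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Success on any attempt ends the loop immediately
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, `retry_until_ok 15 2 curl -fsS http://localhost:13133` waits up to roughly 30 seconds for the collector's health endpoint to respond.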

Step 5: Configure and Run the Bot

5.1 Update Bot Configuration

In your .env file:

# Point to LOCAL collector (NOT Grafana Cloud)
OTLP_ENDPOINT=http://localhost:4317

# Optional: Service metadata
ENVIRONMENT=production
SERVICE_NAME=wealth-bot

Critical: The bot should connect to localhost:4317 (local collector), NOT to Grafana Cloud's managed gateway. The collector handles the connection to Grafana Cloud.

5.2 Verify .env is Gitignored

# Check if .env is gitignored
grep "\.env" .gitignore

# If not, add it
echo ".env" >> .gitignore
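
Note that the grep above also matches lines such as `.env.example`, and the echo appends a new line every time it runs. A stricter variant, sketched below under the hypothetical name `ensure_gitignored`, adds the entry only when that exact line is missing.

```shell
# Hypothetical helper: add a pattern to .gitignore only if that exact line is missing.
ensure_gitignored() {
  pattern="$1"
  file="${2:-.gitignore}"
  # -x matches the whole line, -F treats the pattern literally, -q stays silent
  if ! grep -qxF -- "$pattern" "$file" 2>/dev/null; then
    printf '%s\n' "$pattern" >> "$file"
  fi
}
```

Running `ensure_gitignored .env` repeatedly leaves exactly one `.env` line in .gitignore.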

Step 6: Test the Connection

6.1 Verify Collector is Running

# Check collector health
curl http://localhost:13133
# Expected: {"status":"Server available","upSince":"..."}

# Check collector logs
docker compose logs otel-collector | tail -20
# Look for: "Everything is ready. Begin running and processing data."

# Check for export errors
docker compose logs otel-collector | grep -i error
# Should be empty or only show benign warnings

6.2 Run the Bot

# Set required credentials
export CREDENTIALS_PASSPHRASE="your_passphrase"

# Ensure OTLP_ENDPOINT points to local collector
echo $OTLP_ENDPOINT
# Should output: http://localhost:4317

# Run the bot
wealth run

6.3 Verify Bot is Exporting

# Check bot health endpoint
curl http://localhost:9090/health | jq
# Look for: "otlp_status": "connected" or similar

# Watch bot logs for OTLP connection
# Should see: "OpenTelemetry OTLP export enabled to: http://localhost:4317"
# Should NOT see: "BatchLogProcessor.ExportError" or "Timeout expired"
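
If you capture the bot's output to a file, the two error strings above can be checked automatically. The sketch below reads log lines on stdin; `scan_otlp_errors` is a hypothetical name, and `bot.log` in the usage note is an assumed capture file, not something the bot creates on its own.

```shell
# Hypothetical helper: read bot log lines on stdin and flag OTLP export failures.
# The two patterns are the error strings named above.
scan_otlp_errors() {
  # grep prints any matching lines and succeeds only if at least one matched
  if grep -E 'BatchLogProcessor\.ExportError|Timeout expired'; then
    echo "OTLP export errors found" >&2
    return 1
  fi
  echo "no OTLP export errors found"
}
```

Usage: `scan_otlp_errors < bot.log`. A non-zero exit status means at least one export error line was present.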

6.4 Verify Data in Grafana Cloud

  1. Open your Grafana Cloud instance: https://YOUR-STACK.grafana.net
  2. Go to "Explore"
  3. Select data source: Prometheus (or "grafanacloud-YOUR-STACK-prom")
  4. Run query:
    {__name__=~"wealth.*"}
    
  5. You should see metrics appearing within 10-30 seconds

If no data appears, see Troubleshooting or OTLP Export Issues.


Step 7: Import Dashboards

7.1 Import Pre-built Dashboard

Your repository includes a pre-built dashboard at grafana/grafana-dashboard.json:

  1. In Grafana Cloud UI, go to "Dashboards" → "Import"
  2. Click "Upload JSON file"
  3. Select grafana/grafana-dashboard.json from your repository
  4. Configure:
    • Name: Keep default or rename (e.g., "Wealth Trading Bot")
    • Folder: Select or create a folder
    • Prometheus: Select your Grafana Cloud Prometheus data source
  5. Click "Import"

7.2 Verify Dashboard Data

After import:

  1. Dashboard should populate with data within 30 seconds
  2. Check panels for:
    • Funding Rates by symbol and exchange
    • Account Balances by exchange
    • Trade Execution Metrics (success rate, errors)
    • WebSocket Connection Status

7.3 Dashboard Variables (if needed)

If variables need adjustment:

  1. Open imported dashboard
  2. Click ⚙️ (Settings) → "Variables"
  3. Update $pair variable if needed:
    • Query: label_values(wealth_funding_rate, symbol)
    • Data source: grafanacloud-YOUR-STACK-prom
  4. Save dashboard

Migration Checklist

Use this checklist to track your migration progress:

Pre-Migration

  • Current self-hosted setup working (docker compose ps)
  • Bot exporting metrics (curl http://localhost:9090/health)
  • Optional: Export existing dashboards for backup

Grafana Cloud Setup

  • Create Grafana Cloud account
  • Create stack (save URL)
  • Get Prometheus endpoint + credentials
  • Get Loki endpoint (can reuse same credentials)
  • Get Tempo endpoint (can reuse same credentials)

Configuration

  • Copy .env.example to .env
  • Uncomment and fill GRAFANA_CLOUD_* variables
  • Set GRAFANA_CLOUD_USERNAME and GRAFANA_CLOUD_API_TOKEN
  • Verify .env is in .gitignore

Deployment

  • Stop old stack: docker compose down
  • Start Cloud stack: docker compose up -d
  • Verify collector health: curl http://localhost:13133
  • Check collector logs for errors
  • Run bot: wealth run
  • Verify bot health: curl http://localhost:9090/health

Verification

  • Open Grafana Cloud: https://YOUR-STACK.grafana.net
  • Query wealth_funding_rate in Explore
  • Check metrics appearing (10-30 seconds)
  • Verify logs in Loki
  • Check traces in Tempo (if enabled)

Dashboard & Alerts

  • Import grafana/grafana-dashboard.json
  • Update dashboard data source to Grafana Cloud Prometheus
  • Create alert rules (low balance, WebSocket down, etc.)
  • Set up notification channels (email, Slack)
  • Test notifications

Post-Migration

  • Monitor for 24 hours
  • Check usage: Administration → Usage insights
  • Optional: Clean up old Docker images
  • Update team documentation

Time estimate: ~45 minutes total (~15 minutes active work)


Step 8: Set Up Alerts

8.1 Create Alert Rules

Grafana Cloud has built-in alerting:

  1. In dashboard, click any panel → "Edit"
  2. Click "Alert" tab → "Create alert rule from this panel"
  3. Configure:
    • Name: Low Balance Alert
    • Query: sum(wealth_account_balance{type="total"}) < 1000
    • Threshold: Below 1000
    • Evaluation interval: 1m
  4. Click "Save"

8.2 Notification Channels

Set up notifications:

  1. Go to "Alerting" → "Contact points"
  2. Click "Add contact point"
  3. Choose:
    • Email (free)
    • Slack (requires webhook)
    • PagerDuty (requires integration key)
  4. Test notification → Save

Alert           | Query                                            | Threshold | Severity
----------------|--------------------------------------------------|-----------|---------
Low Balance     | sum(wealth_account_balance{type="total"}) < 1000 | < $1,000  | Critical
High Error Rate | rate(wealth_order_errors_total[5m]) > 0.1        | > 10%     | Warning
WebSocket Down  | wealth_websocket_status == 0                     | == 0      | Critical
Low Win Rate    | wealth_trades_win_rate < 40                      | < 40%     | Warning

Architecture Comparison

Before (Self-Hosted)

┌─────────────────────────────────────────────────────┐
│ Docker Compose (5 containers)                       │
├─────────────────────────────────────────────────────┤
│ Wealth Bot → OTLP Collector                         │
│              ↓                                       │
│              ├→ Prometheus (metrics)                │
│              ├→ Tempo (traces)                      │
│              └→ Loki (logs)                         │
│                 ↓                                    │
│              Grafana (dashboards)                   │
└─────────────────────────────────────────────────────┘

Resource Usage:

  • Memory: ~500-800 MB RAM (all containers)
  • Disk: ~1-5 GB (retention depends on volume)
  • Maintenance: Manual updates, backups, scaling

After (Grafana Cloud)

┌─────────────────────────────────────────────────────────────┐
│ Wealth Bot (Rust)                                           │
│    ↓ OTLP gRPC (localhost:4317)                            │
│ OTLP Collector (Docker)                                     │
│    ↓ OTLP/HTTP + Basic Auth                                │
│ Grafana Cloud                                               │
│   ├→ Prometheus (metrics)                                  │
│   ├→ Loki (logs)                                           │
│   └→ Tempo (traces)                                        │
└─────────────────────────────────────────────────────────────┘

Architecture Benefits:

  • Local Collector as Proxy: Handles batching, retries, and Grafana Cloud-specific requirements
  • No Direct Bot → Cloud Connection: Avoids SDK compatibility issues (see Why Use a Collector)
  • Single Point of Auth: Credentials managed in one place (collector config)
  • Debugging Layer: Collector logs show exactly what's sent to Grafana Cloud

Resource Usage:

  • Memory: ~100-200 MB RAM (OTLP Collector only)
  • Disk: Minimal (no local storage)
  • Maintenance: Zero (fully managed)
  • Network: ~5-10 MB/hour (compressed telemetry)

Why This Architecture?

According to Grafana's best practices:

"We advise that you send directly to an OTLP endpoint in testing or small scale development scenarios only. Use a collector for production."

Direct SDK → Grafana Cloud connections are unreliable due to:

  • Rate limiting on managed endpoints
  • SDK-specific compatibility issues
  • Missing batching/compression optimizations
  • No retry buffer for network failures

See OTLP Troubleshooting for detailed explanation.


Cost Optimization

Free Tier Limits (Grafana Cloud)

Service    | Free Tier Quota                    | Overage Cost
-----------|------------------------------------|----------------------
Metrics    | 10k series, 13 months retention    | $0.30/1k series/month
Logs       | 50 GB ingestion, 2 weeks retention | $0.50/GB
Traces     | 50 GB ingestion, 2 weeks retention | $0.50/GB
Dashboards | 10 users, unlimited dashboards     | Free

Current Bot Usage (Estimated)

Based on your metrics:

  • Metrics: ~500 series (well within free tier)
  • Logs: ~1-2 GB/month (within free tier)
  • Traces: ~500 MB/month (within free tier)

Expected Cost: $0/month (free tier) 🎉
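
To see how the metrics quota translates into cost, the sketch below applies the figures from the free tier table: 10k series included, then $0.30 per additional 1k series per month. `metrics_overage_usd` is a hypothetical name, and the calculation assumes overage is billed proportionally, which may differ from how actual invoices are rounded.

```shell
# Hypothetical estimator using the metrics row of the free tier table:
# 10k series free, then $0.30 per additional 1k series per month.
metrics_overage_usd() {
  awk -v series="$1" 'BEGIN {
    extra = series - 10000
    if (extra < 0) extra = 0
    printf "%.2f\n", extra / 1000 * 0.30
  }'
}
```

With the estimated ~500 series, `metrics_overage_usd 500` prints 0.00. A hypothetical 12,000 series would print 0.60.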

Tips to Stay Within Free Tier

  1. Reduce log volume:

    # In .env, set log level to INFO (not DEBUG)
    RUST_LOG=info
    
  2. Sample traces (if you enable tracing heavily):

    # In otel-collector-grafana-cloud.yaml
    processors:
      probabilistic_sampler:
        sampling_percentage: 10  # Only send 10% of traces
    
  3. Monitor usage:

    • Go to "Administration" → "Usage insights"
    • Check monthly consumption

Troubleshooting

Issue: "Connection refused" to Grafana Cloud

Symptom: OTLP Collector logs show connection errors

Solutions:

  1. Check endpoints in .env:

    grep GRAFANA_CLOUD .env
    # Verify URLs match your region (e.g., us-east-1, eu-west-1)
    
  2. Test connectivity:

    # Test Prometheus endpoint
    curl -u "$GRAFANA_CLOUD_USERNAME:$GRAFANA_CLOUD_API_TOKEN" \
      "$GRAFANA_CLOUD_PROMETHEUS_ENDPOINT"
    
    # Expected: any HTTP response, which proves the endpoint is reachable:
    #   200 = success, 401 = credentials rejected, 404 = wrong path
    # Not expected: Connection refused, timeout
    
  3. Check firewall:

    # Ensure outbound HTTPS (443) is allowed
    telnet prometheus-us-east-1.grafana.net 443
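
The distinction that matters in these checks is whether any HTTP response came back at all. As a sketch, the helper below interprets the status code that `curl -s -o /dev/null -w '%{http_code}'` prints; curl reports 000 when it received no HTTP response. `classify_http_status` is a hypothetical name and the messages are illustrative.

```shell
# Hypothetical helper: interpret the status code printed by
#   curl -s -o /dev/null -w '%{http_code}' <url>
# curl prints 000 when no HTTP response was received at all.
classify_http_status() {
  case "$1" in
    000) echo "unreachable: no HTTP response (DNS, firewall, or timeout)" ;;
    2??) echo "reachable: request accepted" ;;
    401|403) echo "reachable: credentials rejected" ;;
    404) echo "reachable: check the endpoint path" ;;
    *) echo "reachable: unexpected status $1" ;;
  esac
}
```

Usage: `classify_http_status "$(curl -s -o /dev/null -w '%{http_code}' "$GRAFANA_CLOUD_PROMETHEUS_ENDPOINT")"`. Only the "unreachable" result points to a network or firewall problem.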
    

Issue: "Unauthorized" (401) from Grafana Cloud

Symptom: OTLP Collector logs show 401 errors

Solutions:

  1. Verify credentials:

    echo $GRAFANA_CLOUD_API_TOKEN
    # Should start with "glc_"
    
  2. Regenerate token:

    • Go to Grafana Cloud → "Administration" → "Access Policies"
    • Delete old token
    • Create new token with metrics:write, logs:write, traces:write scopes
    • Update .env
  3. Restart collector:

    docker compose restart otel-collector
    

Issue: No metrics in Grafana Cloud

Symptom: Dashboard shows "No data"

Solutions:

  1. Check bot is exporting:

    curl http://localhost:9090/metrics
    # Should show: "backend": "OpenTelemetry OTLP"
    
  2. Check collector is receiving:

    docker compose logs otel-collector | grep "Metric"
    # Should show: "Metrics" ... "sent": true
    
  3. Check data source in Grafana:

    • Go to "Connections" → "Data sources"
    • Click on Prometheus data source
    • Scroll down → "Save & test"
    • Should show: "Data source is working"
  4. Query directly:

    # Query Grafana Cloud API
    curl -u "$GRAFANA_CLOUD_USERNAME:$GRAFANA_CLOUD_API_TOKEN" \
      "https://prometheus-us-east-1.grafana.net/api/v1/query?query=wealth_funding_rate"
    

Issue: High data usage

Symptom: Approaching free tier limits

Solutions:

  1. Check current usage:

    • Grafana Cloud → "Administration" → "Usage insights"
  2. Reduce metric cardinality:

    # In otel-collector-grafana-cloud.yaml
    processors:
      filter:
        metrics:
          exclude:
            match_type: strict
            metric_names:
              - wealth_websocket_message_latency_milliseconds
    
  3. Aggregate metrics:

    processors:
      metricstransform:
        transforms:
          - include: wealth_.*
            match_type: regexp
            action: aggregate_labels
            aggregation_type: sum
            label_set: [exchange]
    
  4. Reduce log verbosity:

    # In .env
    RUST_LOG=warn  # Only warnings and errors
    

Migration Checklist

Before switching to Grafana Cloud:

  • Create Grafana Cloud account
  • Get Prometheus, Loki, and Tempo credentials
  • Create otel-collector-grafana-cloud.yaml
  • Grafana Cloud is now default (compose.yml)
  • Self-hosted stack available at compose.stack.yml
  • Update .env with Grafana Cloud credentials
  • Test connection: docker compose up -d
  • Verify metrics in Grafana Cloud
  • Import dashboards
  • Set up alerts
  • Configure notification channels
  • Update documentation (if team project)
  • Stop old local stack (if running): docker compose -f compose.stack.yml down

Next Steps

After successful migration:

  1. Explore Grafana Cloud features:

    • Pre-built dashboards
    • SLO (Service Level Objectives) tracking
    • Incident management
    • OnCall rotation
  2. Enable advanced features:

    • Grafana Machine Learning - Anomaly detection
    • Grafana Asserts - Automated root cause analysis
    • Grafana Faro - Frontend observability (if you add a web UI)
  3. Integrate with CI/CD:

    • Use Grafana Cloud API for automated dashboard updates
    • Add monitoring checks to GitHub Actions
  4. Join Grafana Community:

    • Slack: https://slack.grafana.com
    • Forum: https://community.grafana.com


Support

If you encounter issues:

  1. Check collector logs: docker compose logs otel-collector
  2. Verify credentials: grep GRAFANA_CLOUD .env
  3. Test Grafana Cloud API: See troubleshooting section above
  4. Consult Grafana Cloud docs: https://grafana.com/docs/grafana-cloud/

Grafana Cloud Support:

  • Free tier: Community support (Slack, forum)
  • Paid plans: Email and chat support

Still stuck? Open an issue in the project repo with:

  • OTLP Collector logs
  • Bot health check output: curl http://localhost:9090/health
  • Grafana Cloud stack URL (redact credentials)