Grafana Cloud Setup Guide
Quick 15-Minute Setup for Fully Managed Observability
Overview
This guide shows you how to migrate from self-hosted monitoring (Docker Compose) to Grafana Cloud, a fully managed observability platform. New accounts start with a 14-day trial and can continue on the free tier afterwards.
Benefits:
- ✅ Zero infrastructure - No Prometheus, Grafana, Loki, or Tempo containers to manage
- ✅ Automatic scaling - Handles any volume of metrics/logs/traces
- ✅ High availability - 99.9% uptime SLA
- ✅ Global CDN - Fast dashboard loading worldwide
- ✅ Built-in alerting - Email, Slack, PagerDuty integrations
- ✅ Long-term retention - Metrics for 13 months (free tier)
Table of Contents
- Quick Start (15 Minutes)
- Prerequisites
- Step 1: Create Grafana Cloud Account
- Step 2: Get Your Cloud Credentials
- Step 3: Configure Bot to Use Local Collector
- Step 4: Start the OTLP Collector
- Step 5: Configure and Run the Bot
- Step 6: Test the Connection
- Step 7: Import Dashboards
- Step 8: Set Up Alerts
- Migration Checklist
- Architecture Comparison
- Cost Optimization
- Troubleshooting
Quick Start (15 Minutes)
TL;DR - Fast track to Grafana Cloud:
- Create account → https://grafana.com/auth/sign-up/create-user
- Get OTLP endpoint → Your stack → Send Data → OpenTelemetry → Copy endpoint URL
- Configure collector (credentials already in otel-collector-grafana-cloud.yaml):
  # Verify endpoint in otel-collector-grafana-cloud.yaml matches your region
  # File already configured with prod-eu-central-0 credentials
- Deploy:
  docker compose up -d
  # Update .env to point bot to local collector
  OTLP_ENDPOINT=http://localhost:4317
  # Run bot
  export CREDENTIALS_PASSPHRASE="your_passphrase"
  wealth run
- View → https://YOUR-STACK.grafana.net → Explore → Query: wealth_funding_rate
Important: Do NOT send telemetry directly from the bot to Grafana Cloud's managed OTLP gateway. Always use a local OTLP Collector as a proxy (see Architecture and Troubleshooting for why).
For detailed instructions, continue reading below.
Prerequisites
- ✅ Wealth trading bot installed (see Getting Started)
- ✅ Docker and Docker Compose installed
- ✅ Credit card (for Grafana Cloud trial - free tier available after)
Step 1: Create Grafana Cloud Account
1.1 Sign Up
- Go to https://grafana.com/auth/sign-up/create-user
- Click "Start for free"
- Fill in your details:
- Email address
- Company name (can use "Personal" or your name)
- Password
- Verify your email
1.2 Create a Stack
After email verification:
- You'll be prompted to create a stack (your isolated Grafana Cloud instance)
- Choose:
  - Stack name: wealth-trading-bot (or any name you prefer)
  - Region: Choose closest to you (e.g., us-east-1, eu-west-1)
  - Plan: Start with Free Trial (14 days, then free tier)
- Click "Launch Stack"
1.3 Access Your Stack
Your stack will be created with a URL like:
https://YOUR-STACK-NAME.grafana.net
Save this URL - you'll need it later.
Step 2: Get Your Cloud Credentials
Note: The repository includes pre-configured credentials for prod-eu-central-0 region in otel-collector-grafana-cloud.yaml. If you're using a different region or want to use your own credentials, follow these steps.
2.1 Find Your OTLP Endpoint
- In Grafana Cloud UI, go to your stack homepage
- Click "Send Data" or "Connections"
- Search for "OpenTelemetry" or "OTLP"
- You'll see an endpoint like:
https://otlp-gateway-YOUR-REGION.grafana.net/otlp
Example regions:
- prod-eu-central-0 (Europe - Frankfurt)
- prod-us-east-0 (US East - Virginia)
- prod-us-central-0 (US Central)
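Because the endpoint follows a fixed pattern, it can be derived from the region name. The following is a minimal sketch, assuming the otlp-gateway-REGION.grafana.net/otlp pattern shown above applies to your stack; otlp_endpoint_for_region is a hypothetical helper name, and the URL displayed on your stack's OpenTelemetry page remains authoritative.

```shell
# Build the OTLP gateway URL for a Grafana Cloud region.
# otlp_endpoint_for_region is a hypothetical helper, not part of the repository.
otlp_endpoint_for_region() {
  region="$1"
  if [ -z "$region" ]; then
    echo "usage: otlp_endpoint_for_region REGION" >&2
    return 1
  fi
  printf 'https://otlp-gateway-%s.grafana.net/otlp\n' "$region"
}

# Example: the region the repository is pre-configured for.
otlp_endpoint_for_region prod-eu-central-0
```

Compare the printed URL with the one in the Grafana Cloud UI before editing the collector config.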
2.2 Create an Access Token
- Go to "Administration" → "Access Policies"
- Click "Create access policy"
- Set:
  - Name: wealth-bot-telemetry
  - Realms: Your stack
  - Scopes: metrics:write, logs:write, traces:write
- Click "Create"
- Click "Add token" → Copy the token (starts with glc_)
2.3 Get Your Instance ID
Your instance ID (username) is visible in the OTLP configuration:
- Usually a 6-7 digit number (e.g., 1446931)
- Found in: "Send Data" → "OpenTelemetry" → Look for "Instance ID" or "Username"
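Before pasting the two values into the collector config, a quick format check can catch copy-and-paste mistakes. This is only a sketch based on the format hints in this guide (numeric instance ID, token starting with glc_); check_grafana_credentials is a hypothetical helper, and passing the check does not prove the credentials are valid.

```shell
# Sanity-check the instance ID and token format described in this guide.
# check_grafana_credentials is a hypothetical helper, not part of the repository.
check_grafana_credentials() {
  instance_id="$1"
  token="$2"
  case "$instance_id" in
    ''|*[!0-9]*)
      echo "instance ID must be numeric" >&2
      return 1
      ;;
  esac
  case "$token" in
    glc_?*) ;;
    *)
      echo "token must start with glc_" >&2
      return 1
      ;;
  esac
  echo "credentials look well-formed"
}

# Example with placeholder values (not real credentials).
check_grafana_credentials 1446931 glc_example_token
```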
2.4 Update Collector Configuration
Edit otel-collector-grafana-cloud.yaml:
exporters:
  otlphttp/grafana_cloud:
    endpoint: "https://otlp-gateway-YOUR-REGION.grafana.net/otlp" # Update region
    auth:
      authenticator: basicauth/grafana_cloud
extensions:
  basicauth/grafana_cloud:
    client_auth:
      username: "YOUR_INSTANCE_ID" # Update with your instance ID
      password: "YOUR_API_TOKEN" # Update with your token (glc_...)
Security Note: These credentials live in the collector config file (otel-collector-grafana-cloud.yaml), not in .env, because only the OTLP Collector container uses them; the bot itself never needs them.
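HTTP Basic authentication sends the instance ID and token as a base64-encoded "username:password" pair in the Authorization header. The sketch below reproduces that header value, which can help when comparing against a manual request during debugging; basic_auth_header is a hypothetical helper and the values are placeholders.

```shell
# Compute the Authorization header value that HTTP Basic auth produces
# for an instance ID and token. basic_auth_header is a hypothetical helper.
basic_auth_header() {
  # tr strips the line wrapping that some base64 implementations add.
  encoded=$(printf '%s:%s' "$1" "$2" | base64 | tr -d '\n')
  printf 'Basic %s\n' "$encoded"
}

# Example with placeholder values (not real credentials).
basic_auth_header 123456 glc_example
```

Treat the output like the token itself: base64 is an encoding, not encryption.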
Step 3: Configure Bot to Use Local Collector
The bot should send telemetry to the local OTLP Collector, not directly to Grafana Cloud.
3.1 Update Environment Variables
In your .env file:
# Point bot to LOCAL collector (NOT Grafana Cloud directly)
OTLP_ENDPOINT=http://localhost:4317
# Optional: Set service metadata
ENVIRONMENT=production
SERVICE_NAME=wealth-bot
Critical: Do NOT set OTLP_ENDPOINT to Grafana Cloud's managed gateway URL. Direct connections from the Rust SDK have compatibility issues. See Troubleshooting for details.
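A small guard can catch a misconfigured endpoint before the bot starts. This is a sketch only: is_local_otlp_endpoint is a hypothetical helper, and it assumes the collector listens on localhost or 127.0.0.1 as described in this guide.

```shell
# Return success only when the endpoint points at a collector on this machine.
# is_local_otlp_endpoint is a hypothetical helper, not part of the bot.
is_local_otlp_endpoint() {
  case "$1" in
    http://localhost:*|http://127.0.0.1:*) return 0 ;;
    *) return 1 ;;
  esac
}

# Fall back to the guide's default when OTLP_ENDPOINT is not set in this shell.
endpoint="${OTLP_ENDPOINT:-http://localhost:4317}"
if is_local_otlp_endpoint "$endpoint"; then
  echo "OK: $endpoint is a local collector"
else
  echo "WARNING: $endpoint is not local; route telemetry through the collector" >&2
fi
```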
Step 4: Start the OTLP Collector
4.1 Collector Configuration Overview
The repository includes a pre-configured OTLP Collector at otel-collector-grafana-cloud.yaml. This collector:
- Receives OTLP data from the bot (gRPC on port 4317, HTTP on 4318)
- Processes telemetry (batching, resource detection, filtering)
- Exports to Grafana Cloud via OTLP/HTTP endpoint
- Generates additional metrics from traces (span metrics, service graphs)
Key Features:
- Uses the grafanacloud connector for automatic span metrics generation
- Drops unnecessary resource attributes to reduce cardinality
- Adds deployment environment and service version to metrics
- Health check endpoint on port 13133
Configuration is already complete - you only need to update credentials if using a different region (see Step 2).
4.2 Start the Collector
The Docker Compose configuration is already in your repository at compose.yml (Grafana Cloud is now the default).
# Start the OTLP Collector
docker compose up -d
# Verify it's healthy
curl http://localhost:13133
# Expected: {"status":"Server available","upSince":"..."}
# Check logs
docker compose logs otel-collector
# Look for: "Everything is ready. Begin running and processing data."
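The collector can take a few seconds to become ready after docker compose up -d, so a single curl may fail even when nothing is wrong. The sketch below polls the health endpoint instead; wait_for_collector is a hypothetical helper, and the default URL assumes the health check port 13133 used in this guide.

```shell
# Poll the collector health endpoint until it answers or the attempts run out.
# wait_for_collector is a hypothetical helper.
# Arguments: URL, number of attempts, delay between attempts in seconds.
wait_for_collector() {
  url="${1:-http://localhost:13133}"
  attempts="${2:-10}"
  delay="${3:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl fail on HTTP error statuses; -s silences progress output.
    if curl -fs --max-time 5 "$url" >/dev/null 2>&1; then
      echo "collector healthy after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "collector not healthy after $attempts attempt(s)" >&2
  return 1
}
```

A typical call would be docker compose up -d followed by wait_for_collector, which uses the defaults above.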
What this deploys:
- Single container: OpenTelemetry Collector (otel/opentelemetry-collector-contrib:latest)
- Ports exposed:
  - 4317 - OTLP gRPC (bot connects here)
  - 4318 - OTLP HTTP (alternative)
  - 13133 - Health check endpoint
- Credentials: Read from otel-collector-grafana-cloud.yaml (hardcoded, not from .env)
Step 5: Configure and Run the Bot
5.1 Update Bot Configuration
In your .env file:
# Point to LOCAL collector (NOT Grafana Cloud)
OTLP_ENDPOINT=http://localhost:4317
# Optional: Service metadata
ENVIRONMENT=production
SERVICE_NAME=wealth-bot
Critical: The bot should connect to localhost:4317 (local collector), NOT to Grafana Cloud's managed gateway. The collector handles the connection to Grafana Cloud.
5.2 Verify .env is Gitignored
# Check if .env is gitignored
grep "\.env" .gitignore
# If not, add it
echo ".env" >> .gitignore
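The grep pattern above also matches entries such as .env.example, and running the echo twice appends a duplicate line. A stricter, idempotent variant could look like the sketch below; ensure_gitignored is a hypothetical helper that edits .gitignore in the current directory.

```shell
# Add an entry to .gitignore only if an identical line is not already present.
# ensure_gitignored is a hypothetical helper; run it from the repository root.
ensure_gitignored() {
  entry="$1"
  touch .gitignore
  # If the file does not end with a newline, add one so entries stay on separate lines.
  if [ -n "$(tail -c 1 .gitignore)" ]; then
    echo >> .gitignore
  fi
  # -x matches the whole line, -F treats the entry as a literal string.
  if ! grep -qxF -- "$entry" .gitignore; then
    printf '%s\n' "$entry" >> .gitignore
  fi
}
```

Calling ensure_gitignored .env repeatedly leaves exactly one .env line in the file.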
Step 6: Test the Connection
6.1 Verify Collector is Running
# Check collector health
curl http://localhost:13133
# Expected: {"status":"Server available","upSince":"..."}
# Check collector logs
docker compose logs otel-collector | tail -20
# Look for: "Everything is ready. Begin running and processing data."
# Check for export errors
docker compose logs otel-collector | grep -i error
# Should be empty or only show benign warnings
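When this check runs inside a script, note that grep exits with a non-zero status when it finds no errors, which is the healthy case here. The sketch below counts matching lines instead so the result is easy to test; count_error_lines is a hypothetical helper and, like the command above, it also counts benign lines that merely contain the word "error".

```shell
# Count log lines on standard input that mention "error" (case-insensitive).
# count_error_lines is a hypothetical helper for use with docker compose logs.
count_error_lines() {
  # grep -c prints 0 and exits non-zero when nothing matches;
  # keep the printed count and discard the exit status.
  grep -ci 'error' || true
}
```

For example, docker compose logs otel-collector | count_error_lines prints 0 for a clean log.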
6.2 Run the Bot
# Set required credentials
export CREDENTIALS_PASSPHRASE="your_passphrase"
# Ensure OTLP_ENDPOINT points to local collector
echo $OTLP_ENDPOINT
# Should output: http://localhost:4317
# Run the bot
wealth run
6.3 Verify Bot is Exporting
# Check bot health endpoint
curl http://localhost:9090/health | jq
# Look for: "otlp_status": "connected" or similar
# Watch bot logs for OTLP connection
# Should see: "OpenTelemetry OTLP export enabled to: http://localhost:4317"
# Should NOT see: "BatchLogProcessor.ExportError" or "Timeout expired"
6.4 Verify Data in Grafana Cloud
- Open your Grafana Cloud instance: https://YOUR-STACK.grafana.net
- Go to "Explore"
- Select data source: Prometheus (or "grafanacloud-YOUR-STACK-prom")
- Run query: {__name__=~"wealth.*"}
- You should see metrics appearing within 10-30 seconds
If no data appears, see Troubleshooting or OTLP Export Issues.
Step 7: Import Dashboards
7.1 Import Pre-built Dashboard
Your repository includes a pre-built dashboard at grafana/grafana-dashboard.json:
- In Grafana Cloud UI, go to "Dashboards" → "Import"
- Click "Upload JSON file"
- Select grafana/grafana-dashboard.json from your repository
- Configure:
- Name: Keep default or rename (e.g., "Wealth Trading Bot")
- Folder: Select or create a folder
- Prometheus: Select your Grafana Cloud Prometheus data source
- Click "Import"
7.2 Verify Dashboard Data
After import:
- Dashboard should populate with data within 30 seconds
- Check panels for:
- Funding Rates by symbol and exchange
- Account Balances by exchange
- Trade Execution Metrics (success rate, errors)
- WebSocket Connection Status
7.3 Dashboard Variables (if needed)
If variables need adjustment:
- Open imported dashboard
- Click ⚙️ (Settings) → "Variables"
- Update $pair variable if needed:
  - Query: label_values(wealth_funding_rate, symbol)
  - Data source: grafanacloud-YOUR-STACK-prom
- Save dashboard
Migration Checklist
Use this checklist to track your migration progress:
Pre-Migration
- Current self-hosted setup working (docker compose ps)
- Bot exporting metrics (curl http://localhost:9090/health)
- Optional: Export existing dashboards for backup
Grafana Cloud Setup
- Create Grafana Cloud account
- Create stack (save URL)
- Get Prometheus endpoint + credentials
- Get Loki endpoint (can reuse same credentials)
- Get Tempo endpoint (can reuse same credentials)
Configuration
- Copy .env.example to .env
- Uncomment and fill GRAFANA_CLOUD_* variables
- Set GRAFANA_CLOUD_USERNAME and GRAFANA_CLOUD_API_TOKEN
- Verify .env is in .gitignore
Deployment
- Stop old stack: docker compose down
- Start Cloud stack: docker compose up -d
- Verify collector health: curl http://localhost:13133
- Check collector logs for errors
- Run bot: wealth run
- Verify bot health: curl http://localhost:9090/health
Verification
- Open Grafana Cloud: https://YOUR-STACK.grafana.net
- Query wealth_funding_rate in Explore
- Check metrics appearing (10-30 seconds)
- Verify logs in Loki
- Check traces in Tempo (if enabled)
Dashboard & Alerts
- Import grafana/grafana-dashboard.json
- Update dashboard data source to Grafana Cloud Prometheus
- Create alert rules (low balance, WebSocket down, etc.)
- Set up notification channels (email, Slack)
- Test notifications
Post-Migration
- Monitor for 24 hours
- Check usage: Administration → Usage insights
- Optional: Clean up old Docker images
- Update team documentation
Time estimate: ~45 minutes total (~15 minutes active work)
Step 8: Set Up Alerts
8.1 Create Alert Rules
Grafana Cloud has built-in alerting:
- In dashboard, click any panel → "Edit"
- Click "Alert" tab → "Create alert rule from this panel"
- Configure:
  - Name: Low Balance Alert
  - Query: sum(wealth_account_balance{type="total"}) < 1000
  - Threshold: Below 1000
  - Evaluation interval: 1m
- Click "Save"
8.2 Notification Channels
Set up notifications:
- Go to "Alerting" → "Contact points"
- Click "Add contact point"
- Choose:
- Email (free)
- Slack (requires webhook)
- PagerDuty (requires integration key)
- Test notification → Save
8.3 Recommended Alerts
| Alert | Query | Threshold | Severity |
|---|---|---|---|
| Low Balance | sum(wealth_account_balance{type="total"}) < 1000 | < $1,000 | Critical |
| High Error Rate | rate(wealth_order_errors_total[5m]) > 0.1 | > 0.1 errors/s | Warning |
| WebSocket Down | wealth_websocket_status == 0 | == 0 | Critical |
| Low Win Rate | wealth_trades_win_rate < 40 | < 40% | Warning |
Architecture Comparison
Before (Self-Hosted)
┌─────────────────────────────────────────────────────┐
│ Docker Compose (5 containers) │
├─────────────────────────────────────────────────────┤
│ Wealth Bot → OTLP Collector │
│ ↓ │
│ ├→ Prometheus (metrics) │
│ ├→ Tempo (traces) │
│ └→ Loki (logs) │
│ ↓ │
│ Grafana (dashboards) │
└─────────────────────────────────────────────────────┘
Resource Usage:
- Memory: ~500-800 MB RAM (all containers)
- Disk: ~1-5 GB (retention depends on volume)
- Maintenance: Manual updates, backups, scaling
After (Grafana Cloud)
┌─────────────────────────────────────────────────────────────┐
│ Wealth Bot (Rust) │
│ ↓ OTLP gRPC (localhost:4317) │
│ OTLP Collector (Docker) │
│ ↓ OTLP/HTTP + Basic Auth │
│ Grafana Cloud │
│ ├→ Prometheus (metrics) │
│ ├→ Loki (logs) │
│ └→ Tempo (traces) │
└─────────────────────────────────────────────────────────────┘
Architecture Benefits:
- ✅ Local Collector as Proxy: Handles batching, retries, and Grafana Cloud-specific requirements
- ✅ No Direct Bot → Cloud Connection: Avoids SDK compatibility issues (see Why Use a Collector)
- ✅ Single Point of Auth: Credentials managed in one place (collector config)
- ✅ Debugging Layer: Collector logs show exactly what's sent to Grafana Cloud
Resource Usage:
- Memory: ~100-200 MB RAM (OTLP Collector only)
- Disk: Minimal (no local storage)
- Maintenance: Zero (fully managed)
- Network: ~5-10 MB/hour (compressed telemetry)
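The hourly network figure can be converted into a monthly estimate for capacity planning. The arithmetic below is a rough sketch that assumes a 30-day month and 1000 MB per GB; monthly_gb is a hypothetical helper.

```shell
# Convert an hourly telemetry rate (MB/hour) into an approximate monthly volume (GB).
# Assumes a 30-day month and 1000 MB per GB; monthly_gb is a hypothetical helper.
monthly_gb() {
  awk -v mb_per_hour="$1" 'BEGIN { printf "%.1f\n", mb_per_hour * 24 * 30 / 1000 }'
}

echo "low estimate:  $(monthly_gb 5) GB/month"
echo "high estimate: $(monthly_gb 10) GB/month"
```

With the 5-10 MB/hour range above, this works out to roughly 3.6-7.2 GB of outbound traffic per month.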
Why This Architecture?
According to Grafana's best practices:
"We advise that you send directly to an OTLP endpoint in testing or small scale development scenarios only. Use a collector for production."
Direct SDK → Grafana Cloud connections are unreliable due to:
- Rate limiting on managed endpoints
- SDK-specific compatibility issues
- Missing batching/compression optimizations
- No retry buffer for network failures
See OTLP Troubleshooting for detailed explanation.
Cost Optimization
Free Tier Limits (Grafana Cloud)
| Service | Free Tier Quota | Overage Cost |
|---|---|---|
| Metrics | 10k series, 13 months retention | $0.30/1k series/month |
| Logs | 50 GB ingestion, 2 weeks retention | $0.50/GB |
| Traces | 50 GB ingestion, 2 weeks retention | $0.50/GB |
| Dashboards | 10 users, unlimited dashboards | Free |
Current Bot Usage (Estimated)
Based on your metrics:
- Metrics: ~500 series (well within free tier)
- Logs: ~1-2 GB/month (within free tier)
- Traces: ~500 MB/month (within free tier)
Expected Cost: $0/month (free tier) 🎉
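To see how far the bot is from incurring charges, the metrics row of the table above can be turned into a quick estimate. This sketch assumes simple pro-rata billing at the listed $0.30 per 1k series beyond the included 10k, which may not match actual invoicing; metrics_overage_usd is a hypothetical helper.

```shell
# Estimate monthly metrics overage from the free tier table:
# 10k series included, $0.30 per additional 1k series.
# metrics_overage_usd is a hypothetical helper and assumes pro-rata billing.
metrics_overage_usd() {
  awk -v series="$1" 'BEGIN {
    extra = series - 10000
    if (extra < 0) extra = 0
    printf "%.2f\n", extra / 1000 * 0.30
  }'
}

metrics_overage_usd 500     # the bot's estimated ~500 series
metrics_overage_usd 12000   # a hypothetical 12k-series workload
```

The first call prints 0.00 and the second prints 0.60, so the bot would need roughly twenty times its current series count before metrics overage applies.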
Tips to Stay Within Free Tier
- Reduce log volume:
  # In .env, set log level to INFO (not DEBUG)
  RUST_LOG=info
- Sample traces (if you enable tracing heavily):
  # In otel-collector-grafana-cloud.yaml
  processors:
    probabilistic_sampler:
      sampling_percentage: 10 # Only send 10% of traces
- Monitor usage:
  - Go to "Administration" → "Usage insights"
  - Check monthly consumption
Troubleshooting
Issue: "Connection refused" to Grafana Cloud
Symptom: OTLP Collector logs show connection errors
Solutions:
- Check endpoints in .env:
  grep GRAFANA_CLOUD .env
  # Verify URLs match your region (e.g., us-east-1, eu-west-1)
- Test connectivity:
  # Test Prometheus endpoint
  curl -u "$GRAFANA_CLOUD_USERNAME:$GRAFANA_CLOUD_API_TOKEN" \
    "$GRAFANA_CLOUD_PROMETHEUS_ENDPOINT"
  # Any HTTP response proves connectivity: 200 (success), 401 (endpoint reachable, credentials rejected), or 404 (endpoint reachable, wrong path)
  # Not expected: Connection refused, timeout
- Check firewall:
  # Ensure outbound HTTPS (443) is allowed
  telnet prometheus-us-east-1.grafana.net 443
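When scripting the connectivity test, it helps to separate network failures from HTTP-level failures. The sketch below maps a status code to a diagnosis; describe_probe_status is a hypothetical helper, and the GRAFANA_CLOUD_* variables in the commented usage lines are the ones this section already assumes exist in your environment.

```shell
# Translate an HTTP status code from a connectivity probe into a diagnosis.
# describe_probe_status is a hypothetical helper. 000 is the value curl's
# %{http_code} reports when no HTTP response was received at all.
describe_probe_status() {
  case "$1" in
    2??) echo "reachable and authenticated" ;;
    401|403) echo "reachable, but credentials were rejected" ;;
    404|405) echo "reachable, but this path or method is not served" ;;
    000) echo "no response: check DNS, firewall, or outbound port 443" ;;
    *) echo "unexpected status $1" ;;
  esac
}

# Usage (requires network access and real credentials):
#   status=$(curl -s -o /dev/null -w '%{http_code}' \
#     -u "$GRAFANA_CLOUD_USERNAME:$GRAFANA_CLOUD_API_TOKEN" \
#     "$GRAFANA_CLOUD_PROMETHEUS_ENDPOINT")
#   describe_probe_status "$status"
```

Only the 000 case points at the network path; every other result means the request reached Grafana Cloud.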
Issue: "Unauthorized" (401) from Grafana Cloud
Symptom: OTLP Collector logs show 401 errors
Solutions:
- Verify credentials:
  echo $GRAFANA_CLOUD_API_TOKEN
  # Should start with "glc_"
- Regenerate token:
  - Go to Grafana Cloud → "Administration" → "Access Policies"
  - Delete old token
  - Create new token with metrics:write, logs:write, traces:write scopes
  - Update .env
- Restart collector:
  docker compose restart otel-collector
Issue: No metrics in Grafana Cloud
Symptom: Dashboard shows "No data"
Solutions:
- Check bot is exporting:
  curl http://localhost:9090/metrics
  # Should show: "backend": "OpenTelemetry OTLP"
- Check collector is receiving:
  docker compose logs otel-collector | grep "Metric"
  # Should show: "Metrics" ... "sent": true
- Check data source in Grafana:
  - Go to "Connections" → "Data sources"
  - Click on Prometheus data source
  - Scroll down → "Save & test"
  - Should show: "Data source is working"
- Query directly:
  # Query Grafana Cloud API
  curl -u "$GRAFANA_CLOUD_USERNAME:$GRAFANA_CLOUD_API_TOKEN" \
    "https://prometheus-us-east-1.grafana.net/api/v1/query?query=wealth_funding_rate"
Issue: High data usage
Symptom: Approaching free tier limits
Solutions:
- Check current usage:
  - Grafana Cloud → "Administration" → "Usage insights"
- Reduce metric cardinality:
  # In otel-collector-grafana-cloud.yaml
  processors:
    filter:
      metrics:
        exclude:
          match_type: strict
          metric_names:
            - wealth_websocket_message_latency_milliseconds
- Aggregate metrics:
  processors:
    metricstransform:
      transforms:
        - include: wealth_.*
          match_type: regexp
          action: update
          operations:
            # aggregate_labels is an operation inside an update transform
            - action: aggregate_labels
              aggregation_type: sum
              label_set: [exchange]
- Reduce log verbosity:
  # In .env
  RUST_LOG=warn # Only warnings and errors
Migration Checklist
Before switching to Grafana Cloud:
- Create Grafana Cloud account
- Get Prometheus, Loki, and Tempo credentials
- Create otel-collector-grafana-cloud.yaml
- Grafana Cloud is now default (compose.yml)
- Self-hosted stack available at compose.stack.yml
- Update .env with Grafana Cloud credentials
- Test connection: docker compose up -d
- Verify metrics in Grafana Cloud
- Import dashboards
- Set up alerts
- Configure notification channels
- Update documentation (if team project)
- Stop old local stack (if running): docker compose -f compose.stack.yml down
Next Steps
After successful migration:
- Explore Grafana Cloud features:
  - Pre-built dashboards
  - SLO (Service Level Objectives) tracking
  - Incident management
  - OnCall rotation
- Enable advanced features:
  - Grafana Machine Learning - Anomaly detection
  - Grafana Asserts - Automated root cause analysis
  - Grafana Faro - Frontend observability (if you add a web UI)
- Integrate with CI/CD:
  - Use Grafana Cloud API for automated dashboard updates
  - Add monitoring checks to GitHub Actions
- Join Grafana Community:
  - Slack: https://slack.grafana.com
  - Forum: https://community.grafana.com
Related Documentation
- Monitoring Guide - All available metrics
- Grafana MCP Setup - AI-powered monitoring (works with Cloud too!)
- Deployment Guide - Production deployment
- Configuration Guide - Bot configuration
Support
If you encounter issues:
- Check collector logs: docker compose logs otel-collector
- Verify credentials: grep GRAFANA_CLOUD .env
- Test Grafana Cloud API: See troubleshooting section above
- Consult Grafana Cloud docs: https://grafana.com/docs/grafana-cloud/
Grafana Cloud Support:
- Free tier: Community support (Slack, forum)
- Paid plans: Email and chat support
Still stuck? Open an issue in the project repo with:
- OTLP Collector logs
- Bot health check output: curl http://localhost:9090/health
- Grafana Cloud stack URL (redact credentials)