Overview
Risk Legion includes built-in health monitoring endpoints and can integrate with external monitoring tools for comprehensive observability.
Built-in Health Checks
Health Endpoint
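A minimal sketch of what a health endpoint of this kind might look like, using only the Python standard library. The `/health` path matches the monitor URL listed under Uptime Monitoring; the response fields, the default port, and the `check_database` helper are assumptions for illustration and not Risk Legion's actual implementation.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def check_database() -> bool:
    """Hypothetical dependency check; replace with a real connectivity test."""
    return True


def build_health_response(checks: dict[str, bool]) -> tuple[int, dict]:
    """Return (HTTP status, JSON body): 200 if every check passes, else 503."""
    healthy = all(checks.values())
    body = {
        "status": "healthy" if healthy else "unhealthy",
        "checks": {name: "ok" if ok else "failed" for name, ok in checks.items()},
    }
    return (200 if healthy else 503), body


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        if self.path != "/health":
            self.send_error(404)
            return
        status, body = build_health_response({"database": check_database()})
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 8000) -> None:
    """Serve the health endpoint; port 8000 is an assumed default."""
    HTTPServer(("0.0.0.0", port), HealthHandler).serve_forever()
```

Returning 503 when a dependency check fails lets uptime monitors and container health checks treat a degraded service as down without parsing the body.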
Docker Health Check
The container health check is configured in the Dockerfile.
Uptime Monitoring
Using UptimeRobot
- Create account at uptimerobot.com
- Add monitors:
| Monitor | URL | Interval |
|---|---|---|
| API Health | https://api.risklegion.com/health | 5 min |
| Frontend | https://app.risklegion.com | 5 min |
- Configure alerts (email, Slack, etc.)
Using Better Uptime
- Create account at betteruptime.com
- Add heartbeat monitors
- Configure status page (optional)
Error Tracking
Sentry Integration
Backend Setup
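A hedged sketch of backend initialisation, assuming a Python backend and the `sentry_sdk` package. The environment variable names `ENVIRONMENT` and `SENTRY_TRACES_SAMPLE_RATE`, the 10% default sample rate, and both helper functions are assumptions for illustration; `dsn`, `environment`, and `traces_sample_rate` are standard `sentry_sdk.init()` options.

```python
import os


def sentry_options(env: dict[str, str]) -> dict:
    """Build keyword arguments for sentry_sdk.init() from environment variables."""
    return {
        "dsn": env.get("SENTRY_DSN", ""),
        "environment": env.get("ENVIRONMENT", "production"),
        # Sample 10% of transactions for performance monitoring (assumed rate).
        "traces_sample_rate": float(env.get("SENTRY_TRACES_SAMPLE_RATE", "0.1")),
    }


def init_sentry() -> bool:
    """Initialise Sentry if a DSN is configured; return True when enabled."""
    options = sentry_options(dict(os.environ))
    if not options["dsn"]:
        return False  # No DSN set: skip error reporting (e.g. local development).
    import sentry_sdk  # Imported lazily so the app still starts without the SDK.

    sentry_sdk.init(**options)
    return True
```

Calling `init_sentry()` once at application startup, before the web framework is created, lets the SDK attach its integrations to later imports.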
Frontend Setup
Alert Configuration
In the Sentry dashboard:
- Go to Alerts
- Create alert rules for:
  - Error spike detection
  - New issue alerts
  - Performance regression
Application Metrics
Prometheus Integration
Backend Metrics
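To illustrate what a backend metrics endpoint exposes, here is a standard-library sketch of a labelled counter rendered in the Prometheus text exposition format. It is a stand-in for a real client library such as `prometheus_client`, not the project's actual code; the `Counter` class and the metric name `http_requests_total` are assumptions, and label values are not escaped.

```python
import threading
from collections import defaultdict


class Counter:
    """A tiny thread-safe labelled counter rendered in Prometheus text format."""

    def __init__(self, name: str, description: str) -> None:
        self.name = name
        self.description = description
        self._values: dict[tuple, float] = defaultdict(float)
        self._lock = threading.Lock()

    def inc(self, **labels: str) -> None:
        """Increment the series identified by the given label set."""
        key = tuple(sorted(labels.items()))
        with self._lock:
            self._values[key] += 1

    def render(self) -> str:
        """Return the HELP/TYPE header followed by one line per label set."""
        lines = [
            f"# HELP {self.name} {self.description}",
            f"# TYPE {self.name} counter",
        ]
        with self._lock:
            for key, value in sorted(self._values.items()):
                label_text = ",".join(f'{k}="{v}"' for k, v in key)
                suffix = f"{{{label_text}}}" if label_text else ""
                lines.append(f"{self.name}{suffix} {value:g}")
        return "\n".join(lines) + "\n"
```

A request middleware would call something like `requests_total.inc(method="GET", status="200")` per request, and a `/metrics` route would return `requests_total.render()` as plain text for Prometheus to scrape.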
Prometheus Configuration
Grafana Dashboards
Create dashboards for:
- Request rate and latency
- Error rate
- Database query performance
- Redis cache hit rate
Log Aggregation
Structured Logging
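A minimal sketch of structured (JSON) logging with Python's standard `logging` module. The field names, the `context` convention for extra fields, and the logger name used in the usage note are assumptions, not the project's actual log schema.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Format each log record as a single JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Fields passed via extra={"context": {...}} are merged into the entry.
        entry.update(getattr(record, "context", {}))
        if record.exc_info:
            entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(entry, default=str)


def configure_logging(level: int = logging.INFO) -> None:
    """Route all root-logger output through the JSON formatter."""
    handler = logging.StreamHandler()  # Writes to stderr by default.
    handler.setFormatter(JsonFormatter())
    logging.basicConfig(level=level, handlers=[handler], force=True)
```

After `configure_logging()`, a call such as `logging.getLogger("risk_legion.api").info("assessment created", extra={"context": {"assessment_id": 42}})` emits one JSON line that log forwarders can parse without custom patterns.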
Log Output
Log Forwarding
Forward logs to:
- AWS CloudWatch: Native EC2 integration
- Datadog: Via agent or API
- ELK Stack: Via Filebeat
Alerting
Alert Types
| Alert | Trigger | Severity |
|---|---|---|
| API Down | Health check fails 3x | Critical |
| High Error Rate | >5% 5xx errors for 5 min | High |
| High Latency | P95 >2s for 5 min | Warning |
| Database Issues | Connection failures | Critical |
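The thresholds in the table can be sketched as a small evaluation function. This is an illustration under stated assumptions: the `Window` structure, the nearest-rank percentile method, and the idea of evaluating one pre-aggregated 5-minute window are not taken from the source, and the Database Issues alert is omitted because its trigger has no numeric threshold.

```python
from dataclasses import dataclass


@dataclass
class Window:
    """Measurements collected over one 5-minute evaluation window."""
    status_codes: list[int]
    latencies_s: list[float]
    consecutive_health_failures: int


def p95(values: list[float]) -> float:
    """95th percentile using the nearest-rank method (0.0 for no data)."""
    if not values:
        return 0.0
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # Integer ceil(0.95 * n).
    return ordered[rank - 1]


def evaluate(window: Window) -> list[tuple[str, str]]:
    """Return the (alert, severity) pairs triggered by this window."""
    alerts = []
    if window.consecutive_health_failures >= 3:
        alerts.append(("API Down", "Critical"))
    total = len(window.status_codes)
    errors = sum(1 for code in window.status_codes if 500 <= code <= 599)
    if total and errors / total > 0.05:
        alerts.append(("High Error Rate", "High"))
    if p95(window.latencies_s) > 2.0:
        alerts.append(("High Latency", "Warning"))
    return alerts
```

In practice these rules would more likely live in the monitoring tool itself (for example as Prometheus alerting rules); the function only makes the arithmetic behind each trigger explicit.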
Notification Channels
Configure alerts via:
- Slack
- PagerDuty
- SMS (Twilio)
Example Slack Alert
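A hedged sketch of sending an alert through a Slack incoming webhook, which accepts a JSON body with a `text` field. The message layout and both function names are assumptions; the webhook URL comes from your own Slack app configuration and should be kept out of source control.

```python
import json
import urllib.request


def build_slack_alert(alert: str, severity: str, detail: str) -> dict:
    """Build the JSON body for a Slack incoming webhook."""
    return {"text": f"[{severity.upper()}] {alert}: {detail}"}


def send_slack_alert(webhook_url: str, payload: dict, timeout: float = 5.0) -> int:
    """POST the payload to the webhook and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

For example, `build_slack_alert("API Down", "Critical", "Health check failed 3 times")` produces the message `[CRITICAL] API Down: Health check failed 3 times`.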
Status Page
Using Atlassian Statuspage
- Create page at statuspage.io
- Add components:
  - API
  - Web Application
  - Database
  - Authentication
- Configure automation with API
Self-Hosted Option
Use Upptime for a GitHub-powered status page.
Runbooks
API Unresponsive
- Check EC2 instance status
- SSH to instance
- Check Docker container:
  docker ps
- Check container logs:
  docker logs risk-legion-api
- Restart if needed:
  docker restart risk-legion-api
Database Connection Issues
- Check Supabase status
- Verify DATABASE_URL is correct
- Check connection pool status
- Restart application to reset connections
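The verification step above can be partly automated. The sketch below assumes `DATABASE_URL` is a PostgreSQL-style URL (Supabase databases are Postgres) and checks only that the host is reachable over TCP; it does not validate credentials. Both function names are hypothetical.

```python
import socket
from urllib.parse import urlsplit


def parse_database_url(url: str) -> tuple[str, int]:
    """Extract (host, port) from a PostgreSQL-style URL, defaulting to port 5432."""
    parts = urlsplit(url)
    if parts.scheme not in ("postgres", "postgresql"):
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("DATABASE_URL has no host")
    return parts.hostname, parts.port or 5432


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the database host succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Running `can_connect(*parse_database_url(os.environ["DATABASE_URL"]))` from inside the application container (with `import os`) separates network or DNS problems from authentication and pool-exhaustion problems.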
High Latency
- Check slow query logs
- Review recent deployments
- Check resource usage:
  docker stats
- Scale resources if needed
Checklist
Basic Monitoring
- Health endpoint configured
- Uptime monitor active
- Alert notifications set up
Error Tracking
- Sentry configured
- Alert rules created
- On-call rotation defined
Metrics
- Prometheus/metrics endpoint
- Grafana dashboards
- Performance baselines set
Logging
- Structured logging enabled
- Log aggregation configured
- Log retention policy set