ServerBee

Monitoring

Real-time system monitoring, dashboards, and historical data.

ServerBee provides real-time monitoring of all your connected servers through a unified web dashboard. Metrics are streamed over WebSocket for instant updates without polling.

Dashboard Overview

The main dashboard shows all registered servers with their current status at a glance:

  • Online/Offline status with color indicators
  • CPU usage percentage with visual bar
  • Memory usage (used / total) with percentage
  • Disk usage (used / total) with percentage
  • Network throughput (upload/download speed)
  • Load average (1/5/15 minute)
  • Uptime duration
  • Region and country flags (when GeoIP is enabled)

Servers are organized by groups and sorted by weight. You can filter, search, and batch-operate on servers from this view.

Real-Time Updates

The browser connects to the server via WebSocket at /ws/browser. The communication flow works as follows:

  1. On initial connection, the server sends a FullSync message containing the current state of all servers
  2. As agents report new metrics, the server broadcasts Update messages to all connected browsers
  3. When an agent connects or disconnects, ServerOnline / ServerOffline events are sent

This means the dashboard updates in real time -- there is no need to refresh the page or wait for polling intervals.
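
The flow above can be sketched as a client-side state reducer. The message names (FullSync, Update, ServerOnline, ServerOffline) come from this page; the payload shapes (id, metrics, online) are illustrative assumptions, not ServerBee's actual wire schema:

```typescript
// Minimal reducer for the /ws/browser message flow.
// Payload field names are assumptions for illustration.
type ServerState = { online: boolean; metrics?: Record<string, number> };

type BrowserMessage =
  | { type: "FullSync"; servers: Record<string, ServerState> }
  | { type: "Update"; id: string; metrics: Record<string, number> }
  | { type: "ServerOnline"; id: string }
  | { type: "ServerOffline"; id: string };

function reduce(
  state: Record<string, ServerState>,
  msg: BrowserMessage,
): Record<string, ServerState> {
  switch (msg.type) {
    case "FullSync":
      // Initial connection: replace local state wholesale.
      return { ...msg.servers };
    case "Update":
      // Periodic metric broadcast: merge into the matching server.
      return {
        ...state,
        [msg.id]: { ...state[msg.id], online: true, metrics: msg.metrics },
      };
    case "ServerOnline":
      return { ...state, [msg.id]: { ...state[msg.id], online: true } };
    case "ServerOffline":
      return { ...state, [msg.id]: { ...state[msg.id], online: false } };
  }
}
```

Because each message is a pure state transition, the browser never needs to re-fetch: applying messages in arrival order keeps the dashboard consistent with the server.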

Metric Types

The agent collects the following metrics at a configurable interval (default: every 3 seconds):

System Resources

Metric       Unit   Description
CPU Usage    %      Overall CPU utilization (0-100)
Memory Used  bytes  Current RAM usage
Swap Used    bytes  Current swap space usage
Disk Used    bytes  Total disk space consumption

Network

Metric                Unit       Description
Network In Speed      bytes/sec  Current download speed
Network Out Speed     bytes/sec  Current upload speed
Network In Transfer   bytes      Cumulative total download since agent start
Network Out Transfer  bytes      Cumulative total upload since agent start

System Load

Metric            Description
Load Average 1m   System load over the last 1 minute
Load Average 5m   System load over the last 5 minutes
Load Average 15m  System load over the last 15 minutes

Connections and Processes

Metric           Description
TCP Connections  Number of active TCP connections
UDP Connections  Number of active UDP connections
Process Count    Total number of running processes

Environment

Metric       Description
Temperature  CPU temperature in degrees Celsius (optional)
Uptime       System uptime in seconds

Disk I/O

The agent collects per-disk read/write throughput on all major platforms:

Metric       Unit       Description
Read Speed   bytes/sec  Read throughput per disk
Write Speed  bytes/sec  Write throughput per disk

Linux: Reads /proc/diskstats directly. Only physical block devices are tracked (e.g., sda, nvme0n1). Virtual devices (loop*, dm-*, ram*, sr*) and partitions are excluded. DiskIo.name is the block device name.

macOS / Windows: Uses the sysinfo Disk::usage() API with mount_point() as the key. This provides per-mount-path semantics (e.g., /, /home, C:\) rather than per-physical-disk. Known limitation: on macOS with APFS, multiple volumes sharing one physical disk may report overlapping I/O counters.

On the first sample after agent startup, a baseline is established and an empty list is reported. Subsequent samples compute delta-based rates.
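
The baseline-then-delta behavior can be sketched as follows. The field names (readBytes, writeBytes) and the shape of the sampler are illustrative assumptions, not the agent's actual implementation:

```typescript
// Sketch of baseline-then-delta disk I/O sampling.
// Field names are assumptions for illustration.
type Counters = { readBytes: number; writeBytes: number };
type Rate = { name: string; readPerSec: number; writePerSec: number };

let baseline: Map<string, Counters> | null = null;

function sample(now: Map<string, Counters>, elapsedSec: number): Rate[] {
  if (baseline === null) {
    // First sample after startup: record a baseline, report nothing.
    baseline = now;
    return [];
  }
  const rates: Rate[] = [];
  for (const [name, cur] of now) {
    const prev = baseline.get(name);
    if (prev) {
      // Delta of cumulative counters divided by elapsed time gives rate.
      rates.push({
        name,
        readPerSec: (cur.readBytes - prev.readBytes) / elapsedSec,
        writePerSec: (cur.writeBytes - prev.writeBytes) / elapsedSec,
      });
    }
  }
  baseline = now;
  return rates;
}
```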

Disk I/O data is stored as a JSON column (disk_io_json) in the records and records_hourly tables. The hourly aggregator computes per-device average read/write rates.

GPU (Optional)

When GPU monitoring is enabled (enable_gpu = true), per-device metrics are collected:

Metric            Description
GPU Name          Device model name
GPU Utilization   GPU core utilization percentage
GPU Memory Used   VRAM usage in bytes
GPU Memory Total  Total VRAM in bytes
GPU Temperature   Device temperature in degrees Celsius

Server Information

In addition to periodic metrics, each agent reports static system information when it first connects:

  • CPU name, core count, and architecture
  • Operating system and kernel version
  • Total memory, swap, and disk capacity
  • IPv4 and IPv6 addresses
  • Virtualization type (KVM, Xen, Docker, etc.)
  • Agent version

This information is displayed on the server detail page and stored in the database.

Historical Data and Charts

ServerBee stores metric records at two levels of granularity:

Raw Records

  • Written every 60 seconds by the RecordWriter background task
  • Retained for 7 days by default (configurable via retention.records_days)
  • Each record captures all metric values at a single point in time

Hourly Aggregated Records

  • Computed by the Aggregator background task
  • Averages of all raw records within each hour
  • Retained for 90 days by default (configurable via retention.records_hourly_days)
  • Used for long-term trend visualization

The dashboard charts automatically switch between raw and hourly data depending on the selected time range.
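
The switch between raw and hourly data can be sketched as a small helper. The 24-hour cutoff is an assumption inferred from the time range table on this page, not a confirmed ServerBee constant:

```typescript
// Pick a data granularity for a chart time range.
// The 24h cutoff is an assumed threshold for illustration.
type Granularity = "raw" | "hourly";

function granularityFor(rangeHours: number): Granularity {
  // Raw records are only retained for days, so longer ranges
  // fall back to hourly aggregates.
  return rangeHours <= 24 ? "raw" : "hourly";
}
```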

GPU Records

GPU metrics are stored separately in a dedicated table with per-device granularity. Each record includes the device index, name, memory, utilization, and temperature. These are retained for 7 days by default.

Server Groups

Organize your servers into logical groups for easier management:

  • Create groups with custom names and sort weights
  • Assign servers to groups
  • Groups appear as sections in the dashboard
  • Sort weight controls the display order (lower weight = higher position)

Groups can represent environments (production, staging), regions (US-East, EU-West), providers (AWS, Hetzner), or any other organizational structure that makes sense for your setup.

Server Details

Each server has a detail page showing:

  • Real-time streaming charts (default mode)
  • System information (hardware, OS, network)
  • Historical trend charts with time range selection
  • Disk I/O charts with merged and per-disk views (all platforms, historical mode)
  • 90-day uptime timeline with daily availability breakdown
  • Server metadata (group, tags, remarks, pricing)
  • Actions (terminal access, edit, delete)

Real-Time Charts

The server detail page defaults to Real-time mode. In this mode, charts display live data streamed from WebSocket updates:

  • Data source: Accumulated from BrowserMessage::Update events via the ['servers'] TanStack Query cache
  • Update interval: ~3 seconds (matches the agent report interval)
  • Buffer size: 10-minute ring buffer (~200 data points), automatically trimmed
  • Deduplication: Uses the server-side last_active timestamp to filter duplicate events
  • Available charts: CPU, Memory, Disk, Network In/Out, Load Average (1m)
  • Time axis: First tick shows HH:mm:ss, subsequent ticks show mm:ss

Temperature, GPU, and Disk I/O charts are not available in real-time mode because the WebSocket ServerStatus message does not include these fields. Switch to a historical view to see temperature, GPU, and disk I/O data.
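
The buffer and deduplication behavior described above can be sketched as a pure function. The point shape and the exact trim rule are illustrative assumptions:

```typescript
// Sketch of a 10-minute ring buffer with last_active deduplication.
// The point shape (lastActive in unix seconds, cpu) is an assumption.
type Point = { lastActive: number; cpu: number };

const WINDOW_SEC = 600; // 10-minute buffer

function push(buffer: Point[], point: Point): Point[] {
  // Deduplicate: the server-side last_active timestamp identifies a
  // report, so a repeated broadcast of the same report is dropped.
  if (
    buffer.length > 0 &&
    buffer[buffer.length - 1].lastActive === point.lastActive
  ) {
    return buffer;
  }
  const next = [...buffer, point];
  // Trim anything older than the window relative to the newest point.
  const cutoff = point.lastActive - WINDOW_SEC;
  return next.filter((p) => p.lastActive >= cutoff);
}
```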

Disk I/O Charts

When historical disk I/O data is available, the server detail page displays a Disk I/O chart with two views:

  • Merged -- Combined read/write throughput across all physical disks
  • Per Disk -- Individual charts for each physical disk (e.g., sda, nvme0n1)

Both views show read speed (blue) and write speed (green) as area charts. Missing data points are filled with zero values to maintain a continuous timeline.

Uptime Timeline

The server detail page includes an uptime card with a 90-day timeline. Each day is shown as a colored bar:

  • Green -- 100% uptime
  • Yellow -- Below the yellow threshold (degraded)
  • Red -- Below the red threshold (major outage)
  • Gray -- No data

Uptime data is queried via GET /api/servers/{server_id}/uptime-daily?days=90. The endpoint returns an UptimeDailyEntry per day with date, online_minutes, total_minutes, and uptime_percent fields. Missing dates are gap-filled with zero values.
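
The gap-filling can be sketched as follows. The UptimeDailyEntry field names match this page; representing dates as ISO day strings is a simplification for illustration:

```typescript
// Fill missing days with zero entries so the 90-day timeline has no holes.
// Zero total_minutes renders as the gray "No data" state.
type UptimeDailyEntry = {
  date: string; // "YYYY-MM-DD"
  online_minutes: number;
  total_minutes: number;
  uptime_percent: number;
};

function gapFill(
  entries: UptimeDailyEntry[],
  days: string[],
): UptimeDailyEntry[] {
  const byDate = new Map(
    entries.map((e): [string, UptimeDailyEntry] => [e.date, e]),
  );
  return days.map(
    (date) =>
      byDate.get(date) ?? {
        date,
        online_minutes: 0,
        total_minutes: 0,
        uptime_percent: 0,
      },
  );
}
```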

Network Quality Views

The /network overview and /network/{server_id} detail pages summarize configured probe targets for each server.

  • Newly assigned targets appear immediately, even before the first probe result is written
  • Targets without probe data render a no-data state instead of disappearing from the summary
  • The overview search box follows the active UI language

Time Range Selector

The time range bar offers these options:

Mode       Data Source                Description
Real-time  WebSocket ring buffer      Live streaming data (default)
1h         REST API (raw records)     Last 1 hour from database
6h         REST API (raw records)     Last 6 hours
24h        REST API (raw records)     Last 24 hours
7d         REST API (hourly records)  Last 7 days (aggregated)
30d        REST API (hourly records)  Last 30 days (aggregated)

When switching from Real-time to a historical view, the REST API queries are enabled and chart data loads from the database. When switching back to Real-time, the accumulated buffer data is displayed immediately (data continues to accumulate in the background even while viewing historical data).

Data Flow

Agent                  Server                    Browser
  |                      |                         |
  |-- Report (3s) ------>|                         |
  |                      |-- cache in AgentManager |
  |                      |                         |
  |                      |-- Update (broadcast) -->|
  |                      |                     real-time UI
  |                      |                         |
  |                      |-- RecordWriter (60s)    |
  |                      |   writes to SQLite      |
  |                      |                         |
  |                      |-- Aggregator (hourly)   |
  |                      |   hourly averages       |
  |                      |                         |
  |                      |-- Cleanup (hourly)      |
  |                      |   delete old records    |
The agent reports every 3 seconds. The server caches the latest report in memory and immediately broadcasts it to connected browsers. Every 60 seconds, all cached reports are batch-written to SQLite. Every hour, raw records are aggregated into hourly summaries, and expired data is cleaned up based on retention settings.

Traffic Statistics

ServerBee tracks network traffic at hourly and daily granularity, enabling billing-cycle-aware usage monitoring with prediction capabilities.

How It Works

Traffic tracking is integrated into the existing metric recording pipeline:

  1. Each agent reports cumulative network byte counters (net_in_transfer, net_out_transfer) every 3 seconds
  2. The RecordWriter calculates the delta between consecutive reports and accumulates hourly traffic
  3. The Aggregator rolls up hourly data into daily totals (timezone-aware via scheduler.timezone)
  4. The Cleanup task removes expired traffic records based on retention settings
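
Step 2 can be sketched as a delta accumulator over the cumulative counters. Treating a negative delta as a counter reset (e.g., after an agent restart) is my assumption for robustness, not documented ServerBee behavior:

```typescript
// Accumulate hourly traffic from a cumulative byte counter.
// Reset handling (negative delta) is an assumption for illustration.
let prevCounter: number | null = null;
let hourlyBytes = 0;

function accumulate(counter: number): number {
  if (prevCounter !== null) {
    const delta = counter - prevCounter;
    // A shrinking counter means the source restarted from zero;
    // count the new absolute value instead of a negative delta.
    hourlyBytes += delta >= 0 ? delta : counter;
  }
  prevCounter = counter;
  return hourlyBytes;
}
```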

Traffic API

Query traffic statistics for any server via GET /api/servers/{id}/traffic. The response includes:

  • Cycle totals -- Total bytes in/out for the current billing cycle
  • Usage percentage -- Traffic used vs. the server's traffic limit (if configured)
  • Prediction -- Estimated end-of-cycle usage based on current consumption rate
  • Daily breakdown -- Per-day traffic totals within the billing cycle
  • Hourly breakdown -- Per-hour traffic totals for today
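
The prediction can be sketched as a linear extrapolation. This page only says the estimate is based on the current consumption rate; the exact used/elapsed formula below is my assumption about how it is computed:

```typescript
// Estimate end-of-cycle usage by extrapolating the average daily rate.
// The linear used/elapsed formula is an assumed model, not confirmed.
function predictCycleUsage(
  usedBytes: number,
  elapsedDays: number,
  cycleDays: number,
): number {
  if (elapsedDays <= 0) return usedBytes; // cycle just started: no rate yet
  return (usedBytes / elapsedDays) * cycleDays;
}
```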

Billing Cycles

Traffic is queried within billing cycles determined by:

  • billing_cycle -- The cycle type: monthly (default), quarterly, or yearly
  • billing_start_day -- The day of month when the billing cycle starts (1-28, default 1)

For example, if billing_start_day = 15 and billing_cycle = monthly, the cycle runs from the 15th of each month to the 14th of the next month.
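
A minimal sketch of the monthly boundary rule, assuming UTC dates. Because billing_start_day is capped at 28, no month-length edge cases arise; quarterly and yearly cycles are omitted here:

```typescript
// Find the start of the monthly billing cycle containing `now`.
// Assumes UTC; billing_start_day is 1-28 per the docs.
function cycleStart(now: Date, billingStartDay: number): Date {
  const y = now.getUTCFullYear();
  const m = now.getUTCMonth();
  if (now.getUTCDate() >= billingStartDay) {
    return new Date(Date.UTC(y, m, billingStartDay));
  }
  // Before the start day: the cycle began in the previous month.
  return new Date(Date.UTC(y, m - 1, billingStartDay));
}
```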

Frontend Display

The server detail page shows a collapsible traffic card with:

  • Progress bar -- Visual usage against the traffic limit
  • Daily chart -- Bar chart showing daily in/out traffic within the cycle
  • Hourly chart -- Bar chart showing today's hourly traffic breakdown
  • Prediction -- Estimated total usage by end of cycle

Data Retention

Data Type       Default Retention  Config Key
Traffic hourly  7 days             retention.traffic_hourly_days
Traffic daily   400 days           retention.traffic_daily_days

Scheduler Timezone

Daily traffic aggregation respects the scheduler.timezone setting. Set this to your billing timezone (e.g., Asia/Shanghai) to ensure daily totals align with your billing provider's day boundaries.

Docker Container Monitoring

ServerBee supports real-time Docker container monitoring when the agent has access to the Docker daemon. This feature requires the CAP_DOCKER capability to be enabled.

Overview

The Docker monitoring page shows:

  • Overview cards -- Running/Stopped container counts, total CPU usage, total memory usage, Docker version
  • Container list -- Searchable, filterable table with Name, Image, Status, CPU%, Memory, Network I/O
  • Events timeline -- Container lifecycle events (start, stop, die, create, destroy) in reverse chronological order
  • Networks dialog -- List of Docker networks with driver, scope, and container count
  • Volumes dialog -- List of Docker volumes with driver, mountpoint, and creation time

Container Details

Click any container row to view:

  • Container info -- Image, status, ports, creation time, container ID
  • Stats cards -- CPU usage, memory (with progress bar), network I/O, block I/O
  • Log stream -- Real-time log output with:
    • stdout in white text, stderr in red text
    • Follow toggle for auto-scrolling
    • Clear button to reset the log view
    • Connection status indicator (Connected/Disconnected)

How It Works

  1. When browsers navigate to the Docker page, a DockerSubscribe message is sent via WebSocket
  2. The server instructs the connected agent to start Docker monitoring
  3. The agent uses the bollard Docker API client to poll containers and stats
  4. Updates are broadcast to subscribing browsers in real time
  5. When all browsers leave the Docker page, monitoring stops automatically (viewer ref-counting)

Container logs use a dedicated WebSocket endpoint (/api/ws/docker/logs/{server_id}) with subscribe/unsubscribe protocol for per-container log streaming.
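
The viewer ref-counting in step 5 can be sketched as follows. The start/stop hooks are illustrative; ServerBee's internal types are not shown in these docs:

```typescript
// Monitoring runs while at least one browser views the Docker page.
// This counter shape is an illustrative sketch, not ServerBee's code.
class ViewerRefCount {
  private viewers = 0;
  public monitoring = false;

  subscribe(): void {
    if (this.viewers === 0) this.monitoring = true; // first viewer starts polling
    this.viewers += 1;
  }

  unsubscribe(): void {
    this.viewers = Math.max(0, this.viewers - 1);
    if (this.viewers === 0) this.monitoring = false; // last viewer stops it
  }
}
```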

Data Retention

Data Type      Default Retention  Config Key
Docker events  7 days             retention.docker_events_days

Container stats and logs are not persisted -- they are streamed in real time only.

Network Quality Monitoring

ServerBee includes a built-in network quality monitoring system that probes network targets from each agent and visualizes latency, packet loss, and anomalies.

Preset Targets

96 preset probe targets are embedded in the server binary (not stored in the database):

  • China Telecom -- 31 provincial nodes (TCP probe via Zstatic CDN)
  • China Unicom -- 31 provincial nodes (TCP probe via Zstatic CDN)
  • China Mobile -- 31 provincial nodes (TCP probe via Zstatic CDN)
  • International -- Cloudflare (1.1.1.1), Google DNS (8.8.8.8), AWS Tokyo (ICMP probe)

China targets use domain names ({province}-{isp}-v4.ip.zstaticcdn.com:80) that auto-resolve to the latest CDN node IPs. No IP maintenance is needed.

Preset targets are read-only and cannot be edited or deleted. You can also create custom targets via the settings page.

Configuration

Navigate to Settings > Network Probes to configure:

  • Target Management -- View all 96 preset targets (with lock icon and ISP label) and manage custom targets (create/edit/delete)
  • Global Settings -- Probe interval (30-600 seconds, default 60), packets per round (5-20, default 10), and default targets auto-assigned to new servers
  • Per-Server Targets -- Assign up to 20 probe targets per server via the network detail page's "Manage Targets" dialog

Network Overview Page

The /network page shows a card for each server with:

  • Target count and average latency
  • Availability percentage
  • Anomaly count with severity indicators

Network Detail Page

Click a server card to see /network/:serverId with:

  • Time range selector -- Real-time, 1h, 6h, 24h, 7d, 30d
  • Target cards -- Per-target latency and packet loss, with toggle visibility
  • Multi-line latency chart -- One colored line per target, tooltips with timestamps
  • Anomaly table -- High latency, high packet loss, and unreachable events
  • Statistics bar -- Average latency, availability percentage, target count
  • CSV export -- Download probe data for the selected time range

Data Retention

Network probe records follow the same two-tier storage as system metrics:

  • Raw records -- Retained for 7 days (configurable via retention.network_probe_days)
  • Hourly aggregates -- Retained for 90 days (configurable via retention.network_probe_hourly_days)

Alert Integration

Two alert rule types are available for network quality:

  • network_latency -- Triggers when average latency exceeds a threshold
  • network_packet_loss -- Triggers when packet loss exceeds a threshold
