Profiling
10 instruments · Real-time model health · Latency · Memory management · Drift tracking
Live Memory Monitor
All Instruments
Memory Status
Real-time system and GPU memory with tiered alerts
No configuration required — ready to run.
Latency Benchmark
Time-to-first-token and tokens-per-second measurement
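The two metrics named above can be sketched as follows. This is a minimal illustration, not the instrument's actual implementation: `measure_stream` is a hypothetical helper, and it assumes the model exposes its output as an iterable that yields one token at a time. The `clock` parameter is injectable only so the example can be checked deterministically.

```python
import time
from typing import Callable, Iterable, Tuple


def measure_stream(
    tokens: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[float, float, int]:
    """Return (ttft_seconds, tokens_per_second, token_count) for one stream.

    Hypothetical helper: assumes `tokens` yields one token per iteration.
    """
    start = clock()
    first = None
    last = None
    count = 0
    for _ in tokens:
        now = clock()
        if first is None:
            first = now  # arrival time of the first token
        last = now
        count += 1
    if first is None:
        raise ValueError("stream produced no tokens")
    ttft = first - start
    # Decode rate is measured from the first token to the last one,
    # so the time spent waiting for the first token does not dilute it.
    decode_time = last - first
    tps = (count - 1) / decode_time if decode_time > 0 else 0.0
    return ttft, tps, count
```

Separating time-to-first-token from the decode rate keeps prompt-processing cost and per-token generation cost visible as two distinct numbers.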
Memory Pre-Flight
Checks if a model will fit in available memory before loading
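A pre-flight check of this kind can be approximated with simple arithmetic. The sketch below is an assumption about how such a check might work, not the instrument's own logic: `fits_in_memory` is a hypothetical function, the caller supplies the available byte count, and the `overhead` multiplier (for activations, KV cache, and runtime buffers) is an illustrative guess.

```python
import math
from typing import Tuple


def fits_in_memory(
    param_count: int,
    bytes_per_param: float,
    available_bytes: int,
    overhead: float = 1.2,  # assumed headroom factor, not a measured value
) -> Tuple[bool, int]:
    """Return (fits, required_bytes) for a model of the given size.

    required = parameters * bytes per parameter * overhead factor
    """
    required = math.ceil(param_count * bytes_per_param * overhead)
    return required <= available_bytes, required
```

For example, a 7-billion-parameter model stored at 2 bytes per parameter needs about 14 GB for weights alone, so it would pass this check on a 24 GiB device and fail it on an 8 GiB one.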
Output Drift Detector
Detects semantic drift in model outputs over time
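One common way to quantify semantic drift is to compare embeddings of recent outputs against a baseline set. The sketch below assumes that approach; the page does not say which method the detector uses. It also assumes the embeddings are already computed elsewhere and passed in as plain lists of floats. All function names are hypothetical.

```python
import math
from typing import List, Sequence


def mean_vector(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; 0 means same direction, 1 means orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("zero-length embedding")
    return 1.0 - dot / norm


def drift_score(
    baseline: Sequence[Sequence[float]],
    recent: Sequence[Sequence[float]],
) -> float:
    """Cosine distance between the baseline centroid and the recent centroid."""
    return cosine_distance(mean_vector(baseline), mean_vector(recent))
```

A score near 0 suggests recent outputs sit close to the baseline in embedding space; what threshold counts as drift would need to be calibrated per model.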
Throughput Monitor
Sustained tokens-per-second under continuous load
Cache Pressure Analysis
KV cache eviction rates and memory pressure under load
Thermal Profile
CPU/GPU temperature and throttling detection during inference
Model Health Score
Composite health score from latency, drift, and memory metrics
No configuration required — ready to run.
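A composite score of this kind is often a weighted average of normalized sub-scores. The sketch below assumes that form; the actual formula and weights are not given on this page, so `health_score` and its default weights are purely illustrative. Each input is assumed to be pre-normalized to the range 0 to 1, where 1 is healthy.

```python
from typing import Sequence


def health_score(
    latency: float,
    drift: float,
    memory: float,
    weights: Sequence[float] = (0.4, 0.3, 0.3),  # hypothetical weights
) -> float:
    """Weighted average of three sub-scores, scaled to 0-100."""
    scores = (latency, drift, memory)
    for s in scores:
        if not 0.0 <= s <= 1.0:
            raise ValueError("sub-scores must be normalized to [0, 1]")
    total_weight = sum(weights)
    weighted = sum(w * s for w, s in zip(weights, scores))
    return 100.0 * weighted / total_weight
```

Dividing by the sum of the weights means the weights do not need to add up to 1, which makes them easier to tune.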
Batch Size Optimizer
Finds max batch size that fits in memory without OOM
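Finding the largest batch size that fits is a natural binary search, assuming that if a batch of size n fits then every smaller batch fits too. The sketch below works under that assumption. `max_batch_size` and the `fits` callback are hypothetical; in practice `fits` would attempt a forward pass at the given size and report whether it completed without running out of memory. The default bounds are illustrative.

```python
from typing import Callable


def max_batch_size(fits: Callable[[int], bool], low: int = 1, high: int = 1024) -> int:
    """Largest batch size in [low, high] for which fits() is True, or 0 if none.

    Assumes fits() is monotonic: once a size fails, all larger sizes fail.
    """
    if not fits(low):
        return 0
    # Invariant: fits(low) is True; anything above high is known or assumed to fail.
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always makes progress
        if fits(mid):
            low = mid
        else:
            high = mid - 1
    return low
```

Binary search needs only about log2(high) trial runs, which matters when each trial is a real forward pass.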
Inference Profiler
Layer-by-layer time breakdown for a single forward pass
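A layer-by-layer breakdown can be sketched by timing each stage of a forward pass in sequence. The example below is framework-agnostic and illustrative only: it assumes the model can be expressed as an ordered list of named callables, and `profile_layers` is a hypothetical helper. On a GPU, asynchronous execution would also require synchronizing the device before each clock read, which this sketch does not do.

```python
import time
from typing import Any, Callable, List, Sequence, Tuple


def profile_layers(
    layers: Sequence[Tuple[str, Callable[[Any], Any]]],
    x: Any,
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[Any, List[Tuple[str, float]]]:
    """Run x through each layer in order and record per-layer wall time."""
    timings: List[Tuple[str, float]] = []
    for name, layer in layers:
        t0 = clock()
        x = layer(x)  # output of one layer feeds the next
        timings.append((name, clock() - t0))
    return x, timings
```

Sorting the returned timings by duration gives a quick view of which layers dominate a single forward pass.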