Version: v2508

Metrics Details

This page lists and describes all metrics collected by AIBooster. The panel library provides 101 panels organized into the following categories.

A unit of "-" denotes a dimensionless value.

GPU Metrics (DCGM)

Basic GPU Information (11 panels)

| Metric Name | Panel Name | Description | Unit |
|---|---|---|---|
| DCGM_FI_DEV_GPU_UTIL | GPU Utilization | GPU utilization percentage | % |
| DCGM_FI_DEV_GPU_TEMP | GPU Temperature | GPU temperature | °C |
| DCGM_FI_DEV_POWER_USAGE | GPU Power Usage | GPU power consumption | Watt |
| DCGM_FI_DEV_FB_USED | GPU Memory Used | Framebuffer memory used | Bytes |
| DCGM_FI_DEV_FB_FREE | GPU Memory Free | Framebuffer memory free | Bytes |
| DCGM_FI_DEV_MEM_CLOCK | GPU Memory Clock | Memory clock frequency | MHz |
| DCGM_FI_DEV_SM_CLOCK | GPU SM Clock | Streaming multiprocessor clock frequency | MHz |
| DCGM_FI_DEV_MEMORY_TEMP | GPU Memory Temperature | GPU memory temperature | °C |
| DCGM_FI_DEV_MEM_COPY_UTIL | GPU Memory Copy Utilization | GPU memory copy utilization | % |
| DCGM_FI_DEV_ROW_REMAP_FAILURE | GPU Row Remap Failure | GPU memory row remap failure count | - |
| DCGM_FI_DEV_VGPU_LICENSE_STATUS | vGPU License Status | vGPU license status | - |
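
The two framebuffer metrics above are often combined into a single utilization figure. The following is a minimal sketch of that calculation, not part of AIBooster itself: the function name `fb_utilization_percent` and the sample values are hypothetical, and the ratio assumes that used plus free memory is a reasonable stand-in for total memory (any framebuffer memory that is reported as neither used nor free is ignored).

```python
def fb_utilization_percent(fb_used: float, fb_free: float) -> float:
    """Return used framebuffer memory as a percentage of used + free.

    Both arguments must be in the same unit, e.g. samples of
    DCGM_FI_DEV_FB_USED and DCGM_FI_DEV_FB_FREE taken at the same time.
    """
    total = fb_used + fb_free
    if total <= 0:
        # No memory reported; avoid dividing by zero.
        return 0.0
    return 100.0 * fb_used / total


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    print(fb_utilization_percent(fb_used=60.0, fb_free=20.0))
```

Because the result is a ratio, it does not depend on the unit in which the two samples are reported, as long as both use the same one.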

Profiling Information (6 panels)

| Metric Name | Panel Name | Description | Unit |
|---|---|---|---|
| DCGM_FI_PROF_DRAM_ACTIVE | DRAM Active | DRAM utilization | % |
| DCGM_FI_PROF_PCIE_RX_BYTES | PCIe RX Bytes | PCIe receive bytes | Bytes/sec |
| DCGM_FI_PROF_PCIE_TX_BYTES | PCIe TX Bytes | PCIe transmit bytes | Bytes/sec |
| DCGM_FI_PROF_SM_ACTIVE | SM Active | Streaming multiprocessor utilization | % |
| DCGM_FI_PROF_SM_OCCUPANCY | SM Occupancy | Streaming multiprocessor occupancy | % |
| DCGM_FI_PROF_PIPE_TENSOR_ACTIVE | Tensor Core Active | Tensor core utilization | % |

System Metrics (Node Exporter)

CPU & Load (7 panels)

| Metric Name | Panel Name | Description | Unit |
|---|---|---|---|
| node_load1 | Load Average 1m | 1-minute system load average | - |
| node_load5 | Load Average 5m | 5-minute system load average | - |
| node_load15 | Load Average 15m | 15-minute system load average | - |
| node_cpu_frequency_max_hertz | CPU Frequency Max | Maximum CPU frequency | Hz |
| node_cpu_frequency_min_hertz | CPU Frequency Min | Minimum CPU frequency | Hz |
| node_cpu_scaling_frequency_hertz | CPU Scaling Frequency | Current CPU operating frequency | Hz |
| node_cpu_scaling_governor | CPU Governor | CPU governor setting status | - |

Memory (9 panels)

| Metric Name | Panel Name | Description | Unit |
|---|---|---|---|
| node_memory_MemTotal_bytes | Node Memory Total | Total memory capacity | Bytes |
| node_memory_MemAvailable_bytes | Node Memory Available | Available memory capacity | Bytes |
| node_memory_MemFree_bytes | Node Memory Free | Free memory capacity | Bytes |
| node_memory_Active_bytes | Node Memory Active | Active memory usage | Bytes |
| node_memory_Inactive_bytes | Node Memory Inactive | Inactive memory usage | Bytes |
| node_memory_Cached_bytes | Node Memory Cached | Cache memory usage | Bytes |
| node_memory_Buffers_bytes | Node Memory Buffers | Buffer memory usage | Bytes |
| node_memory_SwapTotal_bytes | Node Swap Total | Total swap capacity | Bytes |
| node_memory_SwapFree_bytes | Node Swap Free | Free swap capacity | Bytes |
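
The memory metrics above report capacities rather than usage, so a usage figure has to be derived. The sketch below shows one common way to do that, assuming that "used" means total minus available (which treats reclaimable cache as not used). The function names and sample values are hypothetical and are not part of AIBooster.

```python
def memory_used_percent(mem_total_bytes: float, mem_available_bytes: float) -> float:
    """Percentage of memory in use, from node_memory_MemTotal_bytes and
    node_memory_MemAvailable_bytes sampled at the same time."""
    if mem_total_bytes <= 0:
        return 0.0
    return 100.0 * (mem_total_bytes - mem_available_bytes) / mem_total_bytes


def swap_used_bytes(swap_total_bytes: float, swap_free_bytes: float) -> float:
    """Swap in use, from node_memory_SwapTotal_bytes and node_memory_SwapFree_bytes."""
    return swap_total_bytes - swap_free_bytes


if __name__ == "__main__":
    gib = 2 ** 30
    # Hypothetical node with 16 GiB of RAM, 4 GiB of which is available.
    print(memory_used_percent(16 * gib, 4 * gib))
    print(swap_used_bytes(8 * gib, 6 * gib) / gib)
```

Using MemAvailable rather than MemFree is a design choice: MemFree excludes page cache that the kernel could reclaim, so it tends to overstate memory pressure.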

Filesystem (5 panels)

| Metric Name | Panel Name | Description | Unit |
|---|---|---|---|
| node_filesystem_size_bytes | Filesystem Size | Total filesystem capacity | Bytes |
| node_filesystem_avail_bytes | Filesystem Available | Available filesystem capacity | Bytes |
| node_filesystem_free_bytes | Filesystem Free | Free filesystem capacity | Bytes |
| node_filesystem_files | Filesystem Files Total | Total inode count | - |
| node_filesystem_files_free | Filesystem Files Free | Free inode count | - |
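
A filesystem can fill up by running out of either bytes or inodes, so both are worth deriving from the metrics above. The following sketch is illustrative only: the function names and sample values are hypothetical, and it assumes usage is measured against `node_filesystem_avail_bytes` (space available to unprivileged users) rather than `node_filesystem_free_bytes` (which also counts reserved blocks).

```python
def filesystem_used_percent(size_bytes: float, avail_bytes: float) -> float:
    """Percentage of capacity not available to unprivileged users."""
    if size_bytes <= 0:
        return 0.0
    return 100.0 * (size_bytes - avail_bytes) / size_bytes


def inode_used_percent(files: float, files_free: float) -> float:
    """Percentage of inodes in use, from node_filesystem_files and
    node_filesystem_files_free."""
    if files <= 0:
        # Some filesystems report zero inodes; treat that as "not applicable".
        return 0.0
    return 100.0 * (files - files_free) / files


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    print(filesystem_used_percent(size_bytes=1000, avail_bytes=250))
    print(inode_used_percent(files=200, files_free=50))
```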

Network (4 panels)

| Metric Name | Panel Name | Description | Unit |
|---|---|---|---|
| node_network_info | Network Info | Network interface information | - |
| node_network_up | Network Up | Network interface status | - |
| node_network_speed_bytes | Network Speed | Network speed | Bytes/sec |
| node_network_mtu_bytes | Network MTU | Maximum Transmission Unit | Bytes |

Processes (2 panels)

| Metric Name | Panel Name | Description | Unit |
|---|---|---|---|
| node_procs_running | Processes Running | Running process count | - |
| node_procs_blocked | Processes Blocked | Blocked process count | - |

File Descriptors (3 panels)

| Metric Name | Panel Name | Description | Unit |
|---|---|---|---|
| node_filefd_allocated | File Descriptors Allocated | Allocated file descriptor count | - |
| node_filefd_maximum | File Descriptors Maximum | Maximum file descriptor count | - |
| node_arp_entries | ARP Entries | ARP table entry count | - |

System Boot Time (1 panel)

| Metric Name | Panel Name | Description | Unit |
|---|---|---|---|
| node_boot_time_seconds | Boot Time | System boot time (Unix timestamp) | Seconds |
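
Because `node_boot_time_seconds` is a point in time rather than a duration, uptime has to be computed by subtracting it from the current time. A minimal sketch follows; the function name `uptime_seconds` and the sample timestamps are hypothetical.

```python
import time
from typing import Optional


def uptime_seconds(boot_time_seconds: float, now: Optional[float] = None) -> float:
    """Return seconds elapsed since boot.

    boot_time_seconds is a sample of node_boot_time_seconds (Unix timestamp).
    now defaults to the local clock; pass it explicitly for reproducible results.
    """
    if now is None:
        now = time.time()
    # Clamp at zero in case of clock skew between the node and the caller.
    return max(0.0, now - boot_time_seconds)


if __name__ == "__main__":
    # Hypothetical timestamps one hour apart.
    print(uptime_seconds(1_700_000_000.0, now=1_700_003_600.0))
```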

ARC Cache (10 panels)

| Metric Name | Panel Name | Description | Unit |
|---|---|---|---|
| node_zfs_arc_size | ZFS ARC Size | Current ARC cache size | Bytes |
| node_zfs_arc_c | ZFS ARC C | ARC target size | Bytes |
| node_zfs_arc_c_max | ZFS ARC C Max | ARC maximum size | Bytes |
| node_zfs_arc_c_min | ZFS ARC C Min | ARC minimum size | Bytes |
| node_zfs_arc_hits | ZFS ARC Hits | ARC cache hit count | - |
| node_zfs_arc_misses | ZFS ARC Misses | ARC cache miss count | - |
| node_zfs_arc_mfu_hits | ZFS ARC MFU Hits | Most Frequently Used hit count | - |
| node_zfs_arc_mru_hits | ZFS ARC MRU Hits | Most Recently Used hit count | - |
| node_zfs_arc_demand_data_hits | ZFS ARC Demand Data Hits | Demand data hit count | - |
| node_zfs_arc_demand_data_misses | ZFS ARC Demand Data Misses | Demand data miss count | - |
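
The hit and miss metrics above are counts, and a hit ratio is usually more informative than either count alone. The sketch below assumes the counts are cumulative since boot, so it computes the ratio over the difference between two samples. The helper names `counter_delta` and `arc_hit_ratio` and the sample values are hypothetical.

```python
from typing import Optional


def counter_delta(previous: float, current: float) -> float:
    """Increase of a cumulative counter between two samples.

    If the counter went backwards, assume it was reset (e.g. after a reboot)
    and count from zero.
    """
    if current < previous:
        return current
    return current - previous


def arc_hit_ratio(hits_delta: float, misses_delta: float) -> Optional[float]:
    """Fraction of ARC lookups that were hits, or None if there were no lookups."""
    total = hits_delta + misses_delta
    if total == 0:
        return None
    return hits_delta / total


if __name__ == "__main__":
    # Hypothetical samples of node_zfs_arc_hits and node_zfs_arc_misses.
    hits = counter_delta(previous=1000, current=1900)
    misses = counter_delta(previous=200, current=300)
    print(arc_hit_ratio(hits, misses))
```

The same pattern applies to the demand data hit and miss counts.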

ZFS Pool (6 panels)

| Metric Name | Panel Name | Description | Unit |
|---|---|---|---|
| node_zfs_zpool_state | ZFS Pool State | ZFS pool status | - |
| node_zfs_zpool_dataset_nread | ZFS Dataset Reads | Bytes read from the dataset | Bytes |
| node_zfs_zpool_dataset_nwritten | ZFS Dataset Writes | Bytes written to the dataset | Bytes |
| node_zfs_zpool_dataset_reads | ZFS Dataset Read Bytes | Dataset read operation count | - |
| node_zfs_zpool_dataset_writes | ZFS Dataset Write Bytes | Dataset write operation count | - |
| node_zfs_zpool_dataset_nunlinks | ZFS Dataset Unlinks | Dataset unlink count | - |

Process & Application Metrics

Go Applications (18 panels)

Basic Information (4 panels)

| Metric Name | Panel Name | Description | Unit |
|---|---|---|---|
| go_info | Go Info | Go language version information | - |
| go_goroutines | Go Goroutines | Running goroutine count | - |
| go_threads | Go Threads | Thread count | - |
| go_sched_gomaxprocs_threads | Go MAXPROCS | GOMAXPROCS setting value | - |

Garbage Collection (2 panels)

| Metric Name | Panel Name | Description | Unit |
|---|---|---|---|
| go_gc_gogc_percent | Go GC Percent | GOGC setting value | % |
| go_gc_gomemlimit_bytes | Go Memory Limit | GOMEMLIMIT setting value | Bytes |

Memory Statistics (12 panels)

| Metric Name | Panel Name | Description | Unit |
|---|---|---|---|
| go_memstats_alloc_bytes | Go Memory Allocated | Allocated memory | Bytes |
| go_memstats_sys_bytes | Go Memory System | System allocated memory | Bytes |
| go_memstats_heap_alloc_bytes | Go Heap Allocated | Heap allocated memory | Bytes |
| go_memstats_heap_sys_bytes | Go Heap System | Heap system memory | Bytes |
| go_memstats_heap_idle_bytes | Go Heap Idle | Heap idle memory | Bytes |
| go_memstats_heap_inuse_bytes | Go Heap In Use | Heap in-use memory | Bytes |
| go_memstats_heap_released_bytes | Go Heap Released | Heap released memory | Bytes |
| go_memstats_heap_objects | Go Heap Objects | Heap object count | - |
| go_memstats_stack_inuse_bytes | Go Stack In Use | Stack in-use memory | Bytes |
| go_memstats_stack_sys_bytes | Go Stack System | Stack system memory | Bytes |
| go_memstats_mspan_inuse_bytes | Go MSpan In Use | MSpan in-use memory | Bytes |
| go_memstats_mspan_sys_bytes | Go MSpan System | MSpan system memory | Bytes |

Process Information (6 panels)

| Metric Name | Panel Name | Description | Unit |
|---|---|---|---|
| process_resident_memory_bytes | Process Resident Memory | Process physical memory usage | Bytes |
| process_virtual_memory_bytes | Process Virtual Memory | Process virtual memory usage | Bytes |
| process_virtual_memory_max_bytes | Process Virtual Memory Max | Process maximum virtual memory | Bytes |
| process_open_fds | Process Open FDs | Process open file descriptor count | - |
| process_max_fds | Process Max FDs | Process maximum file descriptor count | - |
| process_start_time_seconds | Process Start Time | Process start time (Unix timestamp) | Seconds |
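
The open and maximum file descriptor counts above are most useful when read together, since a process that approaches its limit will start failing to open files or sockets. The sketch below derives a utilization ratio and a simple warning check. The function names and the 0.8 threshold are hypothetical choices for illustration, not AIBooster defaults.

```python
from typing import Optional


def fd_utilization(open_fds: float, max_fds: float) -> Optional[float]:
    """Fraction of the file descriptor limit in use, from process_open_fds
    and process_max_fds. Returns None if no limit is reported."""
    if max_fds <= 0:
        return None
    return open_fds / max_fds


def is_near_fd_limit(open_fds: float, max_fds: float, threshold: float = 0.8) -> bool:
    """True if utilization is at or above the (hypothetical) threshold."""
    ratio = fd_utilization(open_fds, max_fds)
    return ratio is not None and ratio >= threshold


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    print(fd_utilization(256, 1024))
    print(is_near_fd_limit(900, 1024))
```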

Scraping Statistics (5 panels)

| Metric Name | Panel Name | Description | Unit |
|---|---|---|---|
| scrape_duration_seconds | Scrape Duration | Metric collection time | Seconds |
| scrape_samples_scraped | Scrape Samples Scraped | Collected sample count | - |
| scrape_samples_post_metric_relabeling | Scrape Samples Post Relabeling | Post-relabeling sample count | - |
| scrape_series_added | Scrape Series Added | Added time series count | - |
| up | Scrape Up | Scrape success status (1 if the scrape succeeded, 0 otherwise) | - |

Other System Metrics

Memory Cache (6 panels)

| Metric Name | Panel Name | Description | Unit |
|---|---|---|---|
| go_memstats_mcache_inuse_bytes | Go MCache In Use | MCache in-use memory | Bytes |
| go_memstats_mcache_sys_bytes | Go MCache System | MCache system memory | Bytes |
| go_memstats_gc_sys_bytes | Go GC System | GC system memory | Bytes |
| go_memstats_other_sys_bytes | Go Other System | Other system memory | Bytes |
| go_memstats_buck_hash_sys_bytes | Go Bucket Hash System | Bucket hash system memory | Bytes |
| go_memstats_next_gc_bytes | Go Next GC | Next GC threshold | Bytes |

GC Statistics (1 panel)

| Metric Name | Panel Name | Description | Unit |
|---|---|---|---|
| go_memstats_last_gc_time_seconds | Go Last GC Time | Time of the last garbage collection (Unix timestamp) | Seconds |

HTTP Statistics (1 panel)

| Metric Name | Panel Name | Description | Unit |
|---|---|---|---|
| promhttp_metric_handler_requests_in_flight | HTTP Requests In Flight | In-flight HTTP request count | - |