2. Metadata Collection
MemSysExplorer includes built-in support for collecting system-level metadata to accompany memory profiling outputs. This metadata provides detailed context about the hardware and software environment in which the profiling was performed, ensuring accurate interpretation and reproducibility of results.
Note
The BaseMetadata class assumes a Linux-based system. On other platforms, metadata fields may be incomplete or missing unless the class is adapted to that platform.
The BaseMetadata implementation can be found in the repository here: BaseMetadata.py
In the future, we will provide community metadata profiles collected from different systems to help users compare workload behaviors across architectures.
Important
Every profiler must integrate `BaseMetadata` or one of its subclasses. Metadata collection is essential to ensure that workload traces are reproducible and properly contextualized by their execution environment.
2.1 Collected Metadata Includes
GPU Information - Device name, driver version, and available GPU memory (nvidia-smi)
CPU Information - Full lscpu dump, including architecture, core/thread counts, CPU family/model, etc.
Cache Hierarchy - Sizes of L1 instruction/data, L2, and L3 caches from /sys/devices/system/cpu/cpu0/cache
Main Memory (DRAM) - Total physical memory size in megabytes (/proc/meminfo)
Software Environment:
  - Operating system name and version
  - Installed compiler versions (e.g., GCC, Clang, AOCC)
  - BIOS and firmware information (dmidecode)
  - Filesystem type
  - Power policy and CPU governor
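As an illustration of how one of these values can be gathered, the sketch below reads total DRAM from /proc/meminfo and converts it to megabytes. It is not the BaseMetadata implementation; the function names `parse_mem_total_mb` and `read_dram_mb` are hypothetical and chosen only for this example.

```python
import re
from pathlib import Path
from typing import Optional


def parse_mem_total_mb(meminfo_text: str) -> Optional[int]:
    """Return MemTotal in megabytes from /proc/meminfo-style text, or None."""
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        return None
    # /proc/meminfo labels values "kB"; the kernel counts them in units of 1024 bytes.
    return int(match.group(1)) // 1024


def read_dram_mb(path: str = "/proc/meminfo") -> Optional[int]:
    """Read total DRAM in MB; returns None when the file is unavailable."""
    try:
        return parse_mem_total_mb(Path(path).read_text())
    except OSError:
        return None


if __name__ == "__main__":
    print("Total DRAM (MB):", read_dram_mb())
```

Returning `None` instead of raising keeps the collector usable on non-Linux systems, where this file does not exist.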
2.2 Class Structure
The BaseMetadata class implements the following key methods:
gpu_info() – Extracts GPU specifications
cpu_info() – Parses CPU attributes from lscpu
cache_info() – Returns cache sizes per level
dram_info() – Measures DRAM size from system memory
software_info() – Reports OS, kernel, compilers, BIOS, and policy info
as_dict() – Converts all metadata to a single dictionary object
__repr__() – Provides a human-readable summary string
2.3 Integration
Each profiler in MemSysExplorer (e.g., dynamorio, perf, sniper, nvbit, ncu) either inherits from the BaseMetadata class or integrates its output into the profiler's own reporting structure.
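The inheritance route might look roughly like the sketch below. The `BaseMetadata` class here is a simplified stand-in, not the real one, and `ExampleProfilerMetadata`, `build_report`, and the dictionary keys `os`, `kernel`, and `profiler_version` are hypothetical names used only for illustration.

```python
import json
import platform


class BaseMetadata:
    """Simplified stand-in for MemSysExplorer's BaseMetadata (hypothetical)."""

    def software_info(self) -> dict:
        # Hypothetical keys; the real class reports more fields.
        return {"os": platform.system(), "kernel": platform.release()}

    def as_dict(self) -> dict:
        return {"software_info": self.software_info()}


class ExampleProfilerMetadata(BaseMetadata):
    """Hypothetical profiler subclass that adds one profiler-specific field."""

    def __init__(self, profiler_version: str):
        self.profiler_version = profiler_version

    def as_dict(self) -> dict:
        data = super().as_dict()
        data["profiler_version"] = self.profiler_version  # hypothetical key
        return data


def build_report(metadata: BaseMetadata, stats: dict) -> str:
    """Combine metadata and profiling statistics into one JSON document."""
    return json.dumps({"metadata": metadata.as_dict(), "stats": stats}, indent=2)


if __name__ == "__main__":
    print(build_report(ExampleProfilerMetadata("1.0"), {"reads": 3}))
```

Extending `as_dict()` through `super()` keeps the common fields identical across profilers while still allowing each one to add its own.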
Important
The use of BaseMetadata is mandatory across all profilers to ensure a unified and reproducible profiling environment. Reproducibility across profiling runs requires consistent capture of both hardware and software environment metadata.
To support consistent experimentation and collaboration, MemSysExplorer will include a community-contributed database of workload metadata and profiler outputs. This shared repository will facilitate reproducible research, cross-platform comparisons, and collaborative benchmarking across research groups.
2.4 Metadata Structure Reference
The memsysmetadata_<profiler>.json file contains system and environment information captured during profiling. Below is a detailed reference for each field.
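A metadata file can be inspected with a few lines of Python. The sketch below assumes that `cpu_info`, `cpu_cache`, and `software_info` (the section names used in this reference) appear as top-level keys holding nested objects; that layout is an assumption, and `load_metadata` and `missing_sections` are hypothetical helper names.

```python
import json

# Section names taken from the headings in this reference; treating them
# as top-level JSON keys is an assumption.
EXPECTED_SECTIONS = ("cpu_info", "cpu_cache", "software_info")


def load_metadata(path: str) -> dict:
    """Load a memsysmetadata_<profiler>.json file into a dictionary."""
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)


def missing_sections(metadata: dict) -> list:
    """List expected sections that are absent or empty."""
    return [name for name in EXPECTED_SECTIONS if not metadata.get(name)]
```

A non-empty result from `missing_sections` is a quick hint that the run happened on a platform where some collectors could not gather data.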
2.4.1 System Information
| Field | Description |
|---|---|
|  | Name of detected GPU (e.g., “NVIDIA GeForce RTX 3080”) |
|  | GPU driver version string |
|  | Total GPU memory in megabytes |
|  | Total system DRAM in megabytes |
2.4.2 CPU Information (cpu_info)
| Field | Description |
|---|---|
|  | CPU architecture (e.g., “x86_64”, “aarch64”) |
|  | Supported operation modes (e.g., “32-bit, 64-bit”) |
|  | Physical and virtual address bit widths |
|  | Full CPU model string (e.g., “Intel(R) Core(TM) i9-12900K”) |
|  | Total number of logical CPUs |
|  | Number of threads per physical core |
|  | Number of cores per CPU socket |
|  | Number of CPU sockets |
|  | Current CPU frequency in MHz |
|  | Maximum CPU frequency |
|  | L1 data cache size (from lscpu) |
|  | L1 instruction cache size |
|  | L2 cache size |
|  | L3 cache size |
2.4.3 Cache Hierarchy (cpu_cache)
| Field | Description |
|---|---|
|  | L1 data cache size in bytes (from /sys/devices) |
|  | L1 instruction cache size in bytes |
|  | L2 cache size in bytes |
|  | L3 cache size in bytes |
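On Linux, each `index*` directory under /sys/devices/system/cpu/cpu0/cache exposes `level`, `type`, and `size` files, with sizes written as strings such as `32K`. The sketch below shows one way to turn those into byte counts; it is an illustration rather than the BaseMetadata code, and the names `size_to_bytes` and `read_cache_sizes` are hypothetical.

```python
from pathlib import Path

_UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def size_to_bytes(text: str) -> int:
    """Convert a sysfs size string such as '32K' or '8M' to bytes."""
    text = text.strip().upper()
    if text and text[-1] in _UNITS:
        return int(text[:-1]) * _UNITS[text[-1]]
    return int(text)


def read_cache_sizes(base: str = "/sys/devices/system/cpu/cpu0/cache") -> dict:
    """Map (level, type) to size in bytes for each cache index directory."""
    root = Path(base)
    if not root.is_dir():
        return {}  # e.g. non-Linux systems
    sizes = {}
    for index_dir in sorted(root.glob("index*")):
        try:
            level = int((index_dir / "level").read_text().strip())
            kind = (index_dir / "type").read_text().strip()
            size = size_to_bytes((index_dir / "size").read_text())
        except (OSError, ValueError):
            continue  # skip entries that are unreadable or oddly formatted
        sizes[(level, kind)] = size
    return sizes


if __name__ == "__main__":
    for (level, kind), size in sorted(read_cache_sizes().items()):
        print(f"L{level} {kind}: {size} bytes")
```

Keying on both level and type distinguishes the L1 data and L1 instruction caches, which share the same level number.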
2.4.4 Software Information (software_info)
| Field | Description |
|---|---|
|  | Operating system name and version |
|  | Kernel version string |
|  | GCC compiler version |
|  | Clang compiler version (if installed) |
|  | Filesystem type of the working directory |
|  | BIOS/firmware version (requires sudo) |
|  | Current CPU power policy |
|  | CPU frequency governor setting |
2.4.5 Profiler-Specific Fields
| Field | Description |
|---|---|
|  | DynamoRIO version (for dynamorio profiler) |
|  | CUDA version (for nvbit/ncu profilers) |
|  | Sniper version (for sniper profiler) |
|  | Perf version (for perf profiler) |
2.5 PatternConfig Structure Reference
The memsyspatternconfig_<workload>.json file contains aggregated memory statistics from the profiling run. Below is a detailed reference for each field.
2.5.1 Experiment Information
| Field | Description |
|---|---|
|  | Experiment/profiler name (e.g., “dynamorio”, “sniper”) |
|  | Name of the profiled workload or benchmark |
2.5.2 Read/Write Statistics
| Field | Description |
|---|---|
|  | Read frequency (operations per second or ratio) |
|  | Total read operations count |
|  | Write frequency (operations per second or ratio) |
|  | Total write operations count |
|  | Average read size in bytes |
|  | Average write size in bytes |
2.5.3 Detailed Counters
| Field | Description |
|---|---|
|  | Instruction write count |
|  | Data write count |
|  | Data read count |
|  | Instruction read count |
2.5.4 Memory Footprint
| Field | Description |
|---|---|
|  | Working set size in bytes (unique memory addresses accessed) |
|  | Approximate WSS using HyperLogLog (if available) |
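To give a sense of how a HyperLogLog-based estimate relates to an exact count of unique addresses, here is a minimal, untuned HyperLogLog sketch. It is a generic textbook-style implementation, not the one used by MemSysExplorer; the class name, the SHA-1 hash choice, and the precision of 12 bits are all assumptions made for this example. It estimates the number of unique addresses only; converting that to bytes depends on the access granularity, which is not specified here.

```python
import hashlib
import math


class HyperLogLog:
    """Minimal HyperLogLog cardinality estimator (illustrative, not tuned)."""

    def __init__(self, precision: int = 12):
        self.p = precision
        self.m = 1 << precision                      # number of registers
        self.registers = [0] * self.m
        self.alpha = 0.7213 / (1 + 1.079 / self.m)   # bias constant, m >= 128

    def add(self, address: int) -> None:
        digest = hashlib.sha1(address.to_bytes(8, "little")).digest()
        h = int.from_bytes(digest[:8], "big")         # 64-bit hash value
        index = h >> (64 - self.p)                    # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)         # remaining 64 - p bits
        rank = (64 - self.p) - rest.bit_length() + 1  # leading zeros + 1
        self.registers[index] = max(self.registers[index], rank)

    def estimate(self) -> float:
        raw = self.alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            return self.m * math.log(self.m / zeros)
        return raw


if __name__ == "__main__":
    addresses = [base * 64 for base in range(20000)]  # synthetic address trace
    hll = HyperLogLog()
    for addr in addresses:
        hll.add(addr)
    print("exact unique addresses:", len(set(addresses)))
    print("HyperLogLog estimate:  ", round(hll.estimate()))
```

The exact count needs memory proportional to the number of unique addresses, while the estimator uses a fixed 4096 registers at this precision, which is why an approximate value is attractive for long traces.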
2.5.5 Units Reference
The unit dictionary specifies the measurement unit for each metric field:
| Field | Unit |
|---|---|
|  | “ops/sec” or “ratio” |
|  | “count” |
|  | “bytes” |
|  | “bytes” |
2.6 Profiler-Specific Output Differences
Different profilers capture different subsets of metrics based on their instrumentation capabilities. The table below shows which fields are populated by each profiler:
| Field | DynamoRIO | Perf | Sniper | NVBit | NCU |
|---|---|---|---|---|---|
|  | Yes | Yes | Yes | Yes | Yes |
|  | Yes | Yes | Yes | Yes | Yes |
|  | Yes | No | Yes | Yes | No |
|  | Yes | No | No | Yes | No |
|  | No | Yes | Yes | No | No |
|  | No | Yes | Yes | No | No |
|  | No | Yes | Yes | No | Yes |
|  | No | Yes | Yes | No | Yes |
|  | No | Partial | Yes | No | No |
|  | No | No | Yes | No | Yes |
|  | No | Yes | Yes | No | Yes |
|  | No | No | No | No | Yes |
Legend:
Yes: Field is fully supported and populated
No: Field is not available with this profiler
Partial: Field is available on some hardware/kernel versions