Perf Documentation

Perfs leverage Hardware Performance Counters (HPC) to collect low-level hardware metrics from CPUs. These profiler provide valuable insights into memory access patterns, kernel performance, and overall system efficiency.

The profiling workflow in MemSysExplorer consists of two core actions, as provided by the main interface:

  1. Profiling (`profiling`) – Captures runtime execution metrics by specifying the required executable.

  2. Metric Extraction (`extract_metrics`) – Analyzes generated reports to extract memory and performance-related metrics.

When using the both action, profiling and metric extraction are performed sequentially.

Important

MemSysExplorer GitHub Repository

Refer to the codebase for the latest update: https://github.com/duca181/MemSysExplorer/tree/apps_dev/apps/profilers/perf

To learn more about license terms and third-party attribution, refer to the 6. Licensing and Attribution page.

Required Arguments

To execute Nsight Profilers, specific arguments are required based on the chosen action. The necessary arguments are defined in the code as follows: Perf can analyze three lavels of memory: l1, l2, and l3. The level of memory to be analyzed is specified using the level argument.

@classmethod
def required_profiling_args(cls):
    """
    Return required arguments for the profiling method.
    """
    return ["executable", "level"]

@classmethod
def required_extract_args(cls, action):
    """
    Return required arguments for the extract_metrics method.
    """
    if action == "extract_metrics":
        return ["report_file"]
    else:
        return []

Example Usage

Below are three examples of how to execute the profiling tool with different actions:

  • Profiling the application:

    python main.py --profiler perf --action profiling --level l1 --executable ./executable
    
  • Extracting metrics from an existing report:

    python main.py --profiler perf --action extract_metrics --level l1 --report_file ./report_file.ncu-rep
    
  • Performing both profiling and metric extraction:

    python main.py --profiler perf --action both --level l1 --executable ./executable
    

Note

Perf is orgninally designed to work on Intel CPUs, so come metrics might be unavailable on other architectures.

Sample Output

This profiler generates output traces that follow the standardized format defined by the MemSysExplorer Application Interface.

Troubleshooting

If you encounter issues while running or extracting metrics using the Perf profiler, consider the following checks:

  • Ensure `perf` is installed and available in your environment.

    You can verify this by running:

    which perf
    perf --version
    
  • Check whether hardware performance counters are accessible.

    On many Linux systems, user access to counters is restricted by default. You may need to reduce the kernel’s perf event restriction level:

    sudo sh -c 'echo -1 > /proc/sys/kernel/perf_event_paranoid'
    

    Alternatively, configure access with:

    sudo sysctl -w kernel.perf_event_paranoid=-1
    
  • Ensure you are running on a supported architecture. MemSysExplorer’s perf integration is designed and tested primarily on Intel CPUs. Some counters may be missing or unsupported on AMD, ARM, or virtualized environments.

  • Check for compatibility with your Linux kernel version and `perf` version.

    MemSysExplorer assumes compatibility with perf versions is above 6.x. Run:

    uname -r
    perf --version
    

    to check kernel and perf versions.

If the profiler fails silently or skips metrics, it’s likely due to unsupported or inaccessible counters. Consider testing a different memory level (–level l1, l2, or l3) or switching to another compatible platform.