DynamoRIO Documentation
DynamoRIO is a dynamic binary instrumentation framework that inserts analysis code at runtime, allowing fine-grained monitoring of program execution. It provides valuable insights into instruction-level behavior, memory access patterns, and control flow, enabling detailed performance and security analysis.
The profiling workflow in MemSysExplorer consists of two core actions, as provided by the main interface:
Profiling (`profiling`) – Captures runtime execution metrics by specifying the required executable.
Metric Extraction (`extract_metrics`) – Analyzes generated reports to extract memory and performance-related metrics.
When the `both` action is used, profiling and metric extraction run sequentially.
Important
MemSysExplorer GitHub Repository
Refer to the codebase for the latest update: https://github.com/duca181/MemSysExplorer/tree/apps_dev/apps/profilers/dynamorio
To learn more about license terms and third-party attribution, refer to the 6. Licensing and Attribution page.
Required Arguments
To execute DynamoRIO profilers, specific arguments are required based on the chosen action. The necessary arguments are defined in the code as follows:
@classmethod
def required_profiling_args(cls):
    """
    Return required arguments for the profiling method.
    """
    return ["executable"]

@classmethod
def required_extract_args(cls, action):
    """
    Return required arguments for the extract_metrics method.
    """
    if action == "extract_metrics":
        return ["report_file"]
    else:
        return []
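As a rough illustration of how these lists might be consumed, the sketch below checks the supplied arguments against the required names before an action runs. `DynamoRIOProfilerStub` and `missing_args` are hypothetical names introduced only for this example; the actual dispatch logic in `main.py` may differ.

```python
class DynamoRIOProfilerStub:
    """Hypothetical stand-in that mirrors the two classmethods above."""

    @classmethod
    def required_profiling_args(cls):
        return ["executable"]

    @classmethod
    def required_extract_args(cls, action):
        return ["report_file"] if action == "extract_metrics" else []


def missing_args(profiler_cls, action, provided):
    """Return the required argument names that are absent from `provided`."""
    required = []
    if action in ("profiling", "both"):
        required += profiler_cls.required_profiling_args()
    if action in ("extract_metrics", "both"):
        required += profiler_cls.required_extract_args(action)
    # An argument counts as missing when its key is absent or its value is empty.
    return [name for name in required if not provided.get(name)]
```

Under these assumptions, `missing_args(DynamoRIOProfilerStub, "profiling", {})` reports `executable` as missing, while the `both` action only needs `executable` because `required_extract_args` returns an empty list for any action other than `extract_metrics`.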
Configuration File Support
The DynamoRIO memcount client supports runtime configuration through a text-based configuration file. This allows fine-grained control over profiling behavior without recompiling the instrumentation client.
Configuration Parameters
The following parameters can be configured in config/memcount_config.txt:
| Parameter | Type | Default | Description |
|---|---|---|---|
| | uint | 64 | Cache line size in bytes for address alignment |
| | uint | 8 | HyperLogLog precision bits (4-16) |
| | uint | 8 | HLL precision for windowed sampling |
| `sample_window_refs` | uint | 2000 | Number of memory references per sampling window |
| | uint | 8192 | Maximum buffered memory references before flush |
| `enable_trace` | bool | false | Enable detailed protobuf trace output |
| | bool | true | Enable working set size statistics tracking |
| | bool | true | Enable exact WSS tracking (memory intensive) |
| | bool | true | Enable HLL-based approximate WSS tracking |
| `enable_instruction_threshold` | bool | false | Enable instruction count threshold termination |
| `instruction_threshold` | uint64 | 100000000 | Number of instructions before auto-termination |
| | string | "memtrace" | Base name for protobuf trace output |
| | string | "timeseries" | Base name for protobuf time-series output |
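To inspect a configuration file from a script, a small reader along the following lines may be enough. This is a sketch that assumes one `key=value` pair per line, as in the examples on this page; the `#` comment syntax and the helper names `parse_value` and `load_config` are assumptions, not part of the memcount client.

```python
def parse_value(raw):
    """Convert a raw string to bool or int, or leave it as a string."""
    text = raw.strip().strip('"')
    if text.lower() in ("true", "false"):
        return text.lower() == "true"
    try:
        return int(text)
    except ValueError:
        return text


def load_config(lines):
    """Parse an iterable of `key=value` lines into a dict."""
    config = {}
    for line in lines:
        line = line.split("#", 1)[0].strip()  # assumed comment syntax
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        key, value = line.split("=", 1)
        config[key.strip()] = parse_value(value)
    return config
```

For example, `load_config(open("config/memcount_config.txt"))` would return a dictionary keyed by parameter name, with booleans and integers already converted.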
Instruction Threshold Feature
The instruction threshold feature automatically terminates profiling after a specified number of instructions. This is particularly useful for:
Limiting trace file sizes for long-running applications
Profiling initialization phases by setting a low threshold
Controlled experiments requiring consistent instruction counts
Testing and debugging with reproducible cutoff points
When the threshold is reached, the profiler will:
Print a notification message with the current captured statistics
Flush all buffered data to output files
Print final statistics
Terminate the application
Example configuration:
enable_instruction_threshold=true
instruction_threshold=100000000
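The termination sequence described above can be modeled in a few lines. The class below is a toy simulation written for this page, not the client's instrumentation code; `ThresholdMonitor` and its attributes are hypothetical, and lists stand in for the printed messages and the output files.

```python
class ThresholdMonitor:
    """Toy model of instruction-threshold termination."""

    def __init__(self, enabled, threshold):
        self.enabled = enabled
        self.threshold = threshold
        self.instructions = 0
        self.buffer = []       # pending memory references
        self.flushed = []      # stands in for the output files
        self.log = []          # stands in for printed messages
        self.terminated = False

    def on_instruction(self, mem_ref=None):
        """Count one instruction; return False once profiling has stopped."""
        if self.terminated:
            return False
        self.instructions += 1
        if mem_ref is not None:
            self.buffer.append(mem_ref)
        if self.enabled and self.instructions >= self.threshold:
            # 1. notify, 2. flush buffered data, 3. final statistics, 4. stop
            self.log.append(f"threshold reached at {self.instructions} instructions")
            self.flushed.extend(self.buffer)
            self.buffer.clear()
            self.log.append(f"final: {len(self.flushed)} references")
            self.terminated = True
        return not self.terminated
```

Because the cutoff depends only on the instruction count, two runs of the same deterministic workload with the same threshold stop at the same point, which is what makes the feature useful for controlled experiments.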
Example Usage
Below are examples of how to execute the profiling tool with different actions:
Profiling the application:
python3 main.py --profiler dynamorio --action profiling --executable ./executable
Profiling with custom configuration:
python3 main.py --profiler dynamorio --action profiling --config config/memcount_config.txt --executable ./executable
Profiling with instruction threshold enabled:
Edit config/memcount_config.txt to enable the threshold:
enable_instruction_threshold=true
instruction_threshold=100000000
Then run:
python3 main.py --profiler dynamorio --action profiling --config config/memcount_config.txt --executable ./executable
Extracting metrics from an existing report:
python3 main.py --profiler dynamorio --action extract_metrics --report_file ./report_file
Performing both profiling and metric extraction:
python3 main.py --profiler dynamorio --action both --executable ./executable
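When these invocations are driven from a script, it can help to assemble the argument list in one place. The helper below is a sketch; `build_command` is a hypothetical name, and it uses only the flags shown in the examples above.

```python
import sys


def build_command(action, executable=None, report_file=None, config=None):
    """Assemble the argv list for one of the invocations shown above."""
    cmd = [sys.executable, "main.py", "--profiler", "dynamorio", "--action", action]
    if config:
        cmd += ["--config", config]
    if executable:
        cmd += ["--executable", executable]
    if report_file:
        cmd += ["--report_file", report_file]
    return cmd
```

The resulting list can be passed to `subprocess.run(build_command("both", executable="./executable"), check=True)` from the directory that contains `main.py`.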
Sample Output
This profiler generates output traces that follow the standardized format defined by the MemSysExplorer Application Interface.
Output Files
The DynamoRIO profiler generates several output files during execution:
| File | Description |
|---|---|
| `memtrace_<pid>.pb` | Binary protobuf file containing detailed memory trace events (when `enable_trace=true`) |
| `timeseries_<pid>.pb` | Binary protobuf file containing time-series WSS metrics sampled at configurable intervals |
| | JSON file with aggregated memory statistics (read/write counts, working set size, etc.) |
| | JSON file with system metadata (CPU info, cache hierarchy, software versions) |
Time-Series Output
The time-series output (timeseries_<pid>.pb) captures windowed Working Set Size (WSS) samples throughout program execution.
What is Captured:
WSS measurements at configurable intervals (controlled by sample_window_refs)
Per-window read and write counts
Both exact and approximate (HyperLogLog) WSS estimates
Size histograms for memory accesses
Configuring Sampling Interval:
The sampling window size is controlled by the sample_window_refs parameter in the configuration file:
sample_window_refs=2000
This captures a WSS sample every 2000 memory references.
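To make the windowing concrete, the sketch below counts the unique cache lines touched in each window of a reference stream. It is an illustration only: `windowed_wss` is a hypothetical function, the per-window reset and the trailing partial window are assumptions about the behavior, and the 64-byte line size is taken from the default listed in the configuration table.

```python
def windowed_wss(addresses, window_refs=2000, line_size=64):
    """Yield (window_index, ref_count, unique_cache_lines) per window."""
    lines = set()
    count = 0
    index = 0
    for addr in addresses:
        lines.add(addr // line_size)  # align the address to its cache line
        count += 1
        if count == window_refs:
            yield index, count, len(lines)
            lines.clear()             # assumed: each window starts empty
            count = 0
            index += 1
    if count:                         # assumed: emit the trailing partial window
        yield index, count, len(lines)
```

A smaller `window_refs` gives finer time resolution at the cost of more samples in the output file.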
Fields in Time-Series Output:
| Field | Description |
|---|---|
| | Sequential index of the sampling window |
| | Thread ID for the sample |
| | Timestamp of the sample |
| | Number of read operations in the window |
| | Number of write operations in the window |
| | Total memory references in the window |
| | Exact working set size (unique cache lines accessed) |
| | HyperLogLog-estimated WSS (memory-efficient approximation) |
| | Distribution of read sizes (1, 2, 4, 8, 16, 32, 64, other bytes) |
| | Distribution of write sizes |
Parsing Time-Series Output:
Use the timeseries_parser.py tool to convert protobuf output to readable formats:
# Get summary statistics
python3 tools/timeseries_parser.py output/timeseries_12345.pb
# Export to CSV
python3 tools/timeseries_parser.py output/timeseries_12345.pb --format csv --output wss_data.csv
# Visualize with plots
python3 tools/timeparser_plot.py output/timeseries_12345.pb --output wss_plot.png
See 4. Tools for complete documentation on parsing tools.
Memory Trace Output
The memory trace output (memtrace_<pid>.pb) captures per-access memory events when enable_trace=true.
What is Captured:
Every memory read and write operation
Memory addresses accessed
Thread context for each access
Cache hit/miss information (if available)
Fields in Trace Output:
| Field | Description |
|---|---|
| | Event timestamp (monotonically increasing counter) |
| | Thread ID performing the memory access |
| | Memory address accessed (hexadecimal) |
| | Operation type |
| | Cache result |
Parsing Memory Trace Output:
Use the trace_parser.py tool to convert protobuf output:
# Get summary statistics
python3 tools/trace_parser.py output/memtrace_12345.pb
# Export to JSON
python3 tools/trace_parser.py output/memtrace_12345.pb --format json --output trace.json
# Export first 10000 events to CSV
python3 tools/trace_parser.py output/memtrace_12345.pb --format csv --limit 10000 --output trace.csv
# Filter by thread
python3 tools/trace_parser.py output/memtrace_12345.pb --thread 12345 --format csv
See 4. Tools for complete documentation on parsing tools.
Parsing Workflow
Note
Working Set Size (WSS) estimation is an area of ongoing research. The current implementation
uses ws_tsearch (Working Set Tree Search) for exact tracking and HyperLogLog for
approximate estimation. For implementation details, see the common library documentation
at Common Profiler Library Documentation.
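For readers unfamiliar with HyperLogLog, the following textbook-style sketch shows why the approximate tracker needs far less memory than an exact set: it keeps only `2**precision` small registers regardless of how many cache lines are seen. This is a generic illustration, not the common library's implementation; the `HyperLogLog` class and the use of SHA-1 as the hash function are assumptions made for the example.

```python
import hashlib
import math


class HyperLogLog:
    """Minimal HyperLogLog; `precision` is the number of index bits (4-16)."""

    def __init__(self, precision=8):
        self.p = precision
        self.m = 1 << precision
        self.registers = [0] * self.m

    def add(self, item):
        digest = hashlib.sha1(str(item).encode()).digest()
        h = int.from_bytes(digest[:8], "big")          # 64-bit hash value
        idx = h >> (64 - self.p)                       # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)          # remaining 64 - p bits
        rank = (64 - self.p) - rest.bit_length() + 1   # leading zeros + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def estimate(self):
        m = self.m
        alpha = {16: 0.673, 32: 0.697, 64: 0.709}.get(m, 0.7213 / (1 + 1.079 / m))
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            return m * math.log(m / zeros)             # small-range correction
        return raw
```

The expected relative error is roughly `1.04 / sqrt(2**precision)`, so the default precision of 8 listed in the configuration table corresponds to about 6.5%, while an exact set grows with the number of distinct cache lines.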
End-to-End Workflow for DynamoRIO Output:
Step 1: Run profiling with configuration
python3 main.py --profiler dynamorio --action profiling \
--config config/memcount_config.txt --executable ./my_workload
Step 2: Parse time-series data for WSS trends
python3 tools/timeseries_parser.py output/timeseries_*.pb --format summary
Step 3: Visualize WSS over time
python3 tools/timeparser_plot.py output/timeseries_*.pb --output wss_analysis.png
Step 4: Parse memory trace for detailed analysis (if enabled)
python3 tools/trace_parser.py output/memtrace_*.pb --format csv --output detailed_trace.csv
Step 5: Calculate reuse distance (optional)
python3 tools/reuse_distance.py detailed_trace.csv --output reuse_analysis.txt
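Reuse distance is the number of distinct cache lines touched between two consecutive accesses to the same line. The sketch below computes it with a simple LRU stack; it is meant to clarify the metric, and the algorithm used by `tools/reuse_distance.py` is not documented here and may differ. The function name and the 64-byte line size are assumptions.

```python
def reuse_distances(addresses, line_size=64):
    """Return the reuse distance of each access (None for a first access)."""
    stack = []      # least recently used first, most recently used last
    distances = []
    for addr in addresses:
        line = addr // line_size
        if line in stack:
            pos = stack.index(line)
            # Lines above `pos` were touched since the previous access.
            distances.append(len(stack) - 1 - pos)
            stack.pop(pos)
        else:
            distances.append(None)
        stack.append(line)
    return distances
```

This list-based version takes time proportional to the number of distinct lines per access, which is fine for small traces; large traces usually call for a tree-based structure.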
Additional Notes
DynamoRIO must be correctly installed and accessible via the system PATH variable.
Troubleshooting
If you encounter issues when building the DynamoRIO profiler:
Ensure that the environment has been set up properly using:
source setup/setup.csh dynamorio
or
source setup/setup.sh dynamorio
Verify that the correct GCC version is installed and exported in your environment. The profiler expects a compatible GCC version as configured in your setup script (e.g., GCC 11.2.0).
Check for missing compiler paths: Make sure PATH, LD_LIBRARY_PATH, LIBRARY_PATH, and C_INCLUDE_PATH are configured to include your GCC installation directories.
If problems persist, rebuild the profiler after re-sourcing your environment.
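The environment checks listed above can be scripted before a rebuild. The helper below is a minimal sketch; `check_build_env` is a hypothetical name, and it only tests that the variables are set and that `gcc` can be found, not that the GCC version matches the one expected by the setup script.

```python
import os
import shutil


def check_build_env(environ=None):
    """Return a list of human-readable problems found in the environment."""
    environ = os.environ if environ is None else environ
    problems = []
    for var in ("PATH", "LD_LIBRARY_PATH", "LIBRARY_PATH", "C_INCLUDE_PATH"):
        if not environ.get(var):
            problems.append(f"{var} is unset or empty")
    # shutil.which returns None when the executable is not on the given path.
    if shutil.which("gcc", path=environ.get("PATH", "")) is None:
        problems.append("gcc not found on PATH")
    return problems
```

Running `print("\n".join(check_build_env()))` after sourcing the setup script gives a quick view of what is still missing; an empty result means these basic checks passed.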