- use mnt-backend's request type
- DeviceSetGpuLockedClocks
- DeviceSetPersistenceMode
- profile to the same file
- gpu
- nsight compute
- lib/tracer_tool.so
- lib/post-traces-processing
Data Collector for the MGPUSim NVIDIA Trace Project.
[Commands]
traces
profiles
simulations
[Flags]
-collect to specify the collection settings file
-secret to specify the secret tokens file
[Example]
./mnt-collector simulations --collect etc/simulations.yaml
Usage:
mnt-collector [command]
Available Commands:
completion Generate the autocompletion script for the specified shell
help Help about any command
profiles Use Nvidia system to profile the cases and upload the data to database & cloud.
simulations Use the given simulator to run traces and upload the data to database.
traces Use Nvbit to generate traces and upload the data to database & cloud.
delete Connect to both mongodb and s3 to delete specified entries together.
Flags:
--collect string yaml file that store collection settings (default is etc/collects.yaml) (default "ect/collects.yaml")
-h, --help help for mnt-collector
--secret string yaml file that store secret tokens (default is etc/secrets.yaml) (default "etc/secrets.yaml")
--machine string machine name filter, for delete only (required)
--cuda-version string CUDA version filter, for delete only (required)
--suite string suite name filter, for delete only (optional, default is "all")
--benchmark string benchmark title filter, for delete only (optional, default is "all")
-t, --toggle Help message for toggle
Use "mnt-collector [command] --help" for more information about a command.
To run mnt-collector with profiles, traces, and simulations together, use the Python script schedule/schedule.py.
-
Create a Configuration File
Create a new YAML file under the pathetc/{suite}/{benchmark}.yaml. For example,etc/polybench/atax.yaml. Below is an example of the content:# Environment device-id: 1 exclusive-mode: false # Simulation Config upload-to-server: true trace-collect: enable: false profile-collect: enable: true repeat-times: 3 # Benchmark Details cases: - title: atax suite: polybench directory: /home/exu03/workspace/GPU_Benchmarks/polybench/ATAX/ command: /home/exu03/workspace/GPU_Benchmarks/polybench/ATAX/atax.exe args: - size: 32 - size: 64
-
Remove Existing Trace ID File
Ensure there is no file at the pathtraceid/{suite}-{benchmark}.txt. For example,traceid/polybench-atax.txt. If it exists, delete it. -
Run the Schedule Script
Activate your Python environment and execute the script with the configuration file:python schedule/schedule.py --collect etc/{suite}/{benchmark}.yaml # Example: python schedule/schedule.py --collect etc/polybench/atax.yaml -
Check Progress Logs
Monitor the progress of profiles, traces, and simulations in the log fileetc/{suite}/{benchmark}-log.csv. For example,etc/polybench/atax-log.csv.
The delete command allows you to remove specific entries from both MongoDB and S3 storage based on the provided filters.
mnt-collector delete --machine <machine_name> --cuda-version <cuda_version> [--suite <suite_name>] [--benchmark <benchmark_title>]