Benchmark Runner
This script benchmarks the forward and backward performance of several operations (rms, add_rms, poly, mul_poly).
Results can be saved as CSV files or plots.
Note
To run the benchmarks, you must select the appropriate Torch version along with the corresponding CUDA/ROCm build from within the build directory. Example:
export PYTHONPATH=$PYTHONPATH:<YOUR_PATH>/activation/build/torch27-cxx11-cu128-x86_64-linux
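As a rough illustration of the note above, the following sketch lists the variants available under a build directory and appends one of them to the import path at runtime, instead of exporting PYTHONPATH. The helper names `list_build_variants` and `add_variant_to_path` are hypothetical and not part of this repository. The layout (one sub-directory per Torch/CUDA/ROCm build) is assumed from the example path.

```python
import sys
from pathlib import Path


def list_build_variants(build_dir):
    """Return the sorted names of the sub-directories of build_dir.

    Assumption: each sub-directory is one Torch/CUDA/ROCm build variant,
    e.g. torch27-cxx11-cu128-x86_64-linux.
    """
    root = Path(build_dir)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if p.is_dir())


def add_variant_to_path(build_dir, variant):
    """Append build_dir/variant to sys.path (hypothetical helper)."""
    path = str(Path(build_dir) / variant)
    if path not in sys.path:
        sys.path.append(path)
    return path
```

Calling `add_variant_to_path` before the benchmark imports has roughly the same effect as the export line for the current process only.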
Usage
python main.py --case <CASE> [--plot] [--save-path <DIR>]
- --case (required): one of rms, add_rms, poly, mul_poly
- --plot: save plots instead of CSVs
- --save-path: output directory (default: ./configs/)
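The options above could be declared with argparse roughly as follows. This is a minimal sketch built only from the documented flags and defaults. The actual parser in main.py may differ, and `build_parser` is a hypothetical name.

```python
import argparse

# The four cases documented for --case.
CASES = ("rms", "add_rms", "poly", "mul_poly")


def build_parser():
    """Build a parser mirroring the documented flags (illustrative only)."""
    parser = argparse.ArgumentParser(description="Benchmark runner (sketch)")
    parser.add_argument("--case", required=True, choices=CASES,
                        help="operation to benchmark")
    parser.add_argument("--plot", action="store_true",
                        help="save plots instead of CSVs")
    parser.add_argument("--save-path", default="./configs/",
                        help="output directory")
    return parser


# Parse an explicit argument list so the sketch runs without a real CLI call.
args = build_parser().parse_args(["--case", "add_rms", "--save-path", "./results/"])
print(args.case, args.plot, args.save_path)  # add_rms False ./results/
```

Using `choices` makes argparse reject any case outside the documented four, and `store_true` makes `--plot` a plain switch with no value.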
Examples
python main.py --case add_rms --save-path ./results/
python main.py --case poly --plot --save-path ./plots/
Output
- CSV: <case>-fwd-perf.csv, <case>-bwd-perf.csv
- Plots: plot_<case>-fwd-perf.png, plot_<case>-bwd-perf.png
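To make the naming scheme above concrete, here is a small sketch that computes the file paths a run would be expected to produce. `expected_outputs` is a hypothetical helper, not part of the script. It assumes the files are written directly into the --save-path directory.

```python
from pathlib import Path


def expected_outputs(case, save_path="./configs/", plot=False):
    """Return the expected forward and backward output paths for a case.

    Assumption: files land directly in save_path and follow the naming
    pattern listed in the Output section.
    """
    stems = [f"{case}-fwd-perf", f"{case}-bwd-perf"]
    if plot:
        return [Path(save_path) / f"plot_{stem}.png" for stem in stems]
    return [Path(save_path) / f"{stem}.csv" for stem in stems]


for path in expected_outputs("poly", "./plots/", plot=True):
    print(path.name)
```

A check like this can be handy after a run to confirm that both the forward and backward results were written.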