
Benchmark Runner

This script benchmarks the forward and backward performance of several operations (rms, add_rms, poly, mul_poly). Results are saved either as CSV files or as plots.

Note

To run the benchmarks, add the build variant that matches your Torch version and CUDA/ROCm build to PYTHONPATH. The variants are located in the build directory.

Example:

export PYTHONPATH=$PYTHONPATH:<YOUR_PATH>/activation/build/torch27-cxx11-cu128-x86_64-linux
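When exporting PYTHONPATH is inconvenient (for example in a notebook), roughly the same effect can be had from inside Python. The following is a minimal sketch under assumptions: add_build_to_path is a hypothetical helper that is not part of this repository, and the repository root and variant name are placeholders modeled on the export line above. Unlike the export, it prepends the directory, so this build takes precedence over any other installed copy.

```python
import sys
from pathlib import Path


def add_build_to_path(repo_root: str, variant: str) -> Path:
    """Prepend <repo_root>/build/<variant> to sys.path and return it.

    Hypothetical helper, not part of the repository. `variant` is the
    build directory name, e.g. "torch27-cxx11-cu128-x86_64-linux".
    The directory is not checked for existence.
    """
    build_dir = Path(repo_root) / "build" / variant
    entry = str(build_dir)
    if entry not in sys.path:
        # Prepend so this build is found before other installed copies.
        sys.path.insert(0, entry)
    return build_dir
```

Call it once before importing the benchmarked module, for instance add_build_to_path("<YOUR_PATH>/activation", "torch27-cxx11-cu128-x86_64-linux").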

Usage

python main.py --case <CASE> [--plot] [--save-path <DIR>]
  • --case (required): one of rms, add_rms, poly, mul_poly
  • --plot: save plots instead of CSVs
  • --save-path: output directory (default: ./configs/)
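
The options above map naturally onto argparse. The sketch below is an assumed reconstruction from this usage description, not the actual contents of main.py; build_parser and CASES are hypothetical names.

```python
import argparse

# Case names taken from the usage description above.
CASES = ("rms", "add_rms", "poly", "mul_poly")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser matching the documented flags (illustrative only)."""
    parser = argparse.ArgumentParser(description="Benchmark runner (sketch)")
    parser.add_argument("--case", required=True, choices=CASES,
                        help="operation to benchmark")
    parser.add_argument("--plot", action="store_true",
                        help="save plots instead of CSVs")
    parser.add_argument("--save-path", default="./configs/",
                        help="output directory")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.case, args.plot, args.save_path)
```

Using choices makes argparse reject an unknown case with a usage error before any benchmark starts.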

Examples

python main.py --case add_rms --save-path ./results/
python main.py --case poly --plot --save-path ./plots/

Output

  • CSV: <case>-fwd-perf.csv, <case>-bwd-perf.csv
  • Plots: plot_<case>-fwd-perf.png, plot_<case>-bwd-perf.png
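
To check a run from a script, the expected file names can be derived from the case name. This is a minimal sketch: expected_outputs is a hypothetical helper, and it assumes the files are written directly into the --save-path directory, which the description above does not state explicitly.

```python
from pathlib import Path


def expected_outputs(case: str, save_path: str = "./configs/",
                     plot: bool = False) -> list[Path]:
    """Return the forward and backward output paths for one case.

    Hypothetical helper; assumes files land directly in `save_path`.
    """
    out_dir = Path(save_path)
    stems = [f"{case}-{direction}-perf" for direction in ("fwd", "bwd")]
    if plot:
        # Plot files carry a "plot_" prefix and a .png extension.
        return [out_dir / f"plot_{stem}.png" for stem in stems]
    return [out_dir / f"{stem}.csv" for stem in stems]


if __name__ == "__main__":
    for path in expected_outputs("add_rms", "./results/"):
        print(path, "exists" if path.exists() else "missing")
```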