# Benchmark Runner

This script benchmarks **forward/backward performance** of several operations (`rms`, `add_rms`, `poly`, `mul_poly`).
Results can be saved as **CSV files** or **plots**.

> **Note**<br>  
> Before running the benchmarks, add the build that matches your Torch version and CUDA/ROCm backend (found in the `build` directory) to `PYTHONPATH`.  
>
> **Example:**  
>
> ```bash
> export PYTHONPATH=$PYTHONPATH:<YOUR_PATH>/activation/build/torch27-cxx11-cu128-x86_64-linux
> ```

## Usage

```bash
python main.py --case <CASE> [--plot] [--save-path <DIR>]
```

- `--case` (required): one of `rms`, `add_rms`, `poly`, `mul_poly`
- `--plot`: save plots instead of CSVs
- `--save-path`: output directory (default: `./configs/`)
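
The options above could be parsed along the following lines. This is only an illustrative sketch built on the standard `argparse` module; the function name `build_parser` and the `CASES` tuple are assumptions, and the actual `main.py` may be organized differently.

```python
import argparse

# Case names as listed in this README.
CASES = ("rms", "add_rms", "poly", "mul_poly")


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags."""
    parser = argparse.ArgumentParser(description="Benchmark runner (sketch)")
    # --case is required and restricted to the documented cases.
    parser.add_argument("--case", required=True, choices=CASES)
    # --plot is a boolean switch: plots instead of CSVs.
    parser.add_argument("--plot", action="store_true")
    # --save-path defaults to the documented output directory.
    parser.add_argument("--save-path", default="./configs/")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.case, args.plot, args.save_path)
```

With `choices` set, an unknown case name is rejected by `argparse` with a usage message before any benchmark code runs.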

## Examples

```bash
python main.py --case add_rms --save-path ./results/
python main.py --case poly --plot --save-path ./plots/
```

## Output

- CSV: `<case>-fwd-perf.csv`, `<case>-bwd-perf.csv`
- Plots: `plot_<case>-fwd-perf.png`, `plot_<case>-bwd-perf.png`
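
To check that a run produced the files listed above, a small helper can rebuild the expected paths from the naming pattern. This is a sketch under stated assumptions: `expected_outputs` is a hypothetical name that is not part of the benchmark script, and the default directory is taken from the `--save-path` default documented in this README.

```python
from pathlib import Path


def expected_outputs(case: str, save_path: str = "./configs/", plot: bool = False) -> list[Path]:
    """Return the forward and backward output paths for one benchmark case.

    Hypothetical helper: it only mirrors the file-name pattern in this README.
    """
    base = Path(save_path)
    if plot:
        # Plot files carry a "plot_" prefix and a .png extension.
        names = [f"plot_{case}-{direction}-perf.png" for direction in ("fwd", "bwd")]
    else:
        names = [f"{case}-{direction}-perf.csv" for direction in ("fwd", "bwd")]
    return [base / name for name in names]


if __name__ == "__main__":
    for path in expected_outputs("add_rms", "./results/"):
        print(path, "exists" if path.exists() else "missing")
```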