kdizzled commited on
Commit
eb56d3b
·
verified ·
1 Parent(s): 23c2adf

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +1 -12
README.md CHANGED
@@ -41,7 +41,7 @@ Go to [https://github.com/JetBrains-Research/big-rocq](https://github.com/JetBra
41
  ### Training Procedure
42
 
43
  * **Objective:** InfoNCE
44
- * **Batch size:** 32
45
  * **Optimizer / LR:** AdamW, lr = 4e‑6, linear warm‑up 10 %, 22k steps
46
  * **Hardware:** 1× NVIDIA H100 GPU, 160 GB RAM, 14 h wall‑clock
47
 
@@ -52,17 +52,6 @@ Go to [https://github.com/JetBrains-Research/big-rocq](https://github.com/JetBra
52
  * **Dataset:** IMM‑300 (300 Rocq theorems) from the IMM project
53
  * **Metrics:** Downstream proof success rate of CoqPilot when given top‑7 retrieved premises; averaged over 12 generations.
54
 
55
- ### Results
56
-
57
- | Model (back‑end) | Bucket | Baseline Jaccard | **RocqStar** |
58
- | ---------------- | ------ | ---------------- | ------------ |
59
- | GPT‑4o | ≤ 4 | 48 ± 5 % | **51 ± 5 %** |
60
- | GPT‑4o | 5–8 | 18 ± 4 % | **25 ± 3 %** |
61
- | GPT‑4o | 9–20 | 11 ± 4 % | 11 ± 5 % |
62
- | Claude 3.5 | ≤ 4 | 58 ± 5 % | **61 ± 4 %** |
63
- | Claude 3.5 | 5–8 | 28 ± 5 % | **36 ± 5 %** |
64
- | Claude 3.5 | 9–20 | 16 ± 5 % | **21 ± 5 %** |
65
-
66
  #### Summary
67
 
68
  RocqStar delivers consistent gains, up to 28% relative improvement over Jaccard-index based retrieval, especially for medium‑length theorems where proof similarity diverges most from statement similarity.
 
41
  ### Training Procedure
42
 
43
  * **Objective:** InfoNCE
44
+ * **Batch size:** 16
45
  * **Optimizer / LR:** AdamW, lr = 4e‑6, linear warm‑up 10 %, 22k steps
46
  * **Hardware:** 1× NVIDIA H100 GPU, 160 GB RAM, 14 h wall‑clock
47
 
 
52
  * **Dataset:** IMM‑300 (300 Rocq theorems) from the IMM project
53
  * **Metrics:** Downstream proof success rate of CoqPilot when given top‑7 retrieved premises; averaged over 12 generations.
54
 
 
 
 
 
 
 
 
 
 
 
 
55
  #### Summary
56
 
57
  RocqStar delivers consistent gains, up to 28% relative improvement over Jaccard-index based retrieval, especially for medium‑length theorems where proof similarity diverges most from statement similarity.