Improve model card with metadata and additional usage information (#2)
- Improve model card with metadata and additional usage information (9e6b3fc81dda74449f1042525ab289f163a9bb80)
Co-authored-by: Niels Rogge <[email protected]>

README.md CHANGED
---
base_model:
- Qwen/Qwen3-8B
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
---

<div align="center">
<h1> <a href="http://blog.goedel-prover.com"> <strong>Goedel-Prover-V2: The Strongest Open-Source Theorem Prover to Date</strong></a></h1>
</div>

<div align="center">

[Website](http://blog.goedel-prover.com) | [Code](https://github.com/Goedel-LM/Goedel-Prover-V2) | [HuggingFace](https://huggingface.co/Goedel-LM/Goedel-Prover-V2-32B) | [Paper](https://arxiv.org/abs/2508.03613) | [License](https://opensource.org/licenses/Apache-2.0)

</div>

## 1. Introduction

We introduce Goedel-Prover-V2, an open-source language model series that sets a new state-of-the-art in automated formal proof generation. Built on the standard expert iteration and reinforcement learning pipeline, our approach incorporates three key innovations: (1) **Scaffolded data synthesis**: we generate synthetic proof tasks of increasing difficulty to progressively train the model, enabling it to master increasingly complex theorems; (2) **Verifier-guided self-correction**: the model learns to iteratively revise its own proofs by leveraging feedback from Lean's compiler, closely mimicking how humans refine their work; (3) **Model averaging**: we combine multiple model checkpoints to improve robustness and overall performance.

Our small model, Goedel-Prover-V2-8B, reaches 83.0% on the MiniF2F test set at Pass@32, matching the performance of the prior state-of-the-art DeepSeek-Prover-V2-671B while being nearly 100 times smaller. Our flagship model, Goedel-Prover-V2-32B, achieves 88.0% on MiniF2F at Pass@32 in standard mode and 90.4% in self-correction mode, outperforming the prior SOTA DeepSeek-Prover-V2-671B and the concurrent Kimina-Prover-72B by a large margin. Additionally, our flagship model with self-correction solves 64 problems on PutnamBench at Pass@64, securing first place on the leaderboard and surpassing DeepSeek-Prover-V2-671B's record of 47 problems at Pass@1024.
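The Pass@k numbers above come from repeated sampling. For reference, a common way to estimate Pass@k from n samples with c verified successes is the standard unbiased estimator; this is a generic utility, not code from this repository:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased Pass@k estimator: probability that at least one of k
    samples, drawn without replacement from n attempts of which c are
    correct, succeeds."""
    if n - c < k:
        return 1.0  # every size-k subset must contain a correct attempt
    return 1.0 - comb(n - c, k) / comb(n, k)

# e.g. 16 verified proofs out of 32 samples
print(pass_at_k(32, 16, 1))  # single-draw success rate: 0.5
```

Reporting Pass@32 then corresponds to `pass_at_k(n, c, 32)` averaged over problems.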

## 2. Benchmark Performance

**Self-correction mode**: Our model improves proof quality by first generating an initial candidate and then using Lean compiler feedback to iteratively revise it. We perform two rounds of self-correction, which remain computationally efficient: the total output length (including the initial proof and two revisions) increases only modestly, from the standard 32K to 40K tokens.
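The self-correction loop described above can be sketched as follows; `generate_proof`, `revise_proof`, and `compile_with_lean` are hypothetical placeholders for the model and the Lean verifier, not the repository's actual API:

```python
def self_correct(theorem, generate_proof, revise_proof, compile_with_lean,
                 max_rounds=2):
    """Generate a proof, then revise it up to `max_rounds` times
    (two rounds in our setup) using compiler feedback."""
    proof = generate_proof(theorem)
    for _ in range(max_rounds):
        ok, errors = compile_with_lean(theorem, proof)
        if ok:
            break
        proof = revise_proof(theorem, proof, errors)  # feed errors back in
    return proof
```

The loop stops early as soon as the proof compiles, so easy theorems pay no revision cost.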

<style>

</div>

**MathOlympiadBench** comprises human-verified formalizations of Olympiad-level mathematical competition problems, sourced from [Compfiles](https://dwrensha.github.io/compfiles/imo.html) and the [IMOSLLean4 repository](https://github.com/mortarsanjaya/IMOSLLean4). It contains 360 problems, including 158 IMO problems from 1959 to 2024, 131 IMO shortlist problems covering 2006 to 2023, 68 regional mathematical olympiad problems, and 3 additional mathematical puzzles.
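For a sense of the format, benchmark items are Lean 4 theorem statements to be proved against Mathlib; a toy statement in the same style (illustrative only, not drawn from the benchmark) looks like:

```lean
import Mathlib

-- A trivial competition-style statement (for illustration only).
theorem toy_example (a b : ℝ) (h₁ : a + b = 2) (h₂ : a - b = 0) : a = 1 := by
  linarith
```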

This model is being released to aid other open-source projects, including those geared towards the upcoming IMO competition. A full paper with all details will be released in the coming weeks.

## 5. Environment Setup

We follow [DeepSeek-Prover-V1.5](https://github.com/deepseek-ai/DeepSeek-Prover-V1.5), which uses Lean 4 version 4.9 and the corresponding Mathlib. Please follow the instructions below to set up the environment.

### Requirements

* Supported platform: Linux
* Python 3.10

### Installation

1. **Install Lean 4**

   Follow the instructions on the [Lean 4 installation page](https://leanprover.github.io/lean4/doc/quickstart.html) to set up Lean 4.

2. **Clone the repository**

   ```sh
   git clone --recurse-submodules https://github.com/Goedel-LM/Goedel-Prover-V2.git
   cd Goedel-Prover-V2
   ```

3. **Install required packages**

   ```sh
   conda env create -f goedelv2.yml
   ```

   If you encounter an installation error for some packages (e.g. flash-attn), run the following:

   ```sh
   conda activate goedelv2
   pip install torch==2.6.0
   conda env update --file goedelv2.yml
   ```

4. **Build Mathlib4**

   ```sh
   cd mathlib4
   lake build
   ```

5. **Test the Lean 4 and Mathlib4 installation**

   ```sh
   cd ..
   python lean_compiler/repl_scheduler.py
   ```

   If there is any error, reinstall Lean 4 and rebuild Mathlib4.

If you have already installed Lean and Mathlib for other projects and want to reuse that installation, note that you might need to modify `DEFAULT_LAKE_PATH` and `DEFAULT_LEAN_WORKSPACE` in `lean_compiler/repl_scheduler.py`.
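For illustration, those two constants might be set like this; the values below are placeholders for your own installation, not the repository defaults:

```python
# Placeholders: point these at your local toolchain and Mathlib checkout.
DEFAULT_LAKE_PATH = "/home/you/.elan/bin/lake"  # the `lake` executable
DEFAULT_LEAN_WORKSPACE = "mathlib4/"            # workspace with built Mathlib
print(DEFAULT_LAKE_PATH, DEFAULT_LEAN_WORKSPACE)
```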

## 6. Quick Start

You can directly use [Hugging Face Transformers](https://github.com/huggingface/transformers) for model inference.

```python
print(tokenizer.batch_decode(outputs))
print(time.time() - start)
```

## 7. Batch Inference and Self-correction

```sh
bash scripts/pipeline.sh
```

## 8. Citation

```bibtex
@article{lin2025goedelproverv2,
  title={Goedel-Prover-V2: Scaling Formal Theorem Proving with Scaffolded Data Synthesis and Self-Correction},