---
license: mit
---

# 🧠 AlphaMed

This is the official model checkpoint for the paper:

**[AlphaMed: Incentivizing Medical Reasoning with Minimalist Rule-Based RL](https://www.arxiv.org/abs/2505.17952)**

AlphaMed is a medical large language model trained **without supervised fine-tuning on chain-of-thought (CoT) data**, relying solely on reinforcement learning to elicit step-by-step reasoning on complex medical tasks.
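
For intuition, a "minimalist rule-based" reward can be pictured as a plain string check on the model's final boxed answer. The sketch below is only an illustration of that idea; the function name, reward values, and matching rule are assumptions for this card, not the paper's released training code.

```python
import re

def rule_based_reward(completion: str, ground_truth: str) -> float:
    """Hypothetical sketch of a minimalist rule-based reward (not the
    paper's exact code): score 1.0 if the last \\boxed{...} in the
    completion matches the ground-truth answer, else 0.0."""
    matches = re.findall(r"\\boxed\{([^{}]*)\}", completion)
    if not matches:
        return 0.0  # no final boxed answer was produced
    predicted = matches[-1].strip().lower()
    return 1.0 if predicted == ground_truth.strip().lower() else 0.0
```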

## 🚀 Usage

To use the model, format your input prompt as:

> **Question:** [your medical question here]
> **Please reason step by step, and put the final answer in \boxed{}**

### 🔬 Example

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

# Load model and tokenizer
model_id = "che111/AlphaMed-3B-instruct-rl"  # Replace with actual repo path
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
# Tip: for GPU inference you can also pass torch_dtype="auto" and
# device_map="auto" to from_pretrained.

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

# Format the question following the prompt template above
prompt = (
    "Question: A 45-year-old patient presents with chest pain radiating to the left arm "
    "and elevated troponin levels. What is the most likely diagnosis?\n"
    "Please reason step by step, and put the final answer in \\boxed{}"
)

# Generate with greedy decoding; a generous token budget keeps long
# reasoning chains from being truncated
output = pipe(prompt, max_new_tokens=8192, do_sample=False)[0]["generated_text"]
print(output)
```
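
Since the model is trained to place its final answer inside `\boxed{}`, the answer can be pulled out of the generated text with a small regex. A minimal sketch, continuing from the example above (the helper name is ours, not part of the release):

```python
import re

def extract_boxed_answer(text: str) -> str | None:
    """Return the content of the last \\boxed{...} in the text, if any."""
    matches = re.findall(r"\\boxed\{([^{}]*)\}", text)
    return matches[-1].strip() if matches else None

print(extract_boxed_answer(output))  # prints just the final answer
```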
|