# MedLLM-10M: Medical Language Model

## Model Description
MedLLM-10M is a lightweight GPT-style language model specifically trained on medical literature and clinical text. This model is designed for educational and research purposes in the medical domain.
> ⚠️ **Important Disclaimer:** This model is for research and educational purposes only. It should never be used for actual medical diagnosis, treatment recommendations, or clinical decision-making without proper medical supervision.
## Model Details
- Model Type: Causal Language Model (GPT-style)
- Parameters: ~27.7M
- Architecture: Transformer decoder
- Training Data: Medical literature, PubMed abstracts, clinical guidelines
- Vocabulary Size: 5,000
- Context Length: 512 tokens
- License: Apache 2.0
### Architecture

- Layers: 8
- Hidden Size: 512
- Attention Heads: 8
- Feed-Forward Size: 2048
- Dropout: 0.1
- Activation: GELU
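For reference, the settings above correspond to a GPT-2-style configuration that can be reconstructed in `transformers`. This is a hypothetical reconstruction, not the released training code; the exact parameter count depends on weight tying and layer-norm details, but this config lands in the same ballpark as the stated ~27.7M:

```python
from transformers import GPT2Config, GPT2LMHeadModel

# Hypothetical GPT-2-style reconstruction of the architecture above
config = GPT2Config(
    vocab_size=5000,             # Vocabulary Size
    n_positions=512,             # Context Length
    n_embd=512,                  # Hidden Size
    n_layer=8,                   # Layers
    n_head=8,                    # Attention Heads
    n_inner=2048,                # Feed-Forward Size
    resid_pdrop=0.1,             # Dropout (assumed applied at all three sites)
    embd_pdrop=0.1,
    attn_pdrop=0.1,
    activation_function="gelu",  # Activation
)
model = GPT2LMHeadModel(config)
print(f"{sum(p.numel() for p in model.parameters()):,} parameters")  # roughly 28M
```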
## Training Details
The model was trained on a curated dataset of medical literature including:
- PubMed abstracts and research papers
- Medical journal articles
- Clinical practice guidelines
- Medical Q&A datasets
- Healthcare websites (Mayo Clinic, WebMD, etc.)
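As a rough illustration of how a corpus like this maps onto training examples, the sketch below packs raw text into fixed 512-token blocks to match the context length above. The file name `medical_corpus.txt` is a placeholder, and this is an assumption about the preprocessing, not the actual pipeline:

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# Placeholder corpus file; the real dataset mixes the sources listed above
raw = load_dataset("text", data_files={"train": "medical_corpus.txt"})
tokenizer = AutoTokenizer.from_pretrained("raihan-js/medllm-10m")

def chunk(batch):
    # Concatenate documents, then split into fixed 512-token blocks,
    # dropping the trailing remainder
    ids = tokenizer("\n".join(batch["text"]))["input_ids"]
    blocks = [ids[i : i + 512] for i in range(0, len(ids) - 511, 512)]
    # For causal LM training, labels mirror the inputs (shifting happens in the model)
    return {"input_ids": blocks, "labels": [list(b) for b in blocks]}

train_dataset = raw["train"].map(chunk, batched=True, remove_columns=["text"])
```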
### Training Hyperparameters
- Epochs: 10
- Batch Size: 4
- Learning Rate: 0.0003
- Optimizer: AdamW
- Weight Decay: 0.01
- Mixed Precision: FP16 (if available)
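These hyperparameters map roughly onto a `transformers.Trainer` setup like the following. This is a hedged sketch rather than the original training script; `model` and `train_dataset` are assumed from the earlier steps:

```python
import torch
from transformers import Trainer, TrainingArguments

args = TrainingArguments(
    output_dir="medllm-10m",         # hypothetical output directory
    num_train_epochs=10,
    per_device_train_batch_size=4,
    learning_rate=3e-4,
    weight_decay=0.01,
    optim="adamw_torch",             # AdamW optimizer
    fp16=torch.cuda.is_available(),  # FP16 mixed precision if available
)
trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()
```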
### Hardware
- Training Hardware: NVIDIA RTX 3060 (12GB VRAM)
- Framework: PyTorch + Transformers
## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("raihan-js/medllm-10m")
model = AutoModelForCausalLM.from_pretrained("raihan-js/medllm-10m")

# Generate medical text
prompt = "Symptoms of diabetes include"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_length=100,
    do_sample=True,
    temperature=0.7,
    pad_token_id=tokenizer.eos_token_id,
)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
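Alternatively, the same generation can be run through the `text-generation` pipeline; the sampling settings below mirror the example above:

```python
from transformers import pipeline

# One-liner equivalent using the text-generation pipeline
generator = pipeline("text-generation", model="raihan-js/medllm-10m")
result = generator(
    "Symptoms of diabetes include",
    max_length=100,
    do_sample=True,
    temperature=0.7,
)
print(result[0]["generated_text"])
```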
## Model Performance
This is an early-stage model trained on limited data. Current capabilities include:
- Basic medical terminology understanding
- Simple text completion in medical contexts
- Educational content generation
**Known Limitations:**
- May generate incoherent or medically inaccurate text
- Requires significant additional training for production use
- Should not be used for medical advice or diagnosis
## Intended Use Cases

### ✅ Appropriate Uses
- Educational demonstrations of medical language models
- Research into medical NLP applications
- Text completion for medical writing assistance (with human review)
- Learning and experimentation with transformer models
### ❌ Inappropriate Uses
- Medical diagnosis or treatment recommendations
- Clinical decision-making
- Patient care without human oversight
- Emergency medical situations
- Replacement for professional medical advice
## Ethical Considerations

### Medical Disclaimer

> ⚠️ **CRITICAL WARNING:** This model is NOT intended for medical use. Always consult qualified healthcare professionals for medical advice, diagnosis, or treatment.
### Limitations and Biases
- Training data may contain biases present in medical literature
- Model may reflect historical or cultural biases in healthcare
- Performance varies significantly across different medical specialties
- May generate plausible but medically incorrect information
## Development Status

This is an experimental model in early development. Planned improvements include:
- Expanded training dataset
- Longer training duration
- Better medical accuracy evaluation
- Safety filtering and alignment
- Domain-specific fine-tuning
## Citation

```bibtex
@misc{medllm2024,
  title={MedLLM: A Lightweight Medical Language Model},
  author={Raihan},
  year={2024},
  publisher={HuggingFace},
  url={https://huggingface.co/raihan-js/medllm-10m}
}
```
## Contact
For questions about this model, please open an issue in the model repository.
- Last Updated: December 2024
- Model Version: 1.0-alpha
- Status: Experimental - Not for production use