---
license: apache-2.0
language: en
tags:
- medical
- healthcare
- gpt
- text-generation
- clinical
- biology
- medicine
datasets:
- medical-literature
- pubmed
widget:
- text: "Symptoms of diabetes include"
  example_title: "Medical Symptoms"
- text: "Treatment for hypertension involves"
  example_title: "Medical Treatment"
- text: "The patient presents with chest pain and"
  example_title: "Clinical Note"
- text: "Question: What is high blood pressure? Answer:"
  example_title: "Medical Q&A"
pipeline_tag: text-generation
---
# MedLLM-10M: Medical Language Model
## Model Description
MedLLM-10M is a lightweight GPT-style language model specifically trained on medical literature and clinical text. This model is designed for educational and research purposes in the medical domain.
⚠️ **Important Disclaimer**: This model is for research and educational purposes only. It should never be used for actual medical diagnosis, treatment recommendations, or clinical decision-making without proper medical supervision.
## Model Details
- **Model Type**: Causal Language Model (GPT-style)
- **Parameters**: ~27.7M
- **Architecture**: Transformer decoder
- **Training Data**: Medical literature, PubMed abstracts, clinical guidelines
- **Vocabulary Size**: 5,000
- **Context Length**: 512 tokens
- **License**: Apache 2.0
## Architecture
```
Layers: 8
Hidden Size: 512
Attention Heads: 8
Feed Forward Size: 2048
Dropout: 0.1
Activation: gelu
```
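As a rough sketch, these numbers map onto a GPT-2-style decoder configuration in `transformers` as shown below. This assumes a GPT-2-style implementation; the config file shipped in the repository is authoritative.

```python
from transformers import GPT2Config, GPT2LMHeadModel

# Hypothetical config mirroring the architecture table above
# (GPT-2-style decoder assumed, not the repository's actual config).
config = GPT2Config(
    vocab_size=5000,             # Vocabulary Size
    n_positions=512,             # Context Length
    n_embd=512,                  # Hidden Size
    n_layer=8,                   # Layers
    n_head=8,                    # Attention Heads
    n_inner=2048,                # Feed Forward Size
    activation_function="gelu",  # Activation
    resid_pdrop=0.1,             # Dropout
    embd_pdrop=0.1,
    attn_pdrop=0.1,
)
model = GPT2LMHeadModel(config)

# Comes out close to the ~27.7M reported above.
print(f"{sum(p.numel() for p in model.parameters()):,} parameters")
```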
## Training Details
The model was trained on a curated dataset of medical literature including:
- PubMed abstracts and research papers
- Medical journal articles
- Clinical practice guidelines
- Medical Q&A datasets
- Healthcare websites (Mayo Clinic, WebMD, etc.)
### Training Hyperparameters
- **Epochs**: 10
- **Batch Size**: 4
- **Learning Rate**: 0.0003
- **Optimizer**: AdamW
- **Weight Decay**: 0.01
- **Mixed Precision**: FP16 (if available)
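A minimal sketch of how these hyperparameters could be wired into a `transformers` Trainer run. The output path is hypothetical, `model` refers to the GPT-2-style model from the architecture sketch above, and a tiny random dataset stands in for the actual medical corpus:

```python
import torch
from transformers import Trainer, TrainingArguments

# Dummy causal-LM dataset: random token IDs as both inputs and labels.
train_dataset = [
    {"input_ids": ids, "labels": ids}
    for ids in (torch.randint(0, 5000, (512,)) for _ in range(8))
]

args = TrainingArguments(
    output_dir="medllm-10m-checkpoints",  # hypothetical path
    num_train_epochs=10,
    per_device_train_batch_size=4,
    learning_rate=3e-4,
    weight_decay=0.01,                    # AdamW is the Trainer default optimizer
    fp16=torch.cuda.is_available(),       # FP16 mixed precision if a GPU is present
)
Trainer(model=model, args=args, train_dataset=train_dataset).train()
```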
### Hardware
- **Training Hardware**: NVIDIA RTX 3060 (12GB VRAM)
- **Framework**: PyTorch + Transformers
## Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("raihan-js/medllm-10m")
model = AutoModelForCausalLM.from_pretrained("raihan-js/medllm-10m")
# Generate medical text
prompt = "Symptoms of diabetes include"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_length=100,
    do_sample=True,
    temperature=0.7,
    pad_token_id=tokenizer.eos_token_id,
)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
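The model can also be driven through the high-level `pipeline` API; the prompt below reuses the Q&A format from the widget examples in the card header:

```python
from transformers import pipeline

generator = pipeline("text-generation", model="raihan-js/medllm-10m")

# Prompt format taken from the "Medical Q&A" widget example above.
result = generator(
    "Question: What is high blood pressure? Answer:",
    max_length=100,
    do_sample=True,
    temperature=0.7,
)
print(result[0]["generated_text"])
```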
## Model Performance
This is an early-stage model trained on limited data. Current capabilities include:
- Basic medical terminology understanding
- Simple text completion in medical contexts
- Educational content generation
**Known Limitations**:
- May generate incoherent or medically inaccurate text
- Requires significant additional training for production use
- Should not be used for medical advice or diagnosis
## Intended Use Cases
### ✅ Appropriate Uses
- Educational demonstrations of medical language models
- Research into medical NLP applications
- Text completion for medical writing assistance (with human review)
- Learning and experimentation with transformer models
### ❌ Inappropriate Uses
- **Medical diagnosis or treatment recommendations**
- **Clinical decision-making**
- **Patient care without human oversight**
- **Emergency medical situations**
- **Replacement for professional medical advice**
## Ethical Considerations
### Medical Disclaimer
⚠️ **CRITICAL WARNING**: This model is NOT intended for medical use. Always consult qualified healthcare professionals for medical advice, diagnosis, or treatment.
### Limitations and Biases
- Training data may contain biases present in medical literature
- Model may reflect historical or cultural biases in healthcare
- Performance varies significantly across different medical specialties
- May generate plausible but medically incorrect information
## Development Status
This is an **experimental model** in early development. Planned improvements include:
- Expanded training dataset
- Longer training duration
- Better medical accuracy evaluation
- Safety filtering and alignment
- Domain-specific fine-tuning
## Citation
```bibtex
@misc{medllm2024,
  title={MedLLM: A Lightweight Medical Language Model},
  author={Raihan},
  year={2024},
  publisher={HuggingFace},
  url={https://huggingface.co/raihan-js/medllm-10m}
}
```
## Contact
For questions about this model, please open an issue in the model repository.
---
**Last Updated**: December 2024
**Model Version**: 1.0-alpha
**Status**: Experimental - Not for production use