MedLLM-10M: Medical Language Model

Model Description

MedLLM-10M is a lightweight GPT-style language model specifically trained on medical literature and clinical text. This model is designed for educational and research purposes in the medical domain.

⚠️ Important Disclaimer: This model is for research and educational purposes only. It should never be used for actual medical diagnosis, treatment recommendations, or clinical decision-making without proper medical supervision.

Model Details

  • Model Type: Causal Language Model (GPT-style)
  • Parameters: ~27.7M
  • Architecture: Transformer decoder
  • Training Data: Medical literature, PubMed abstracts, clinical guidelines
  • Vocabulary Size: 5,000
  • Context Length: 512 tokens
  • License: Apache 2.0
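
The full configuration ships with the checkpoint and can be inspected without downloading the weights; the snippet below assumes only the standard Transformers config API:

from transformers import AutoConfig

# Fetch and print the checkpoint's configuration (layer count,
# hidden size, vocabulary size, etc.) without loading the weights.
config = AutoConfig.from_pretrained("raihan-js/medllm-10m")
print(config)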

Architecture

  • Layers: 8
  • Hidden Size: 512
  • Attention Heads: 8
  • Feed-Forward Size: 2048
  • Dropout: 0.1
  • Activation: GELU
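
As a sanity check, these dimensions roughly reproduce the reported parameter count. A back-of-the-envelope estimate, ignoring biases and layer norms and assuming tied input/output embeddings:

# Rough parameter estimate from the architecture above
vocab, ctx, d, ff, layers = 5000, 512, 512, 2048, 8
embeddings = vocab * d + ctx * d            # token + position embeddings
attention = 4 * d * d                       # Q, K, V, and output projections
feedforward = 2 * d * ff                    # up- and down-projections
total = embeddings + layers * (attention + feedforward)
print(f"{total / 1e6:.1f}M parameters")     # ~28.0M, close to the reported ~27.7M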

Training Details

The model was trained on a curated dataset of medical literature including:

  • PubMed abstracts and research papers
  • Medical journal articles
  • Clinical practice guidelines
  • Medical Q&A datasets
  • Healthcare websites (Mayo Clinic, WebMD, etc.)

Training Hyperparameters

  • Epochs: 10
  • Batch Size: 4
  • Learning Rate: 0.0003
  • Optimizer: AdamW
  • Weight Decay: 0.01
  • Mixed Precision: FP16 (if available)
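
The training script itself is not published; the sketch below is a hypothetical mapping of these hyperparameters onto Transformers TrainingArguments, not the actual setup used:

from transformers import TrainingArguments

# Hypothetical reconstruction from the hyperparameters above;
# the actual training script may have differed.
args = TrainingArguments(
    output_dir="medllm-10m",            # placeholder output path
    num_train_epochs=10,
    per_device_train_batch_size=4,
    learning_rate=3e-4,
    weight_decay=0.01,                  # AdamW is the Trainer default
    fp16=True,                          # mixed precision, when a GPU supports it
)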

Hardware

  • Training Hardware: NVIDIA RTX 3060 (12GB VRAM)
  • Framework: PyTorch + Transformers

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("raihan-js/medllm-10m")
model = AutoModelForCausalLM.from_pretrained("raihan-js/medllm-10m")

# Generate medical text
prompt = "Symptoms of diabetes include"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=100,                 # cap the number of newly generated tokens
    do_sample=True,
    temperature=0.7,
    pad_token_id=tokenizer.eos_token_id,
)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
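
The same flow also works through the high-level pipeline API (assuming the checkpoint loads via AutoModelForCausalLM as above):

from transformers import pipeline

# Equivalent generation via the text-generation pipeline.
generator = pipeline("text-generation", model="raihan-js/medllm-10m")
result = generator(
    "Symptoms of diabetes include",
    max_new_tokens=80,
    do_sample=True,
    temperature=0.7,
)
print(result[0]["generated_text"])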

Model Performance

This is an early-stage model trained on limited data. Current capabilities include:

  • Basic medical terminology understanding
  • Simple text completion in medical contexts
  • Educational content generation

Known Limitations:

  • May generate incoherent or medically inaccurate text
  • Requires significant additional training for production use
  • Should not be used for medical advice or diagnosis

Intended Use Cases

✅ Appropriate Uses

  • Educational demonstrations of medical language models
  • Research into medical NLP applications
  • Text completion for medical writing assistance (with human review)
  • Learning and experimentation with transformer models

❌ Inappropriate Uses

  • Medical diagnosis or treatment recommendations
  • Clinical decision-making
  • Patient care without human oversight
  • Emergency medical situations
  • Replacement for professional medical advice

Ethical Considerations

Medical Disclaimer

⚠️ CRITICAL WARNING: This model is NOT intended for medical use. Always consult qualified healthcare professionals for medical advice, diagnosis, or treatment.

Limitations and Biases

  • Training data may contain biases present in medical literature
  • Model may reflect historical or cultural biases in healthcare
  • Performance varies significantly across different medical specialties
  • May generate plausible but medically incorrect information

Development Status

This is an experimental model in early development. Planned improvements include:

  • Expanded training dataset
  • Longer training duration
  • Better medical accuracy evaluation
  • Safety filtering and alignment
  • Domain-specific fine-tuning

Citation

@misc{medllm2024,
  title={MedLLM: A Lightweight Medical Language Model},
  author={Raihan},
  year={2024},
  publisher={HuggingFace},
  url={https://huggingface.co/raihan-js/medllm-10m}
}

Contact

For questions about this model, please open an issue in the model repository.


Last Updated: December 2024
Model Version: 1.0-alpha
Status: Experimental - Not for production use
