---

license: apache-2.0
language: en
tags:
- medical
- healthcare
- gpt
- text-generation
- clinical
- biology
- medicine
datasets:
- medical-literature
- pubmed
widget:
- text: "Symptoms of diabetes include"
  example_title: "Medical Symptoms"
- text: "Treatment for hypertension involves"
  example_title: "Medical Treatment"
- text: "The patient presents with chest pain and"
  example_title: "Clinical Note"
- text: "Question: What is high blood pressure? Answer:"
  example_title: "Medical Q&A"
pipeline_tag: text-generation
---


# MedLLM-10M: Medical Language Model

## Model Description

MedLLM-10M is a lightweight GPT-style language model specifically trained on medical literature and clinical text. This model is designed for educational and research purposes in the medical domain.

⚠️ **Important Disclaimer**: This model is for research and educational purposes only. It should never be used for actual medical diagnosis, treatment recommendations, or clinical decision-making without proper medical supervision.

## Model Details

- **Model Type**: Causal Language Model (GPT-style)
- **Parameters**: ~27.7M
- **Architecture**: Transformer decoder
- **Training Data**: Medical literature, PubMed abstracts, clinical guidelines
- **Vocabulary Size**: 5,000
- **Context Length**: 512 tokens
- **License**: Apache 2.0

## Architecture

```
Layers: 8
Hidden Size: 512
Attention Heads: 8
Feed Forward Size: 2048
Dropout: 0.1
Activation: gelu
```
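
As a rough sanity check, the parameter count implied by these dimensions can be estimated in a few lines. This is a weights-only back-of-the-envelope estimate (biases and layer norms ignored); the exact figure depends on details such as embedding tying, which is why it lands near but not exactly on the stated ~27.7M:

```python
# Back-of-the-envelope parameter estimate for the architecture above
# (weight matrices only; biases and layer norms ignored).
vocab, ctx, d, layers, ff = 5000, 512, 512, 8, 2048

embed = vocab * d    # token embedding (often tied with the LM head)
pos = ctx * d        # learned positional embedding
attn = 4 * d * d     # Q, K, V and output projections, per layer
mlp = 2 * d * ff     # up- and down-projection, per layer
total = embed + pos + layers * (attn + mlp)

print(f"~{total / 1e6:.1f}M parameters")  # ~28.0M, close to the stated ~27.7M
```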

## Training Details

The model was trained on a curated dataset of medical literature including:
- PubMed abstracts and research papers
- Medical journal articles  
- Clinical practice guidelines
- Medical Q&A datasets
- Healthcare websites (Mayo Clinic, WebMD, etc.)

### Training Hyperparameters

- **Epochs**: 10
- **Batch Size**: 4
- **Learning Rate**: 0.0003
- **Optimizer**: AdamW
- **Weight Decay**: 0.01
- **Mixed Precision**: FP16 (if available)
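
For reference, a hypothetical Hugging Face `TrainingArguments` setup mirroring these settings might look like the sketch below. The `output_dir` and any argument not listed above are illustrative assumptions; the original training script is not published:

```python
from transformers import TrainingArguments

# Illustrative configuration only; values mirror the hyperparameters above.
args = TrainingArguments(
    output_dir="medllm-10m-checkpoints",  # hypothetical path
    num_train_epochs=10,
    per_device_train_batch_size=4,
    learning_rate=3e-4,   # 0.0003
    weight_decay=0.01,    # Trainer uses an AdamW variant by default
    fp16=True,            # mixed precision; enable only on a compatible GPU
)
```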

### Hardware

- **Training Hardware**: NVIDIA RTX 3060 (12GB VRAM)
- **Framework**: PyTorch + Transformers

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("raihan-js/medllm-10m")
model = AutoModelForCausalLM.from_pretrained("raihan-js/medllm-10m")

# Generate medical text
prompt = "Symptoms of diabetes include"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_length=100,
    do_sample=True,
    temperature=0.7,
    pad_token_id=tokenizer.eos_token_id,
)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

## Model Performance

This is an early-stage model trained on limited data. Current capabilities include:
- Basic medical terminology understanding
- Simple text completion in medical contexts
- Educational content generation

**Known Limitations**:
- May generate incoherent or medically inaccurate text
- Requires significant additional training for production use
- Should not be used for medical advice or diagnosis

## Intended Use Cases

### ✅ Appropriate Uses
- Educational demonstrations of medical language models
- Research into medical NLP applications
- Text completion for medical writing assistance (with human review)
- Learning and experimentation with transformer models

### ❌ Inappropriate Uses
- **Medical diagnosis or treatment recommendations**
- **Clinical decision-making**
- **Patient care without human oversight**
- **Emergency medical situations**
- **Replacement for professional medical advice**

## Ethical Considerations

### Medical Disclaimer
⚠️ **CRITICAL WARNING**: This model is NOT intended for medical use. Always consult qualified healthcare professionals for medical advice, diagnosis, or treatment.

### Limitations and Biases
- Training data may contain biases present in medical literature
- Model may reflect historical or cultural biases in healthcare
- Performance varies significantly across different medical specialties
- May generate plausible but medically incorrect information

## Development Status

This is an **experimental model** in early development. Future improvements planned:
- Expanded training dataset
- Longer training duration  
- Better medical accuracy evaluation
- Safety filtering and alignment
- Domain-specific fine-tuning

## Citation

```bibtex
@misc{medllm2024,
  title={MedLLM: A Lightweight Medical Language Model},
  author={Raihan},
  year={2024},
  publisher={HuggingFace},
  url={https://huggingface.co/raihan-js/medllm-10m}
}
```

## Contact

For questions about this model, please open an issue in the model repository.

---

**Last Updated**: December 2024  
**Model Version**: 1.0-alpha  
**Status**: Experimental - Not for production use