---

license: apache-2.0
language: en
tags:
- medical
- healthcare
- gpt
- text-generation
- clinical
- biology
- medicine
datasets:
- medical-literature
- pubmed
widget:
- text: "Symptoms of diabetes include"
  example_title: "Medical Symptoms"
- text: "Treatment for hypertension involves"
  example_title: "Medical Treatment"
- text: "The patient presents with chest pain and"
  example_title: "Clinical Note"
- text: "Question: What is high blood pressure? Answer:"
  example_title: "Medical Q&A"
pipeline_tag: text-generation
---


# MedLLM-10M: Medical Language Model

## Model Description

MedLLM-10M is a lightweight GPT-style language model specifically trained on medical literature and clinical text. This model is designed for educational and research purposes in the medical domain.

⚠️ **Important Disclaimer**: This model is for research and educational purposes only. It should never be used for actual medical diagnosis, treatment recommendations, or clinical decision-making without proper medical supervision.

## Model Details

- **Model Type**: Causal Language Model (GPT-style)
- **Parameters**: ~27.7M
- **Architecture**: Transformer decoder
- **Training Data**: Medical literature, PubMed abstracts, clinical guidelines
- **Vocabulary Size**: 5,000
- **Context Length**: 512 tokens
- **License**: Apache 2.0

## Architecture

```
Layers: 8
Hidden Size: 512
Attention Heads: 8
Feed Forward Size: 2048
Dropout: 0.1
Activation: gelu
```
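
As a rough sanity check, the parameter count implied by these dimensions can be estimated in a few lines. This is a weights-only back-of-the-envelope estimate (biases and layer norms ignored); the exact figure depends on details such as embedding tying, which is why it lands near but not exactly on the stated ~27.7M:

```python
# Back-of-the-envelope parameter estimate for the architecture above
# (weight matrices only; biases and layer norms ignored).
vocab, ctx, d, layers, ff = 5000, 512, 512, 8, 2048

embed = vocab * d    # token embedding (often tied with the LM head)
pos = ctx * d        # learned positional embedding
attn = 4 * d * d     # Q, K, V and output projections, per layer
mlp = 2 * d * ff     # up- and down-projection, per layer
total = embed + pos + layers * (attn + mlp)

print(f"~{total / 1e6:.1f}M parameters")  # ~28.0M, close to the stated ~27.7M
```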

## Training Details

The model was trained on a curated dataset of medical literature including:
- PubMed abstracts and research papers
- Medical journal articles  
- Clinical practice guidelines
- Medical Q&A datasets
- Healthcare websites (Mayo Clinic, WebMD, etc.)

### Training Hyperparameters

- **Epochs**: 10
- **Batch Size**: 4
- **Learning Rate**: 0.0003
- **Optimizer**: AdamW
- **Weight Decay**: 0.01
- **Mixed Precision**: FP16 (if available)
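
For reference, a hypothetical Hugging Face `TrainingArguments` setup mirroring these settings might look like the sketch below. The `output_dir` and any argument not listed above are illustrative assumptions; the original training script is not published:

```python
from transformers import TrainingArguments

# Illustrative configuration only; values mirror the hyperparameters above.
args = TrainingArguments(
    output_dir="medllm-10m-checkpoints",  # hypothetical path
    num_train_epochs=10,
    per_device_train_batch_size=4,
    learning_rate=3e-4,   # 0.0003
    weight_decay=0.01,    # Trainer uses an AdamW variant by default
    fp16=True,            # mixed precision; enable only on a compatible GPU
)
```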

### Hardware

- **Training Hardware**: NVIDIA RTX 3060 (12GB VRAM)
- **Framework**: PyTorch + Transformers

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("raihan-js/medllm-10m")
model = AutoModelForCausalLM.from_pretrained("raihan-js/medllm-10m")

# Generate medical text
prompt = "Symptoms of diabetes include"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_length=100,
    do_sample=True,
    temperature=0.7,
    pad_token_id=tokenizer.eos_token_id,
)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

## Model Performance

This is an early-stage model trained on limited data. Current capabilities include:
- Basic medical terminology understanding
- Simple text completion in medical contexts
- Educational content generation

**Known Limitations**:
- May generate incoherent or medically inaccurate text
- Requires significant additional training for production use
- Should not be used for medical advice or diagnosis

## Intended Use Cases

### ✅ Appropriate Uses
- Educational demonstrations of medical language models
- Research into medical NLP applications
- Text completion for medical writing assistance (with human review)
- Learning and experimentation with transformer models

### ❌ Inappropriate Uses
- **Medical diagnosis or treatment recommendations**
- **Clinical decision-making**
- **Patient care without human oversight**
- **Emergency medical situations**
- **Replacement for professional medical advice**

## Ethical Considerations

### Medical Disclaimer
⚠️ **CRITICAL WARNING**: This model is NOT intended for medical use. Always consult qualified healthcare professionals for medical advice, diagnosis, or treatment.

### Limitations and Biases
- Training data may contain biases present in medical literature
- Model may reflect historical or cultural biases in healthcare
- Performance varies significantly across different medical specialties
- May generate plausible but medically incorrect information

## Development Status

This is an **experimental model** in early development. Future improvements planned:
- Expanded training dataset
- Longer training duration  
- Better medical accuracy evaluation
- Safety filtering and alignment
- Domain-specific fine-tuning

## Citation

```bibtex
@misc{medllm2024,
  title={MedLLM: A Lightweight Medical Language Model},
  author={Raihan},
  year={2024},
  publisher={HuggingFace},
  url={https://huggingface.co/raihan-js/medllm-10m}
}
```

## Contact

For questions about this model, please open an issue in the model repository.

---

**Last Updated**: December 2024  
**Model Version**: 1.0-alpha  
**Status**: Experimental - Not for production use