Commit 098730b
Parent(s): 1919884

Fix: Rename to Multi-Head Latent Attention

Files changed:
- README.md (+3 -3)
- insights/architecture.md (+1 -1)
README.md CHANGED

@@ -13,7 +13,7 @@ license: mit
 
 # DeepSeek Multi-Latent Attention
 
-This repository provides a PyTorch implementation of the Multi-Latent Attention (MLA) mechanism introduced in the DeepSeek-V2 paper. **This is not a trained model, but rather a modular attention implementation** that significantly reduces KV cache for efficient inference while maintaining model performance through its innovative architecture. It can be used as a drop-in attention module in transformer architectures.
+This repository provides a PyTorch implementation of the Multi-Head Latent Attention (MLA) mechanism introduced in the DeepSeek-V2 paper. **This is not a trained model, but rather a modular attention implementation** that significantly reduces KV cache for efficient inference while maintaining model performance through its innovative architecture. It can be used as a drop-in attention module in transformer architectures.
 
 ## Key Features
 
@@ -33,10 +33,10 @@ Or download directly from the HuggingFace repository page.
 
 ```python
 import torch
-from src.mla import
+from src.mla import MultiHeadLatentAttention
 
 # Initialize MLA
-mla =
+mla = MultiHeadLatentAttention(
     d_model=512,   # Model dimension
     num_head=8,    # Number of attention heads
     d_embed=512,   # Embedding dimension
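For context on the mechanism the README paragraph above describes, here is a minimal from-scratch sketch of the core MLA idea: keys and values are down-projected into a small shared latent per token, only that latent is cached, and K/V are re-expanded from it at attention time, which is why the KV cache shrinks. This is an illustrative assumption, not this repository's `MultiHeadLatentAttention` API; the class name `LatentKVCompression`, the parameter `d_latent`, the projection names, and the cache handling are hypothetical, and DeepSeek-V2 details such as decoupled RoPE, query compression, and causal masking are omitted.

```python
import torch
import torch.nn as nn

class LatentKVCompression(nn.Module):
    """Toy illustration of latent KV compression (not the repo's src.mla API)."""

    def __init__(self, d_model=512, num_head=8, d_latent=64):
        super().__init__()
        self.num_head = num_head
        self.d_head = d_model // num_head
        self.q_proj = nn.Linear(d_model, d_model)
        self.kv_down = nn.Linear(d_model, d_latent)  # compress token -> small latent
        self.k_up = nn.Linear(d_latent, d_model)     # expand latent -> keys
        self.v_up = nn.Linear(d_latent, d_model)     # expand latent -> values
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x, latent_cache=None):
        B, T, _ = x.shape
        latent = self.kv_down(x)                     # (B, T, d_latent); this is all that is cached
        if latent_cache is not None:
            latent = torch.cat([latent_cache, latent], dim=1)
        q = self.q_proj(x).view(B, T, self.num_head, self.d_head).transpose(1, 2)
        k = self.k_up(latent).view(B, -1, self.num_head, self.d_head).transpose(1, 2)
        v = self.v_up(latent).view(B, -1, self.num_head, self.d_head).transpose(1, 2)
        attn = torch.softmax(q @ k.transpose(-2, -1) / self.d_head ** 0.5, dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(B, T, -1)
        return self.out_proj(out), latent            # return the latent as the new cache

# Usage sketch: the cache holds d_latent floats per token instead of full per-head K and V.
x = torch.randn(2, 4, 512)        # (batch, seq, d_model)
block = LatentKVCompression()
y, cache = block(x)               # cache shape: (2, 4, 64)
```
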
insights/architecture.md CHANGED

@@ -1,4 +1,4 @@
-# Advanced Insights: Multi-Latent Attention Architecture
+# Advanced Insights: Multi-Head Latent Attention Architecture
 
 ## Key Architectural Innovations
 