Add files using upload-large-folder tool
- README.md +74 -0
- config.json +41 -0
- diffusion_pytorch_model.safetensors +3 -0
- transformer_blocks_0.safetensors +3 -0
- transformer_blocks_1.safetensors +3 -0
- transformer_blocks_10.safetensors +3 -0
- transformer_blocks_11.safetensors +3 -0
- transformer_blocks_12.safetensors +3 -0
- transformer_blocks_13.safetensors +3 -0
- transformer_blocks_14.safetensors +3 -0
- transformer_blocks_15.safetensors +3 -0
- transformer_blocks_16.safetensors +3 -0
- transformer_blocks_17.safetensors +3 -0
- transformer_blocks_18.safetensors +3 -0
- transformer_blocks_19.safetensors +3 -0
- transformer_blocks_2.safetensors +3 -0
- transformer_blocks_20.safetensors +3 -0
- transformer_blocks_21.safetensors +3 -0
- transformer_blocks_22.safetensors +3 -0
- transformer_blocks_23.safetensors +3 -0
- transformer_blocks_24.safetensors +3 -0
- transformer_blocks_25.safetensors +3 -0
- transformer_blocks_26.safetensors +3 -0
- transformer_blocks_27.safetensors +3 -0
- transformer_blocks_28.safetensors +3 -0
- transformer_blocks_29.safetensors +3 -0
- transformer_blocks_3.safetensors +3 -0
- transformer_blocks_30.safetensors +3 -0
- transformer_blocks_31.safetensors +3 -0
- transformer_blocks_32.safetensors +3 -0
- transformer_blocks_33.safetensors +3 -0
- transformer_blocks_34.safetensors +3 -0
- transformer_blocks_35.safetensors +3 -0
- transformer_blocks_36.safetensors +3 -0
- transformer_blocks_37.safetensors +3 -0
- transformer_blocks_4.safetensors +3 -0
- transformer_blocks_5.safetensors +3 -0
- transformer_blocks_6.safetensors +3 -0
- transformer_blocks_7.safetensors +3 -0
- transformer_blocks_8.safetensors +3 -0
- transformer_blocks_9.safetensors +3 -0
README.md
ADDED
@@ -0,0 +1,74 @@
---
base_model:
- stabilityai/stable-diffusion-3.5-large
base_model_relation: quantized
pipeline_tag: text-to-image
tags:
- dfloat11
- df11
- lossless compression
- 70% size, 100% accuracy
---

## DFloat11 Compressed Model: `stabilityai/stable-diffusion-3.5-large`

This is a **losslessly compressed** version of [`stabilityai/stable-diffusion-3.5-large`](https://huggingface.co/stabilityai/stable-diffusion-3.5-large) using our custom **DFloat11** format.

### 💡 Key Benefits

* ✅ **Bit-for-bit identical outputs** to the original BFloat16 model
* 📉 **\~30% reduction in model size** (from **16GB** → **11.3GB**)
* 🧠 **Lower memory requirements**: now runs on **16GB GPUs**
* ⚡ **Minimal performance overhead**: barely any slower than the full model

DFloat11 compresses the model weights while preserving full numerical precision. This allows you to run `stabilityai/stable-diffusion-3.5-large` on more accessible hardware, with **no compromise in output quality**.

### 🔍 How It Works

DFloat11 compresses model weights using **Huffman coding** of BFloat16 exponent bits, combined with **hardware-aware algorithmic designs** that enable efficient on-the-fly decompression directly on the GPU. During inference, the weights remain compressed in GPU memory and are **decompressed just before matrix multiplications**, then **immediately discarded after use** to minimize memory footprint.

Advantages:
* **Fully GPU-based**: no CPU decompression or host-device data transfer.
* DFloat11 is **much faster than CPU-offloading approaches**, enabling practical deployment in memory-constrained environments.
* The compression is **fully lossless**, guaranteeing that the model’s outputs are **bit-for-bit identical** to those of the original model.
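
To illustrate why coding only the exponent field can bring a 16-bit weight down to roughly 70% of its size, here is a minimal CPU-only sketch. It is not the DFloat11 implementation: it assumes synthetic Gaussian weights (the standard deviation of 0.02 is an arbitrary choice), truncates them to BFloat16 instead of rounding, and only measures Huffman code lengths for the 8-bit exponent while leaving the sign bit and 7 mantissa bits uncompressed. The helper names `bf16_exponent` and `huffman_code_lengths` are hypothetical.

```python
import heapq
import random
import struct
from collections import Counter


def bf16_exponent(x: float) -> int:
    """Return the 8-bit exponent field of x after truncation to BFloat16."""
    bits32 = struct.unpack(">I", struct.pack(">f", x))[0]
    bits16 = bits32 >> 16          # BFloat16 is the top 16 bits of float32
    return (bits16 >> 7) & 0xFF    # drop the 7 mantissa bits, mask off the sign


def huffman_code_lengths(freqs):
    """Map each symbol to the length (in bits) of its Huffman code."""
    if len(freqs) == 1:
        return {symbol: 1 for symbol in freqs}
    # Heap entries: (count, tie_breaker, symbols_in_subtree)
    heap = [(count, i, [symbol]) for i, (symbol, count) in enumerate(freqs.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freqs, 0)
    tie_breaker = len(heap)
    while len(heap) > 1:
        count_a, _, symbols_a = heapq.heappop(heap)
        count_b, _, symbols_b = heapq.heappop(heap)
        # Every merge pushes the merged symbols one level deeper in the tree.
        for symbol in symbols_a + symbols_b:
            lengths[symbol] += 1
        heapq.heappush(heap, (count_a + count_b, tie_breaker, symbols_a + symbols_b))
        tie_breaker += 1
    return lengths


random.seed(0)
# Assumption: synthetic weights stand in for real model weights.
weights = [random.gauss(0.0, 0.02) for _ in range(50_000)]
freqs = Counter(bf16_exponent(w) for w in weights)
lengths = huffman_code_lengths(freqs)

total = sum(freqs.values())
avg_exponent_bits = sum(freqs[s] * lengths[s] for s in freqs) / total
bits_per_weight = 1 + 7 + avg_exponent_bits  # sign + mantissa + coded exponent

print(f"distinct exponents: {len(freqs)}")
print(f"average bits per weight: {bits_per_weight:.2f} (BFloat16 uses 16)")
```

Because the exponents of typical weights cluster in a narrow range, only a few of the 256 possible exponent values occur often, which is what makes an entropy code effective here.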

### 🔧 How to Use

1. Install or upgrade the DFloat11 pip package *(installs the CUDA kernel automatically; requires a CUDA-compatible GPU and PyTorch installed)*:

```bash
pip install -U dfloat11[cuda12]
# or if you have CUDA version 11:
# pip install -U dfloat11[cuda11]
```

2. Install or upgrade the diffusers package.

```bash
pip install -U diffusers
```

3. To use the DFloat11 model, run the following example code in Python:
```python
import torch
from diffusers import StableDiffusion3Pipeline
from dfloat11 import DFloat11Model

pipe = StableDiffusion3Pipeline.from_pretrained("stabilityai/stable-diffusion-3.5-large", torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()

DFloat11Model.from_pretrained('DFloat11/stable-diffusion-3.5-large-DF11', device='cpu', bfloat16_model=pipe.transformer)

image = pipe(
    "A capybara holding a sign that reads Hello World",
    num_inference_steps=28,
    guidance_scale=3.5,
).images[0]
image.save("capybara.png")
```

### 📄 Learn More

* **Paper**: [70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float](https://arxiv.org/abs/2504.11651)
* **GitHub**: [https://github.com/LeanModels/DFloat11](https://github.com/LeanModels/DFloat11)
* **HuggingFace**: [https://huggingface.co/DFloat11](https://huggingface.co/DFloat11)
config.json
ADDED
@@ -0,0 +1,41 @@
{
  "dfloat11_config": {
    "bytes_per_thread": 8,
    "pattern_dict": {
      "transformer_blocks\\.([0-9]|[1-2][0-9]|3[0-6])": [
        "norm1.linear",
        "norm1_context.linear",
        "attn.to_q",
        "attn.to_k",
        "attn.to_v",
        "attn.add_k_proj",
        "attn.add_v_proj",
        "attn.add_q_proj",
        "attn.to_out.0",
        "attn.to_add_out",
        "ff.net.0.proj",
        "ff.net.2",
        "ff_context.net.0.proj",
        "ff_context.net.2"
      ],
      "transformer_blocks\\.37": [
        "norm1.linear",
        "norm1_context.linear",
        "attn.to_q",
        "attn.to_k",
        "attn.to_v",
        "attn.add_k_proj",
        "attn.add_v_proj",
        "attn.add_q_proj",
        "attn.to_out.0",
        "ff.net.0.proj",
        "ff.net.2"
      ]
    },
    "threads_per_block": [
      512
    ],
    "version": "0.2.0"
  },
  "model_type": "llama"
}
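
The two `pattern_dict` keys above are regular expressions over module names: the first covers blocks 0 through 36 (14 compressed layers each), the second covers block 37, which lists only 11 layers. The following sketch shows which key a given block name would select. How the DFloat11 loader actually applies these keys is not shown in this commit, so the use of `re.fullmatch` and the helper name `matching_pattern` are assumptions made for illustration.

```python
import re

# The keys from config.json after JSON unescaping ("\\." becomes "\.").
PATTERNS = [
    r"transformer_blocks\.([0-9]|[1-2][0-9]|3[0-6])",
    r"transformer_blocks\.37",
]


def matching_pattern(module_name):
    """Return the first pattern that matches the whole module name, or None."""
    for pattern in PATTERNS:
        # Assumption: whole-name matching. A prefix match would let the first
        # pattern claim "transformer_blocks.37" by matching only "...3".
        if re.fullmatch(pattern, module_name):
            return pattern
    return None


for index in (0, 9, 10, 36, 37, 38):
    name = f"transformer_blocks.{index}"
    print(name, "->", matching_pattern(name))
```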
diffusion_pytorch_model.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0229ed114459974eb79f08b34652c3d5fd7fdd944362f56df3e9187a2f8b2f75
size 258416552
transformer_blocks_0.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5f8c3026a8873ef5131d4b73915da68427e22c71318e444077df5b7bc377b6e7
size 293263365
transformer_blocks_1.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6e2eb493bedb885318a0bec9fe5e4bfeb900cb5711b3f74b2ed3e16a7b6463d8
size 293439926
transformer_blocks_10.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:57563e09f967de64f53092295c7034edce872e5acead5f54687a7adb4c384ee0
size 294118731
transformer_blocks_11.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d8e6974574df12bcb221a7ca4e2194798906e2b568b9a781bd7a8afdd382cad6
size 294224693
transformer_blocks_12.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ebcc0e922c09c28a7c4ea1d8f5d7b60210f9b7cbb1d485c4d3504d36a9bb3250
size 294276922
transformer_blocks_13.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f7b49b4ffe66f02ff0c9225a4c28b8fd8a0001b4458c202d0b3f8d681e684489
size 294047127
transformer_blocks_14.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a30adc4696332bce11a68f3cfc20bc12f11f91aac192f35e7e91236cdfd0bcc0
size 294116040
transformer_blocks_15.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:910ecb8df498ea54de2165dee8eba2eb056e6906ffe6d3dde68de70a9078ec16
size 293107767
transformer_blocks_16.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:bdfd7729442196b7dc755649bd116453d7390af5b10d858561e7a8a527e38075
size 293282594
transformer_blocks_17.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3314d615c869e29358be859f97b639077581f98af1715e2c70ab938c1c3013c1
size 293318114
transformer_blocks_18.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6999a45c7c09652ab7e632d4fdaab5b274892da0b645c428a448c1e823c7b606
size 293466443
transformer_blocks_19.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:edfdad308b7a1fa3898c4980b9de934f5e007a413a901518c86946ba7f1f5d88
size 293558196
transformer_blocks_2.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:aeaa1df2c709d1f93269b836d97dc6c3102825447887f0fe42d05636a0ce4798
size 293159123
transformer_blocks_20.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:34cb196212c03aeac8b2264cc6512564e1365adb97c9a92012927f77ce0a542b
size 292956007
transformer_blocks_21.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c122aa52bc0ff3620f9ffabf2c43fc10783b13cf1860e05dcc8d882223e36d37
size 293102027
transformer_blocks_22.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3ab3b7d3300960ac2c8530ad7df991d0aa09a49873915195de94ce957dd0af9d
size 293056363
transformer_blocks_23.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8634d76d6666c43be749a11572f2d9b86f142b88cd15453d9b5f7a589d9c9d73
size 293014034
transformer_blocks_24.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:94996934cd1200c7d63d0e847c3a7944e6e35ffb09f9b1574cceecda23d3ad12
size 293393784
transformer_blocks_25.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b474b8695b3c0ef3259680027c2727215db139879e8a277db2e3f3939ed2f8c4
size 293805244
transformer_blocks_26.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c366231f23477acf6c4fb6133b6fd74620ddee33cb363f72f20e72f2536f47c2
size 294264662
transformer_blocks_27.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e80bcbade23c0973145f3ca459104fb42ac9d957240916fe9598300879c3cf5e
size 294308470
transformer_blocks_28.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e3d7ea308fc49bc319f20dfbbdc8952a1a377b0dd4e038e297bb83c011f08d84
size 294406188
transformer_blocks_29.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:651ca98314c908e8aa3738292dbfb2167a50f6a17b10a02cb082a5510edaa782
size 294757959
transformer_blocks_3.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:08de4952af5a29560045a2ed2299899a5b65deffd40dc1a631e9535fdb007ac5
size 293286267
transformer_blocks_30.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7068b717bdadf5e9cadc1e529f408e92c0fa98382597ee3ff5f358bbde9cd56a
size 295494479
transformer_blocks_31.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:eabd299a80813c16e8a64d5092b7a2a03f6a87bd3c3bc10470b1e891c3afb63e
size 296107463
transformer_blocks_32.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b1839ebb8fa191aaaae42b08930417b1adea55d9a798b1484e6ef8606c7a4612
size 297059980
transformer_blocks_33.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:eb2bdcc0a8449085a2bab5774165e68bb0781b4bb728cf5ae64f666732594827
size 297456588
transformer_blocks_34.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a8f2984a8c4de8b4d8776cd04db8f5c7b46b2e0cf807d127d347dff0278a2ea1
size 296582891
transformer_blocks_35.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8030712ce6180696d39e4cfbaa2b627c8ac950ef92f033901bb5543151ddcad0
size 296642435
transformer_blocks_36.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a479472a639de331a972f9d4dd37182ff83162de7db47deb29c5f5d9aa95c211
size 296436310
transformer_blocks_37.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:fe232a03a48cd4a5058120b3190ffd48f106dbc76ecca7be6adeba4cea44eb65
size 188271563
transformer_blocks_4.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9c8689819a144276360e387b850f751b8517bb01f6bb28f386ad4a4e3699e027
size 293483265
transformer_blocks_5.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:200f243ef41b4ca4ace02c76603948a35cbd9fce10edeb41ecbaf74e144115d5
size 292641592
transformer_blocks_6.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9edefff5a9f391e25516cdc3399749658f264fc3bb99a953b3f1343ae5b11577
size 291930372
transformer_blocks_7.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c4a030a5e071e8daedfe5bc41a6e0feb14223415d1560d778cd4deeddbc2fe2e
size 291436516
transformer_blocks_8.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:266dafe8fff840cc052955296460599ef6a3d268f4ead0baac957f9cf21243c5
size 293588625
transformer_blocks_9.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a0bdbb13e47daab3b06238889f34084acacc70c560c8cb8967620bc6ec3d42d9
size 293635780
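
Each `.safetensors` entry in this commit is a Git LFS pointer (a `version`, an `oid` and a `size` line) rather than the weights themselves. As a rough sketch of how a downloaded file could be checked against its pointer, the snippet below parses one pointer and compares byte length and SHA-256 digest. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the check is run on a small in-memory stand-in instead of a real weight file.

```python
import hashlib

# Pointer text for diffusion_pytorch_model.safetensors, copied from this commit.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:0229ed114459974eb79f08b34652c3d5fd7fdd944362f56df3e9187a2f8b2f75
size 258416552
"""


def parse_lfs_pointer(text):
    """Split a pointer file into version, hash algorithm, digest and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data: bytes, pointer) -> bool:
    """Check that data has the size and digest recorded in the pointer."""
    if len(data) != pointer["size"]:
        return False
    return hashlib.new(pointer["algo"], data).hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])

# Stand-in payload: build a pointer for a few bytes and verify it round-trips.
payload = b"example payload"
demo_pointer = {
    "version": pointer["version"],
    "algo": "sha256",
    "digest": hashlib.sha256(payload).hexdigest(),
    "size": len(payload),
}
print(matches_pointer(payload, demo_pointer))
```

For multi-gigabyte files, hashing in chunks while streaming from disk would be preferable to loading the whole file into memory as this sketch does.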