# MiniCPM-V-4.5-abliterated-int4
This is a 4-bit quantized version of huihui-ai/Huihui-MiniCPM-V-4_5-abliterated using bitsandbytes NF4 quantization.
## Model Details
- Base Model: huihui-ai/Huihui-MiniCPM-V-4_5-abliterated
- Quantization: 4-bit (NF4) using bitsandbytes
- Model Size: ~6.4 GB (85.8% reduction from original 45.28 GB)
- Compute dtype: float16
- Double quantization: Disabled for better performance
## Quantization Configuration

```json
{
  "load_in_4bit": true,
  "bnb_4bit_compute_dtype": "float16",
  "bnb_4bit_quant_type": "nf4",
  "bnb_4bit_use_double_quant": false,
  "llm_int8_skip_modules": ["out_proj", "kv_proj", "lm_head"],
  "quant_method": "bitsandbytes"
}
```
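For reference, the same settings can be expressed programmatically with `BitsAndBytesConfig` from transformers. The following is a minimal sketch (class and argument names come from the transformers/bitsandbytes integration, not from this repository) and is mainly useful if you want to re-quantize the original checkpoint yourself rather than load the pre-quantized weights published here:

```python
import torch
from transformers import BitsAndBytesConfig

# Equivalent of the JSON config above, expressed as a BitsAndBytesConfig object.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize linear layers to 4 bits
    bnb_4bit_compute_dtype=torch.float16,   # dequantize to fp16 for matmuls
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_use_double_quant=False,        # double quantization disabled
    llm_int8_skip_modules=["out_proj", "kv_proj", "lm_head"],  # modules kept unquantized
)
```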
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the pre-quantized 4-bit model; trust_remote_code is required for the MiniCPM-V architecture.
model = AutoModelForCausalLM.from_pretrained(
    "wavespeed/MiniCPM-V-4_5-abliterated-int4",
    device_map="auto",          # place layers on available devices automatically
    trust_remote_code=True,
    torch_dtype=torch.float16,  # matches the bnb_4bit_compute_dtype
)

tokenizer = AutoTokenizer.from_pretrained(
    "wavespeed/MiniCPM-V-4_5-abliterated-int4",
    trust_remote_code=True,
)
```
## Requirements
- transformers
- bitsandbytes
- torch
- accelerate
## Note on File Size

The quantized model files total roughly 6.4 GB. This is expected for bitsandbytes NF4 quantization, which stores weights in a packed 4-bit format and dequantizes them on the fly during inference. Runtime GPU memory is dominated by these quantized weights plus activations, so it stays far below what the original 45.28 GB full-precision checkpoint would require.
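To verify this on your own hardware, you can check the peak GPU memory allocated after loading the model; a minimal sketch (assuming a single CUDA device):

```python
import torch

# Rough check of peak GPU memory used by the loaded 4-bit weights (single-GPU assumption).
if torch.cuda.is_available():
    peak_gb = torch.cuda.max_memory_allocated() / 1024**3
    print(f"Peak GPU memory allocated: {peak_gb:.2f} GB")
```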
## License

Same as the original model; please refer to the base model's license.
## Acknowledgments
- Original model by huihui-ai
- Quantization approach inspired by openbmb/MiniCPM-V-4_5-int4