
MiniCPM-V-4_5-abliterated-int4

This is a 4-bit quantized version of huihui-ai/Huihui-MiniCPM-V-4_5-abliterated using bitsandbytes NF4 quantization.

Model Details

  • Base Model: huihui-ai/Huihui-MiniCPM-V-4_5-abliterated
  • Quantization: 4-bit (NF4) using bitsandbytes
  • Parameters: 8.7B
  • Model Size: ~6.4 GB (an 85.8% reduction from the original 45.28 GB)
  • Compute dtype: float16
  • Double quantization: Disabled (skips the extra dequantization step, trading a slightly larger file for faster inference)

Quantization Configuration

{
  "load_in_4bit": true,
  "bnb_4bit_compute_dtype": "float16",
  "bnb_4bit_quant_type": "nf4",
  "bnb_4bit_use_double_quant": false,
  "llm_int8_skip_modules": ["out_proj", "kv_proj", "lm_head"],
  "quant_method": "bitsandbytes"
}
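
For reference, the same settings can be expressed with transformers' BitsAndBytesConfig if you want to reproduce the quantization from the base model yourself. This is a sketch, not required for loading this repo (the config above is already embedded in it):

from transformers import AutoModelForCausalLM, BitsAndBytesConfig
import torch

# Mirrors the JSON config above; only needed when re-quantizing
# the base model rather than loading this pre-quantized repo.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=False,
    llm_int8_skip_modules=["out_proj", "kv_proj", "lm_head"],
)

model = AutoModelForCausalLM.from_pretrained(
    "huihui-ai/Huihui-MiniCPM-V-4_5-abliterated",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)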

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# The 4-bit quantization config is stored in the repo, so no
# BitsAndBytesConfig needs to be passed at load time.
model = AutoModelForCausalLM.from_pretrained(
    "wavespeed/MiniCPM-V-4_5-abliterated-int4",
    device_map="auto",          # place layers across available GPUs/CPU
    trust_remote_code=True,     # required for MiniCPM-V's custom model code
    torch_dtype=torch.float16,  # matches the bnb_4bit_compute_dtype above
)

tokenizer = AutoTokenizer.from_pretrained(
    "wavespeed/MiniCPM-V-4_5-abliterated-int4",
    trust_remote_code=True,
)
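
Once loaded, the model's trust_remote_code implementation exposes a chat() helper for multimodal prompts. The exact signature is defined by the repo's custom code and can differ between MiniCPM-V releases; the sketch below assumes the msgs/tokenizer keyword interface used by recent checkpoints, and "example.jpg" is a hypothetical placeholder:

from PIL import Image

# Hypothetical local image; replace with your own file.
image = Image.open("example.jpg").convert("RGB")
msgs = [{"role": "user", "content": [image, "What is in this image?"]}]

# chat() comes from the repo's custom modeling code; its keyword
# arguments may vary between MiniCPM-V versions.
answer = model.chat(msgs=msgs, tokenizer=tokenizer)
print(answer)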

Requirements

  • transformers
  • bitsandbytes
  • torch
  • accelerate
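
All of these are available from PyPI; recent versions should work, for example:

pip install torch transformers accelerate bitsandbytes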

Note on File Size

The model files are ~6.4 GB even though most weights are 4-bit quantized. This is expected for bitsandbytes checkpoints: NF4 weights are stored as packed uint8 tensors together with their per-block quantization state so they can be dequantized on the fly during inference, and the modules listed in llm_int8_skip_modules are kept in higher precision. Actual memory usage at runtime will be significantly lower than the file size suggests.
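
To check the actual weight footprint on your machine, transformers models expose get_memory_footprint(); a quick sketch, assuming the model was loaded as in the Usage section above:

import torch

# `model` is the quantized model loaded in the Usage section.
footprint_gb = model.get_memory_footprint() / 1024**3
print(f"Weight memory footprint: {footprint_gb:.2f} GB")

if torch.cuda.is_available():
    allocated_gb = torch.cuda.memory_allocated() / 1024**3
    print(f"CUDA memory currently allocated: {allocated_gb:.2f} GB")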

License

Same as the original model; please refer to the license of huihui-ai/Huihui-MiniCPM-V-4_5-abliterated.

Acknowledgments

Thanks to huihui-ai for the abliterated base model and to the OpenBMB team for the original MiniCPM-V-4.5.
