---
license: mit
language:
- en
base_model:
- unsloth/phi-4
- microsoft/phi-4
pipeline_tag: text-generation
---
# Phi-4 converted for ExLlamaV2
ExLlamaV2 is an inference library for running local LLMs on modern consumer GPUs.
| Filename | Quant type | File Size | VRAM* |
| --- | --- | --- | --- |
| phi-4_hb8_3bpw | 3.00 bits per weight | 6.66 GB | 10.3 GB |
| phi-4_hb8_4bpw | 4.00 bits per weight | 8.36 GB | 11.9 GB |
| phi-4_hb8_5bpw | 5.00 bits per weight | 10.1 GB | 13.5 GB |
| phi-4_hb8_6bpw | 6.00 bits per weight | 11.8 GB | 15.1 GB |
| phi-4_hb8_7bpw | 7.00 bits per weight | 13.5 GB | 16.7 GB |
| phi-4_hb8_8bpw | 8.00 bits per weight | 15.2 GB | 18.2 GB |

*at 16k context
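To sanity-check that one of these quants loads and generates on your hardware, something along these lines should work with a recent exllamav2 release. This is a sketch, not part of the original card: the model path is illustrative, and the dynamic-generator calls follow exllamav2's bundled examples.

```python
from exllamav2 import ExLlamaV2, ExLlamaV2Cache, ExLlamaV2Config, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2DynamicGenerator

# Illustrative local path to one of the quants above (4.00 bpw shown)
config = ExLlamaV2Config("/models/phi-4_hb8_4bpw")
model = ExLlamaV2(config)

# Lazy cache allocation lets load_autosplit() place weights across available GPUs
cache = ExLlamaV2Cache(model, max_seq_len=16384, lazy=True)
model.load_autosplit(cache, progress=True)

tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2DynamicGenerator(model=model, cache=cache, tokenizer=tokenizer)

prompt = (
    "<|im_start|>user<|im_sep|>\nHow should I explain the Internet?<|im_end|>\n"
    "<|im_start|>assistant<|im_sep|>\n"
)
print(generator.generate(
    prompt=prompt,
    max_new_tokens=256,
    encode_special_tokens=True,  # encode <|im_start|> etc. as special tokens
    stop_conditions=["<|im_end|>", tokenizer.eos_token_id],
))
```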
## Phi-4 Model Card

### Model Summary
|  |  |
| --- | --- |
| **Developers** | Microsoft Research |
| **Description** | phi-4 is a state-of-the-art open model built upon a blend of synthetic datasets, data from filtered public domain websites, and acquired academic books and Q&A datasets. The goal of this approach was to ensure that small capable models were trained with data focused on high quality and advanced reasoning. phi-4 underwent a rigorous enhancement and alignment process, incorporating both supervised fine-tuning and direct preference optimization to ensure precise instruction adherence and robust safety measures. |
| **Architecture** | 14B parameters, dense decoder-only Transformer model |
| **Context length** | 16384 tokens |
### Usage

#### Input Formats
Given the nature of the training data, `phi-4` is best suited for prompts using the chat format as follows:
```
<|im_start|>system<|im_sep|>
You are a medieval knight and must provide explanations to modern people.<|im_end|>
<|im_start|>user<|im_sep|>
How should I explain the Internet?<|im_end|>
<|im_start|>assistant<|im_sep|>
```
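For programmatic use outside a UI, here is a minimal sketch of building that prompt string. The `format_phi4` helper is illustrative, not part of any library, and its whitespace follows the rendered example above:

```python
def format_phi4(system_prompt: str, user_prompt: str) -> str:
    # Mirrors the chat format above; the model's reply is generated
    # after the trailing <|im_sep|> and ends at <|im_end|>.
    return (
        f"<|im_start|>system<|im_sep|>\n{system_prompt}<|im_end|>\n"
        f"<|im_start|>user<|im_sep|>\n{user_prompt}<|im_end|>\n"
        f"<|im_start|>assistant<|im_sep|>\n"
    )

prompt = format_phi4(
    "You are a medieval knight and must provide explanations to modern people.",
    "How should I explain the Internet?",
)
```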
#### With ExUI

Edit `exui/backend/prompts.py` and add the following prompt format class:
```python
class PromptFormat_phi4(PromptFormat):

    description = "Phi-4 format"

    def __init__(self):
        super().__init__()

    def is_instruct(self):
        return True

    def stop_conditions(self, tokenizer, settings):
        return [tokenizer.eos_token_id, "<|im_end|>"]

    def format(self, prompt, response, system_prompt, settings):
        # Token layout matches the chat format documented above,
        # using <|im_sep|> rather than ChatML's plain newline separator.
        text = ""
        if system_prompt and system_prompt.strip() != "":
            text += "<|im_start|>system<|im_sep|>\n"
            text += system_prompt
            text += "<|im_end|>\n"
        text += "<|im_start|>user<|im_sep|>\n"
        text += prompt
        text += "<|im_end|>\n"
        text += "<|im_start|>assistant<|im_sep|>\n"
        if response:
            text += response
            text += "<|im_end|>\n"
        return text

    def context_bos(self):
        return True
```
Then register the new format in the `prompt_formats` dictionary in the same file:

```python
prompt_formats = \
{
    "Chat-RP": PromptFormat_raw,
    "Llama-chat": PromptFormat_llama,
    "Llama3-instruct": PromptFormat_llama3,
    "ChatML": PromptFormat_chatml,
    "TinyLlama-chat": PromptFormat_tinyllama,
    "MistralLite": PromptFormat_mistrallite,
    "Phind-CodeLlama": PromptFormat_phind_codellama,
    "Deepseek-chat": PromptFormat_deepseek_chat,
    "Deepseek-instruct": PromptFormat_deepseek_instruct,
    "OpenChat": PromptFormat_openchat,
    "Gemma": PromptFormat_gemma,
    "Cohere": PromptFormat_cohere,
    "Phi3-instruct": PromptFormat_phi3,
    "Phi4": PromptFormat_phi4,
    "Granite": PromptFormat_granite,
    "Mistral V1": PromptFormat_mistralv1,
    "Mistral V2/V3": PromptFormat_mistralv2v3,
    "Mistral V3 (Tekken)": PromptFormat_mistralTekken,
}
```
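Once ExUI is restarted, the new "Phi4" entry should appear in the session's prompt format selector (the exact UI wording may differ between ExUI versions).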