Phi-3 Mini 4K Instruct - Alpaca LoRA Fine-tuned
This model is a fine-tuned version of microsoft/Phi-3-mini-4k-instruct using LoRA (Low-Rank Adaptation) on the Alpaca dataset.
Model Details
- Base Model: microsoft/Phi-3-mini-4k-instruct
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- Dataset: tatsu-lab/alpaca (52,002 instruction-following examples; a loading example follows this list)
- Training Duration: ~1.24 hours
- Final Training Loss: 1.0445
- Average Training Loss: 1.0311
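The dataset can be loaded directly from the Hugging Face Hub for inspection. A minimal sketch, not part of this repository:

from datasets import load_dataset

# Alpaca ships a single "train" split with 52,002 records
dataset = load_dataset("tatsu-lab/alpaca", split="train")
print(len(dataset))              # 52002
print(dataset.column_names)      # ['instruction', 'input', 'output', 'text']
print(dataset[0]["instruction"])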
Training Configuration
- LoRA Rank: 16
- LoRA Alpha: 32
- LoRA Dropout: 0.05
- Target Modules: qkv_proj, o_proj, gate_proj, up_proj, down_proj
- Learning Rate: 1e-5
- Batch Size: 2 (gradient accumulation steps: 8; effective batch size 16)
- Epochs: 1
- Precision: bfloat16
- Gradient Checkpointing: Enabled (a configuration sketch follows this list)
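The hyperparameters above map onto the standard peft and transformers configuration objects roughly as follows. This is an illustrative sketch, not the exact training script used for this model; output_dir is a placeholder.

from peft import LoraConfig
from transformers import TrainingArguments

# LoRA adapter configuration matching the values listed above
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["qkv_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

# Optimizer and runtime settings matching the values listed above
training_args = TrainingArguments(
    output_dir="phi3-alpaca-lora",        # placeholder path
    learning_rate=1e-5,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    num_train_epochs=1,
    bf16=True,
    gradient_checkpointing=True,
)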
Usage
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct", trust_remote_code=True)
# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
# Load LoRA adapters
model = PeftModel.from_pretrained(base_model, "johnlam90/phi3-mini-4k-instruct-alpaca-lora")
model.eval()
# Format prompt
prompt = "Give three tips for staying healthy."
formatted_prompt = f'''### Instruction:
{prompt}
### Response:
'''
# Generate
inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=200,
        do_sample=False,
        eos_token_id=tokenizer.eos_token_id,
        pad_token_id=tokenizer.eos_token_id,
    )
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response.split("### Response:")[1].strip())
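Optionally, the LoRA weights can be merged into the base model for standalone deployment (no peft dependency at serving time). A brief sketch using PEFT's merge_and_unload; the output path is only an example:

# Fold the adapter weights into the base model and save a standalone checkpoint
merged_model = model.merge_and_unload()
merged_model.save_pretrained("phi3-mini-4k-instruct-alpaca-merged")   # example path
tokenizer.save_pretrained("phi3-mini-4k-instruct-alpaca-merged")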
Performance
The model was validated with the following stability and quality checks:
- ✅ NaN clamp protection for stable generation (an inference-time sketch follows this list)
- ✅ Proper bfloat16 precision handling
- ✅ Consistent and coherent responses across multiple test prompts
- ✅ No numerical instabilities during training or inference
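The card does not specify how the NaN clamp was implemented. For a comparable safeguard at inference time, transformers provides a logits processor that strips NaN/Inf values during decoding; the sketch below reuses model, inputs, and tokenizer from the Usage section:

from transformers import InfNanRemoveLogitsProcessor, LogitsProcessorList

# Replace any NaN/Inf logits before decoding so generation cannot crash on them
outputs = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=False,
    logits_processor=LogitsProcessorList([InfNanRemoveLogitsProcessor()]),
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.eos_token_id,
)

Passing remove_invalid_values=True to generate should have the same effect, since it adds this processor internally.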
Training Details
This model was fine-tuned with careful attention to:
- Data Formatting: Proper Alpaca instruction/input/output structure (a prompt-template sketch follows this list)
- Numerical Stability: Using bfloat16 precision and conservative hyperparameters
- Memory Efficiency: Gradient checkpointing and optimized batch sizes
- Safety Measures: NaN protection and proper token handling
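For reference, the Alpaca instruction/input/output structure mentioned above is conventionally rendered into prompts matching the template shown in the Usage section. The helper below is an illustrative sketch with a made-up record, not the actual formatting code used for training:

def format_alpaca(example):
    """Render one Alpaca record (instruction/input/output) into a training prompt."""
    if example.get("input"):
        return (
            f"### Instruction:\n{example['instruction']}\n"
            f"### Input:\n{example['input']}\n"
            f"### Response:\n{example['output']}"
        )
    return (
        f"### Instruction:\n{example['instruction']}\n"
        f"### Response:\n{example['output']}"
    )

# Hypothetical record, for illustration only
print(format_alpaca({
    "instruction": "Give three tips for staying healthy.",
    "input": "",
    "output": "Eat a balanced diet, exercise regularly, and get enough sleep.",
}))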
License
This model is released under the MIT license, following the base model's licensing terms.