---
base_model: meta/llama-3.2-3b-instruct-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- gguf
license: apache-2.0
language:
- en
---
# Llama-3.2-3B-Instruct

## Model Description
Llama-3.2-3B-Instruct is a fine-tuned version of the Llama-3.2-3B base model, optimized for instruction following and conversational tasks. It was fine-tuned with Unsloth for efficient training and inference, and it is also distributed in GGUF format, which makes it practical to run on a wide range of hardware, including CPU-only machines.
## Features
- 🦙 Fine-tuned for instruction-following
- ⚡ Optimized for GGUF format (efficient inference)
- 🔥 Compatible with Transformers & Text-Generation-Inference
- 🌍 Supports English language
- 🏗️ Trained using Unsloth for high performance
## Model Usage

### Install Dependencies

To use this model, install the required libraries (`torch` is added here because the Transformers example below needs it):

```bash
pip install transformers torch text-generation gguf unsloth
```
### Load the Model

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "deepakkumar07/Llama-3.2-3B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,  # half precision to reduce memory use
    device_map="auto",          # place weights on a GPU when one is available
)

input_text = "What is the capital of France?"
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
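Since this is an instruct-tuned model, prompts generally work best when passed through the tokenizer's chat template rather than as raw text. A minimal sketch, reusing the `tokenizer` and `model` loaded above (the system prompt is illustrative):

```python
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
]

# apply_chat_template wraps the turns in the model's expected special tokens
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```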
### GGUF Inference

For GGUF-based inference, use llama.cpp (e.g. via the llama-cpp-python bindings) or text-generation-inference:

```bash
pip install llama-cpp-python
```
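By default this installs a CPU-only build. llama-cpp-python can be compiled with GPU offloading by passing CMake flags at install time; the flag below is an assumption based on recent releases (older versions used `-DLLAMA_CUBLAS=on`):

```bash
# Rebuild with CUDA support (recent llama-cpp-python releases)
CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python --force-reinstall --no-cache-dir
```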
```python
from llama_cpp import Llama

# Point model_path at a local GGUF file downloaded from this repository
llm = Llama(model_path="Llama-3.2-3B-Instruct.gguf")

response = llm("Tell me a joke.", max_tokens=128)
print(response["choices"][0]["text"])
```
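For instruct-style or multi-turn use, llama-cpp-python also exposes an OpenAI-style chat API that applies the chat template stored in the GGUF metadata. A minimal sketch, reusing the same local file:

```python
from llama_cpp import Llama

llm = Llama(model_path="Llama-3.2-3B-Instruct.gguf", n_ctx=2048)

# create_chat_completion formats the messages with the model's chat template
response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Tell me a joke."}]
)
print(response["choices"][0]["message"]["content"])
```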
## License

This model is released under the Apache 2.0 license.
## Acknowledgments

- Meta's Llama 3.2 (base model)
- Unsloth (efficient fine-tuning)
- Hugging Face 🤗 Community