---
base_model: meta/llama-3.2-3b-instruct-bnb-4bit
tags:
  - text-generation-inference
  - transformers
  - unsloth
  - llama
  - gguf
license: apache-2.0
language:
  - en
---

Llama-3.2-3B-Instruct


Model Description

Llama-3.2-3B-Instruct is a fine-tuned version of the Llama-3.2-3B base model, optimized for instruction-following and conversational AI tasks. It was trained with Unsloth for efficient fine-tuning and inference, and it is also distributed in GGUF format, which makes it practical to run on a wide range of hardware, including CPU-only machines.

Features

  • 🦙 Fine-tuned for instruction-following
  • ⚡ Optimized for the GGUF format (efficient inference)
  • 🔥 Compatible with Transformers & Text-Generation-Inference
  • 🌍 Supports English
  • 🏗️ Trained with Unsloth for high performance

Model Usage

Install Dependencies

To use this model, install the required libraries; transformers and torch are needed for inference, while gguf and unsloth are only required for GGUF tooling and further fine-tuning:

pip install transformers torch gguf unsloth
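
A quick way to confirm the environment is ready (a minimal check; exact versions are not critical):

# Sanity check: the core dependencies import and report their versions
import torch
import transformers

print("transformers:", transformers.__version__)
print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())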

Load the Model

from transformers import AutoModelForCausalLM, AutoTokenizer

# Download the tokenizer and model weights from the Hugging Face Hub
model_name = "deepakkumar07/Llama-3.2-3B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Tokenize a prompt and generate a response
input_text = "What is the capital of France?"
inputs = tokenizer(input_text, return_tensors="pt")

output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
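
Since this is an instruct-tuned model, prompts are best formatted with the tokenizer's chat template rather than passed as raw text. A minimal sketch, reusing the model and tokenizer loaded above:

# Format a conversation with the model's chat template before generating
messages = [
    {"role": "user", "content": "What is the capital of France?"},
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # append the assistant header so the model replies
    return_tensors="pt",
)
output = model.generate(input_ids, max_new_tokens=64)

# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))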

GGUF Inference

For GGUF-based inference, use llama.cpp, for example through its llama-cpp-python bindings:

pip install llama-cpp-python

Then load the GGUF weights directly:

from llama_cpp import Llama

# Load the GGUF weights from a local file (download the .gguf file from this repo first)
llm = Llama(model_path="Llama-3.2-3B-Instruct.gguf")

# Run a plain text completion; the result is an OpenAI-style response dict
response = llm("Tell me a joke.", max_tokens=128)
print(response["choices"][0]["text"])
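
llama-cpp-python also exposes an OpenAI-style chat API, which applies the model's chat template automatically. A short sketch, assuming the same local GGUF file and llm object as above:

# Chat-style completion with llama-cpp-python
response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Tell me a joke."}],
    max_tokens=128,
)
print(response["choices"][0]["message"]["content"])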

License

This model is licensed under Apache 2.0.

Acknowledgments