---
base_model: meta/llama-3.2-3b-instruct-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- gguf
license: apache-2.0
language:
- en
---
# Llama-3.2-3B-Instruct

## Model Description
Llama-3.2-3B-Instruct is a fine-tuned version of the Llama-3.2-3B base model, optimized for instruction following and conversational tasks. It was fine-tuned with Unsloth for efficient training and inference, and it is also distributed in GGUF format, which makes it practical to run on a wide range of hardware, including CPU-only machines.
## Features
- 🦙 Fine-tuned for instruction-following
- ⚡ Optimized for GGUF format (efficient inference)
- 🔥 Compatible with Transformers & Text-Generation-Inference
- 🌍 Supports English language
- 🏗️ Trained using Unsloth for high performance
## Model Usage

### Install Dependencies

To use this model, install the required libraries (`torch` is added here because the Transformers example below needs it):

```bash
pip install transformers torch text-generation gguf unsloth
```
### Load the Model

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "deepakkumar07/Llama-3.2-3B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,  # half precision to reduce memory use
    device_map="auto",          # place weights on a GPU when one is available
)

input_text = "What is the capital of France?"
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
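Since this is an instruct-tuned model, prompts generally work best when passed through the tokenizer's chat template rather than as raw text. A minimal sketch, reusing the `tokenizer` and `model` loaded above (the system prompt is illustrative):

```python
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
]

# apply_chat_template wraps the turns in the model's expected special tokens
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```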
### GGUF Inference

For GGUF-based inference, use llama.cpp (e.g. via the llama-cpp-python bindings) or text-generation-inference:

```bash
pip install llama-cpp-python
```
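By default this installs a CPU-only build. llama-cpp-python can be compiled with GPU offloading by passing CMake flags at install time; the flag below is an assumption based on recent releases (older versions used `-DLLAMA_CUBLAS=on`):

```bash
# Rebuild with CUDA support (recent llama-cpp-python releases)
CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python --force-reinstall --no-cache-dir
```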
```python
from llama_cpp import Llama

# Point model_path at a local GGUF file downloaded from this repository
llm = Llama(model_path="Llama-3.2-3B-Instruct.gguf")

response = llm("Tell me a joke.", max_tokens=128)
print(response["choices"][0]["text"])
```
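For instruct-style or multi-turn use, llama-cpp-python also exposes an OpenAI-style chat API that applies the chat template stored in the GGUF metadata. A minimal sketch, reusing the same local file:

```python
from llama_cpp import Llama

llm = Llama(model_path="Llama-3.2-3B-Instruct.gguf", n_ctx=2048)

# create_chat_completion formats the messages with the model's chat template
response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Tell me a joke."}]
)
print(response["choices"][0]["message"]["content"])
```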
## License

This model is released under the Apache 2.0 license.
## Acknowledgments

- Meta's Llama 3.2 (base model)
- Unsloth (efficient fine-tuning)
- Hugging Face 🤗 Community