---
license: apache-2.0
base_model: unsloth/Meta-Llama-3.1-8B-Instruct
tags:
- diffusion
- language-model
- llama
- text-generation
library_name: transformers
pipeline_tag: text-generation
---
					
						
# Llama-3.1-8B Diffusion Model (LAD)

This is a **Language Autoregressive Diffusion (LAD)** model based on Llama-3.1-8B-Instruct.

## Features
- 🎯 Dual mode: autoregressive + diffusion generation
- 🚀 Cosine noise schedule with 1000 timesteps (sketched below)
- 🧠 LoRA fine-tuning (rank 32)
- ⚡ Custom diffusion components
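For reference, the sketch below shows one common way to compute a cosine noise schedule over T = 1000 timesteps (following the improved-DDPM formulation). The offset `s = 0.008` and the beta clamp are illustrative assumptions, not confirmed hyperparameters of this model's custom diffusion components.

```python
import torch

def cosine_noise_schedule(T: int = 1000, s: float = 0.008):
    """Cumulative signal level (alpha_bar) and per-step betas for a cosine schedule.

    Minimal sketch; `s` and the beta clamp are assumptions, not this model's
    confirmed settings.
    """
    t = torch.arange(T + 1, dtype=torch.float64)
    f = torch.cos(((t / T) + s) / (1 + s) * torch.pi / 2) ** 2
    alpha_bar = f / f[0]                                   # alpha_bar[0] == 1, decays to ~0 at t = T
    betas = (1 - alpha_bar[1:] / alpha_bar[:-1]).clamp(max=0.999)
    return alpha_bar, betas

alpha_bar, betas = cosine_noise_schedule()
print(betas.shape)  # torch.Size([1000])
```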
					
						
## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("rootxhacker/llama3-diffusion")
tokenizer = AutoTokenizer.from_pretrained("rootxhacker/llama3-diffusion")

# Generate text
inputs = tokenizer("The future of AI", return_tensors="pt")
outputs = model.generate(**inputs, max_length=100)
print(tokenizer.decode(outputs[0]))
```
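Since the card declares `pipeline_tag: text-generation`, the standard `transformers` pipeline should also work for autoregressive generation; a minimal sketch:

```python
from transformers import pipeline

generator = pipeline("text-generation", model="rootxhacker/llama3-diffusion")
print(generator("The future of AI", max_new_tokens=80)[0]["generated_text"])
```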
					
						
## Training Details
- Base: Meta-Llama-3.1-8B-Instruct
- Dataset: PatrickHaller/cosmopedia-v2-1B
- Framework: Unsloth + custom diffusion components (setup sketched below)
- Context length: 256 tokens
- Objective mix: 60% autoregressive (AR) + 40% diffusion
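The LoRA setup described above (rank 32 on the Unsloth Llama-3.1-8B-Instruct base, 256-token context) might look roughly like the following sketch. The target modules, `lora_alpha`, 4-bit loading, and the dataset split name are assumptions, not confirmed training settings.

```python
from unsloth import FastLanguageModel
from datasets import load_dataset

# Base model at the 256-token training context (4-bit loading is an assumption)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Meta-Llama-3.1-8B-Instruct",
    max_seq_length=256,
    load_in_4bit=True,
)

# Attach rank-32 LoRA adapters; target modules and alpha are assumptions
model = FastLanguageModel.get_peft_model(
    model,
    r=32,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Training corpus from the card ("train" split name is an assumption)
dataset = load_dataset("PatrickHaller/cosmopedia-v2-1B", split="train")
```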
					
						
Uploaded: 2025-06-08 23:13