---
library_name: transformers
base_model:
- google/gemma-2-2b-it
---

# Gemma2-2B-IT (Transferred to the Qwen2 Tokenizer) Model Card

Gemma2-2B-IT transferred to the Qwen2 tokenizer. The model approximately preserves the performance of the original on most benchmarks, with only slight degradation on some.

## Model Details

- **Base Model:** Gemma2-2B
- **Tokenization:** transferred to the Qwen2 tokenizer (see the sketch below)
- **Training Methodology:** instruction-tuned Gemma2-2B-IT transferred to the Qwen2 tokenizer
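
Since only the tokenizer changes, a quick sanity check is to confirm that the shipped tokenizer splits text exactly like a reference Qwen2 tokenizer. Below is a minimal sketch; the reference checkpoint `Qwen/Qwen2-1.5B-Instruct` is an assumption, not something this card specifies, and any checkpoint with the Qwen2 tokenizer would do:

```python
from transformers import AutoTokenizer

# Load the transferred tokenizer and a reference Qwen2 tokenizer.
# The reference checkpoint is an assumption, used only for comparison.
transferred = AutoTokenizer.from_pretrained("benjamin/Gemma2-2B-IT-with-Qwen2-Tokenizer")
reference = AutoTokenizer.from_pretrained("Qwen/Qwen2-1.5B-Instruct")

text = "Tokenizer transfer should leave the token sequence unchanged."
# If the transfer shipped the Qwen2 tokenizer, both should split text identically.
assert transferred.tokenize(text) == reference.tokenize(text)
print(transferred.tokenize(text))
```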

Benchmark results for the transferred model compared to the original:

| **Benchmark**  | **Gemma2-2B-IT w/ Qwen2 Tokenizer** | **Original Gemma2-2B-IT** |
|----------------|-------------------------------------|---------------------------|
| **PiQA**       | 76.9                                | 79.6                      |
| **HellaSwag**  | 70.7                                | 72.5                      |
| **ARC-C**      | 46.8                                | 50.4                      |
| **BoolQ**      | 82.8                                | 83.8                      |
| **MMLU**       | 53.8                                | 56.9                      |
| **Arithmetic** | 83.9                                | 84.8                      |
| **IFEval**     | 62.5                                | 62.5                      |
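
For reference, accuracy numbers of this kind are commonly obtained with EleutherAI's lm-evaluation-harness. The sketch below shows one way to run a comparable evaluation; the task list and settings are assumptions, not necessarily the exact setup used to produce the table above:

```python
# Sketch: evaluating the transferred model with lm-evaluation-harness
# (pip install lm-eval). Tasks and dtype are assumptions for illustration.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=benjamin/Gemma2-2B-IT-with-Qwen2-Tokenizer,dtype=bfloat16",
    tasks=["piqa", "hellaswag", "arc_challenge", "boolq", "mmlu"],
)
for task, metrics in results["results"].items():
    print(task, metrics)
```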

## Training Methodology

Details on the training methodology are forthcoming.

## Use

```python
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="benjamin/Gemma2-2B-IT-with-Qwen2-Tokenizer",
    model_kwargs={"torch_dtype": torch.bfloat16},
    device="cuda",  # replace with "mps" to run on a Mac device
)

messages = [
    {"role": "user", "content": "Who are you? Please, answer in pirate-speak."},
]

outputs = pipe(messages, max_new_tokens=256)
# The last message in "generated_text" is the assistant's reply.
assistant_response = outputs[0]["generated_text"][-1]["content"].strip()
print(assistant_response)
```
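
If you prefer explicit control over tokenization and generation instead of the pipeline helper, the same exchange can be run with the standard lower-level transformers API. This is a sketch; the generation settings are illustrative, not tuned recommendations:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "benjamin/Gemma2-2B-IT-with-Qwen2-Tokenizer"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "user", "content": "Who are you? Please, answer in pirate-speak."},
]
# Build the prompt with the model's chat template.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
# Strip the prompt tokens and decode only the newly generated reply.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```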