Upload folder using huggingface_hub

Files changed (7)

README.md +185 -0
config.json +42 -0
model.safetensors +3 -0
quantize_config.json +13 -0
special_tokens_map.json +30 -0
tokenizer.json +0 -0
tokenizer_config.json +389 -0

README.md ADDED

	@@ -0,0 +1,185 @@

+---
+license: other
+language:
+- en
+tags:
+- causal-lm
+- code
+metrics:
+- code_eval
+library_name: transformers
+model-index:
+- name: stabilityai/stable-code-instruct-3b
+  results:
+  - task:
+      type: text-generation
+    dataset:
+      type: nuprl/MultiPL-E
+      name: MultiPL-HumanEval (Python)
+    metrics:
+    - name: pass@1
+      type: pass@1
+      value: 32.4
+      verified: false
+  - task:
+      type: text-generation
+    dataset:
+      type: nuprl/MultiPL-E
+      name: MultiPL-HumanEval (C++)
+    metrics:
+    - name: pass@1
+      type: pass@1
+      value: 30.9
+      verified: false
+  - task:
+      type: text-generation
+    dataset:
+      type: nuprl/MultiPL-E
+      name: MultiPL-HumanEval (Java)
+    metrics:
+    - name: pass@1
+      type: pass@1
+      value: 32.1
+      verified: false
+  - task:
+      type: text-generation
+    dataset:
+      type: nuprl/MultiPL-E
+      name: MultiPL-HumanEval (JavaScript)
+    metrics:
+    - name: pass@1
+      type: pass@1
+      value: 32.1
+      verified: false
+  - task:
+      type: text-generation
+    dataset:
+      type: nuprl/MultiPL-E
+      name: MultiPL-HumanEval (PHP)
+    metrics:
+    - name: pass@1
+      type: pass@1
+      value: 24.2
+      verified: false
+  - task:
+      type: text-generation
+    dataset:
+      type: nuprl/MultiPL-E
+      name: MultiPL-HumanEval (Rust)
+    metrics:
+    - name: pass@1
+      type: pass@1
+      value: 23.0
+      verified: false
+---
+GPTQ-quantized version of the `stable-code-instruct-3b` model.
+---
+# **Stable Code Instruct 3B**
+[Try it out here: https://huggingface.co/spaces/stabilityai/stable-code-instruct-3b](https://huggingface.co/spaces/stabilityai/stable-code-instruct-3b)
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/63466107f7bd6326925fc770/O7ZkLgqoJprQEWAttX7Hj.png)
+## Model Description
+`stable-code-instruct-3b` is a 2.7 billion parameter decoder-only language model tuned from [`stable-code-3b`](https://huggingface.co/stabilityai/stable-code-3b/). This model was trained on a mix of publicly available and synthetic datasets using [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290).
+This instruct tune demonstrates state-of-the-art performance (compared to models of similar size) on the MultiPL-E metrics across multiple programming languages tested using [BigCode's Evaluation Harness](https://github.com/bigcode-project/bigcode-evaluation-harness/tree/main), and on the code portions of
+[MT Bench](https://klu.ai/glossary/mt-bench-eval).
+The model is fine-tuned for tasks such as:
+  - General-purpose code and software-engineering conversations.
+  - SQL-related generation and conversation.
+## Usage
+Here's how you can run the model:
+```python
+import torch
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+# Load the tokenizer and the bfloat16 weights from the upstream repository.
+tokenizer = AutoTokenizer.from_pretrained("stabilityai/stable-code-instruct-3b", trust_remote_code=True)
+model = AutoModelForCausalLM.from_pretrained("stabilityai/stable-code-instruct-3b", torch_dtype=torch.bfloat16, trust_remote_code=True)
+model.eval()  # inference mode: disables dropout
+model = model.cuda()  # requires a CUDA-capable GPU
+messages = [
+    {
+        "role": "system",
+        "content": "You are a helpful and polite assistant",
+    },
+    {
+        "role": "user",
+        "content": "Write a simple website in HTML. When a user clicks the button, it shows a random joke from a list of 4 jokes."
+    },
+]
+# Render the conversation with the tokenizer's chat template and append the assistant header.
+prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
+inputs = tokenizer([prompt], return_tensors="pt").to(model.device)
+# Sample up to 1024 new tokens.
+tokens = model.generate(
+    **inputs,
+    max_new_tokens=1024,
+    temperature=0.5,
+    top_p=0.95,
+    top_k=100,
+    do_sample=True,
+    use_cache=True
+)
+# Decode only the newly generated tokens, skipping the prompt prefix.
+output = tokenizer.batch_decode(tokens[:, inputs.input_ids.shape[-1]:], skip_special_tokens=False)[0]
+```
+## Model Details
+* **Developed by**: [Stability AI](https://stability.ai/)
+* **Model type**: `Stable Code Instruct 3B` model is an auto-regressive language model based on the transformer decoder architecture.
+* **Language(s)**: English
+* **Paper**: [Stable Code Technical Report](https://drive.google.com/file/d/16-DGsR5-qwoPztZ6HcM7KSRUxIXrjlSm/view)
+* **Library**: [Alignment Handbook](https://github.com/huggingface/alignment-handbook.git)
+* **Finetuned from model**: [https://huggingface.co/stabilityai/stable-code-3b](https://huggingface.co/stabilityai/stable-code-3b)
+* **License**: [StabilityAI Non-Commercial Research Community License](https://huggingface.co/stabilityai/stable-code-instruct-3b/blob/main/LICENSE). If you want to use this model for your commercial products or purposes, please contact us [here](https://stability.ai/contact) to learn more.
+* **Contact**: For questions and comments about the model, please email `[email protected]`
+## Performance
+### Multi-PL Benchmark:
+| Model                        | Size | Avg  | Python | C++  | JavaScript | Java | PHP  | Rust |
+|------------------------------|------|------|--------|------|------------|------|------|------|
+| Codellama Instruct           | 7B   | 0.30 | 0.33   | 0.31 | 0.31       | 0.29 | 0.31 | 0.25 |
+| Deepseek Instruct            | 1.3B | 0.44 | 0.52   | **0.52** | 0.41       | **0.46** | 0.45 | 0.28 |
+| Stable Code Instruct (SFT)   | 3B   | 0.44 | 0.55   | 0.45 | 0.42       | 0.42 | 0.44 | 0.32 |
+| Stable Code Instruct (DPO)   | 3B   | **0.47** | **0.59**   | 0.49 | **0.49**       | 0.44 | **0.45** | **0.37** |
+### MT-Bench Coding:
+| Model                       | Size | Score |
+|-----------------------------|------|-----------------|
+| DeepSeek Coder              | 1.3B | 4.6             |
+| Stable Code Instruct (DPO)  | 3B   | **5.8** (ours)  |
+| Stable Code Instruct (SFT)  | 3B   | 5.5             |
+| DeepSeek Coder              | 6.7B | **6.9**             |
+| CodeLlama Instruct          | 7B   | 3.55            |
+| StarChat2                   | 15B  | 5.7             |
+### SQL Performance
+| Model                       | Size | Date  | Group By | Order By | Ratio | Join  | Where |
+|-----------------------------|------|-------|----------|----------|-------|-------|-------|
+| Stable Code Instruct (DPO)  | 3B   | 24.0% | 54.2%    | 68.5%    | 40.0% | 54.2% | 42.8% |
+| DeepSeek-Coder Instruct     | 1.3B | 24.0% | 37.1%    | 51.4%    | 34.3% | 45.7% | 45.7% |
+| SQLCoder                    | 7B   | 64.0% | 82.9%    | 74.3%    | 54.3% | 74.3% | 74.3% |
+## How to Cite
+```bibtex
+@misc{stable-code-instruct-3b,
+      url={https://huggingface.co/stabilityai/stable-code-instruct-3b},
+      title={Stable Code 3B},
+      author={Phung, Duy and Pinnaparaju, Nikhil and Adithyan, Reshinth and Zhuravinskyi, Maksym and Tow, Jonathan and Cooper, Nathan}
+}
+```

config.json ADDED

	@@ -0,0 +1,42 @@

+{
+  "_name_or_path": "stable-code-instruct-3b",
+  "architectures": [
+    "StableLmForCausalLM"
+  ],
+  "attention_dropout": 0.0,
+  "bos_token_id": 0,
+  "eos_token_id": 0,
+  "hidden_act": "silu",
+  "hidden_dropout": 0.0,
+  "hidden_size": 2560,
+  "initializer_range": 0.02,
+  "intermediate_size": 6912,
+  "layer_norm_eps": 1e-05,
+  "max_position_embeddings": 16384,
+  "model_type": "stablelm",
+  "num_attention_heads": 32,
+  "num_hidden_layers": 32,
+  "num_key_value_heads": 32,
+  "partial_rotary_factor": 0.25,
+  "quantization_config": {
+    "bits": 4,
+    "damp_percent": 0.01,
+    "desc_act": true,
+    "group_size": 128,
+    "is_marlin_format": false,
+    "model_file_base_name": null,
+    "model_name_or_path": null,
+    "quant_method": "gptq",
+    "static_groups": false,
+    "sym": true,
+    "true_sequential": true
+  },
+  "rope_scaling": null,
+  "rope_theta": 1000000,
+  "tie_word_embeddings": false,
+  "torch_dtype": "float16",
+  "transformers_version": "4.39.3",
+  "use_cache": false,
+  "use_qkv_bias": false,
+  "vocab_size": 50304
+}
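
A few quantities that the config above implies but does not state can be derived directly from its fields. The sketch below is illustrative only: the helper name `attention_geometry` is hypothetical, the rotary width follows the common convention of truncating `head_dim * partial_rotary_factor` to an integer, and the cache figure assumes 2-byte (float16) keys and values with no cache quantization.

```python
# Values copied from config.json above.
config = {
    "hidden_size": 2560,
    "num_attention_heads": 32,
    "num_key_value_heads": 32,
    "num_hidden_layers": 32,
    "partial_rotary_factor": 0.25,
    "max_position_embeddings": 16384,
}


def attention_geometry(cfg, bytes_per_value=2):
    """Derive per-head sizes and KV-cache cost (hypothetical helper)."""
    head_dim = cfg["hidden_size"] // cfg["num_attention_heads"]
    # Only the first fraction of each head receives rotary embeddings.
    rotary_dim = int(head_dim * cfg["partial_rotary_factor"])
    # Keys and values (factor 2) for every layer and KV head, per token.
    kv_bytes_per_token = (
        2
        * cfg["num_hidden_layers"]
        * cfg["num_key_value_heads"]
        * head_dim
        * bytes_per_value
    )
    return {
        "head_dim": head_dim,
        "rotary_dim": rotary_dim,
        # Equal head counts mean plain multi-head attention, not grouped-query.
        "grouped_query": cfg["num_key_value_heads"] < cfg["num_attention_heads"],
        "kv_bytes_per_token": kv_bytes_per_token,
        "kv_bytes_full_context": kv_bytes_per_token * cfg["max_position_embeddings"],
    }


geometry = attention_geometry(config)
for key, value in geometry.items():
    print(f"{key}: {value}")
```

Under these assumptions the cache for a full 16384-token context is larger than the quantized checkpoint itself, which is worth keeping in mind when sizing GPU memory for long prompts.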

model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:bd9320800dfa76bdb679ab95b5bc89d52193a258ad7cfcae3552296147f400f5
+size 1838810792

quantize_config.json ADDED

	@@ -0,0 +1,13 @@

+{
+  "bits": 4,
+  "group_size": 128,
+  "damp_percent": 0.01,
+  "desc_act": true,
+  "static_groups": false,
+  "sym": true,
+  "true_sequential": true,
+  "model_name_or_path": null,
+  "model_file_base_name": null,
+  "is_marlin_format": false,
+  "quant_method": "gptq"
+}
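
As a sanity check, the quantization settings above can be combined with the shapes in `config.json` to estimate the checkpoint size. This is a rough sketch under stated assumptions, not a description of how the file was produced: it assumes only the decoder's linear layers are quantized (embeddings, output head, and layer norms stay in float16), that each layer has bias-free q/k/v/o and gate/up/down projections, and that storage follows the usual GPTQ layout of packed 4-bit weights, one float16 scale and one packed 4-bit zero point per group and output column, and an int32 `g_idx` entry per input row. The function name `estimate_checkpoint_bytes` is hypothetical.

```python
# Shapes copied from config.json; quantization settings from quantize_config.json.
HIDDEN = 2560
INTERMEDIATE = 6912
LAYERS = 32
VOCAB = 50304
BITS = 4
GROUP_SIZE = 128
FP16_BYTES = 2
INT32_BYTES = 4

# Size recorded in the model.safetensors LFS pointer.
REPORTED_BYTES = 1838810792


def estimate_checkpoint_bytes():
    """Estimate on-disk size under the assumed GPTQ storage layout."""
    # (in_features, out_features) for each linear layer in one decoder block.
    linears = [(HIDDEN, HIDDEN)] * 4          # q, k, v, o projections
    linears += [(HIDDEN, INTERMEDIATE)] * 2   # gate and up projections
    linears += [(INTERMEDIATE, HIDDEN)]       # down projection

    per_block = 0
    for fan_in, fan_out in linears:
        groups = fan_in // GROUP_SIZE
        per_block += fan_in * fan_out * BITS // 8   # packed 4-bit weights
        per_block += groups * fan_out * FP16_BYTES  # float16 scales
        per_block += groups * fan_out * BITS // 8   # packed 4-bit zero points
        per_block += fan_in * INT32_BYTES           # g_idx (desc_act is true)
    # Two layer norms per block, each with a weight and a bias vector.
    per_block += 2 * 2 * HIDDEN * FP16_BYTES

    total = LAYERS * per_block
    total += 2 * VOCAB * HIDDEN * FP16_BYTES  # untied input embedding and output head
    total += 2 * HIDDEN * FP16_BYTES          # final layer norm
    return total


estimate = estimate_checkpoint_bytes()
print(f"estimated: {estimate:,} bytes")
print(f"reported:  {REPORTED_BYTES:,} bytes")
print(f"relative difference: {abs(estimate - REPORTED_BYTES) / REPORTED_BYTES:.4%}")
```

If the assumptions hold, the estimate should land well within 1% of the reported size; the remainder would be the safetensors header and any tensors this sketch does not account for. Note that a sizeable share of the file is the unquantized float16 embedding and output matrices.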

special_tokens_map.json ADDED

	@@ -0,0 +1,30 @@

+{
+  "bos_token": {
+    "content": "<|endoftext|>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "eos_token": {
+    "content": "<|endoftext|>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": {
+    "content": "<|endoftext|>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "unk_token": {
+    "content": "<|endoftext|>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  }
+}

tokenizer.json ADDED

The diff for this file is too large to render. See raw diff

tokenizer_config.json ADDED

	@@ -0,0 +1,389 @@

+{
+  "add_prefix_space": false,
+  "added_tokens_decoder": {
+    "0": {
+      "content": "<|endoftext|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "1": {
+      "content": "<|padding|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "50254": {
+      "content": "                        ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50255": {
+      "content": "                       ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50256": {
+      "content": "                      ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50257": {
+      "content": "                     ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50258": {
+      "content": "                    ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50259": {
+      "content": "                   ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50260": {
+      "content": "                  ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50261": {
+      "content": "                 ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50262": {
+      "content": "                ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50263": {
+      "content": "               ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50264": {
+      "content": "              ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50265": {
+      "content": "             ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50266": {
+      "content": "            ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50267": {
+      "content": "           ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50268": {
+      "content": "          ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50269": {
+      "content": "         ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50270": {
+      "content": "        ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50271": {
+      "content": "       ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50272": {
+      "content": "      ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50273": {
+      "content": "     ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50274": {
+      "content": "    ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50275": {
+      "content": "   ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50276": {
+      "content": "  ",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50277": {
+      "content": "<fim_prefix>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "50278": {
+      "content": "<fim_middle>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "50279": {
+      "content": "<fim_suffix>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "50280": {
+      "content": "<fim_pad>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "50281": {
+      "content": "<filename>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "50282": {
+      "content": "<gh_stars>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "50283": {
+      "content": "<issue_start>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "50284": {
+      "content": "<issue_comment>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "50285": {
+      "content": "<issue_closed>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "50286": {
+      "content": "<jupyter_start>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "50287": {
+      "content": "<jupyter_text>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "50288": {
+      "content": "<jupyter_code>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "50289": {
+      "content": "<jupyter_output>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "50290": {
+      "content": "<empty_output>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "50291": {
+      "content": "<commit_before>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "50292": {
+      "content": "<commit_msg>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "50293": {
+      "content": "<commit_after>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "50294": {
+      "content": "<reponame>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "50295": {
+      "content": "<repo_continuation>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "50296": {
+      "content": "[PAD]",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "50297": {
+      "content": "<|im_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "50298": {
+      "content": "<|im_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    }
+  },
+  "bos_token": "<|endoftext|>",
+  "chat_template": "{% if messages[0]['role'] == 'system' %}{% set loop_messages = messages[1:] %}{% set system_message = messages[0]['content'] %}{% else %}{% set loop_messages = messages %}{% set system_message = 'You are a helpful assistant.' %}{% endif %}{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% for message in loop_messages %}{% if loop.index0 == 0 %}{{'<|im_start|>system\n' + system_message + '<|im_end|>\n'}}{% endif %}{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}",
+  "clean_up_tokenization_spaces": true,
+  "eos_token": "<|endoftext|>",
+  "model_max_length": 4096,
+  "pad_token": "<|endoftext|>",
+  "tokenizer_class": "GPTNeoXTokenizer",
+  "unk_token": "<|endoftext|>"
+}
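
The `chat_template` above is a Jinja template that renders conversations in a ChatML-style layout. To make its behaviour easier to read, here is a hand-translated, dependency-free Python equivalent. It is a sketch for illustration: the function name `render_chat` is hypothetical, and in practice `tokenizer.apply_chat_template` should be used so the real template is applied.

```python
DEFAULT_SYSTEM = "You are a helpful assistant."


def render_chat(messages, add_generation_prompt=False):
    """Mirror the logic of the Jinja chat_template in plain Python."""
    # A leading system message is pulled out; otherwise a default is used.
    if messages and messages[0]["role"] == "system":
        system_message = messages[0]["content"]
        loop_messages = messages[1:]
    else:
        system_message = DEFAULT_SYSTEM
        loop_messages = messages

    parts = []
    for index, message in enumerate(loop_messages):
        # The system turn is emitted once, just before the first remaining message.
        if index == 0:
            parts.append("<|im_start|>system\n" + system_message + "<|im_end|>\n")
        parts.append(
            "<|im_start|>" + message["role"] + "\n" + message["content"] + "<|im_end|>\n"
        )
    # Optionally open an assistant turn for the model to complete.
    if add_generation_prompt:
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = render_chat(
    [
        {"role": "system", "content": "You are a helpful and polite assistant"},
        {"role": "user", "content": "Hi"},
    ],
    add_generation_prompt=True,
)
print(prompt)
```

One consequence of the template's structure is that the system turn is written inside the message loop, so a conversation containing only a system message renders no system turn at all.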