Commit 3a57615
Parent(s): df3f221
Update README.md

README.md CHANGED

@@ -31,6 +31,8 @@ colab_code_generator_FT_code_gen_UT, an instruction-following large language mod

# Getting Started

+
+## Installation
Loading the fine-tuned Code Generator:
```
from peft import AutoPeftModelForCausalLM
@@ -38,6 +40,50 @@ test_model_UT = AutoPeftModelForCausalLM.from_pretrained("01GangaPutraBheeshma/c
test_tokenizer_UT = AutoTokenizer.from_pretrained("01GangaPutraBheeshma/colab_code_generator_FT_code_gen_UT")
```

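For a quick end-to-end check, the loading snippet above can be combined with a generate call, as sketched below. The repository id is the one used in this README; the prompt text and generation settings are illustrative placeholders, not values from the README.

```
# Load the fine-tuned adapter and its tokenizer, then generate code for a prompt.
# Prompt wording and max_new_tokens are placeholders chosen for illustration.
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

repo_id = "01GangaPutraBheeshma/colab_code_generator_FT_code_gen_UT"
test_model_UT = AutoPeftModelForCausalLM.from_pretrained(repo_id)
test_tokenizer_UT = AutoTokenizer.from_pretrained(repo_id)

prompt = "### Instruction:\nWrite a Python function that reverses a string.\n\n### Response:\n"
inputs = test_tokenizer_UT(prompt, return_tensors="pt")

with torch.no_grad():
    output_ids = test_model_UT.generate(**inputs, max_new_tokens=128)
print(test_tokenizer_UT.decode(output_ids[0], skip_special_tokens=True))
```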
+## Usage
+For re-training this model, I would highly recommend using the following format to provide input to the tokenizer:
+
+```
+def prompt_instruction_format(sample):
+    return f"""### Instruction:
+Use the Task below and the Input given to write the Response, which is a programming code that can solve the following Task:
+
+### Task:
+{sample['instruction']}
+
+### Input:
+{sample['input']}
+
+### Response:
+{sample['output']}
+"""
+```
+
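As a quick sanity check of the formatter, it can be applied to a hand-written sample. The dictionary below only mirrors the keys the function expects; its values are made up for illustration.

```
# Hypothetical sample with the keys used by prompt_instruction_format above.
sample = {
    "instruction": "Write a function that returns the factorial of n.",
    "input": "n = 5",
    "output": "def factorial(n):\n    return 1 if n <= 1 else n * factorial(n - 1)",
}

# Prints the fully formatted training prompt for this sample.
print(prompt_instruction_format(sample))
```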
+Then, we can use the function above to format the input prompts that are pre-processed and used for model training with the Supervised Fine-Tuning (SFTTrainer) class.
+
+```
+trainer = SFTTrainer(
+    model=model,
+    train_dataset=code_dataset,
+    peft_config=peft_config,
+    max_seq_length=2048,
+    tokenizer=tokenizer,
+    packing=True,
+    formatting_func=prompt_instruction_format,
+    args=trainingArgs,
+)
+
+```
+
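The SFTTrainer call above assumes that model, code_dataset, peft_config, tokenizer, and trainingArgs already exist. One plausible way to construct the configuration objects is sketched below; the dataset path and every hyperparameter value are placeholders, not the settings actually used to train this model.

```
# Illustrative setup for the objects referenced by SFTTrainer above.
from datasets import load_dataset
from peft import LoraConfig
from transformers import TrainingArguments

# Placeholder dataset file with instruction/input/output fields.
code_dataset = load_dataset("json", data_files="code_instructions.json", split="train")

# Placeholder LoRA configuration for causal language modeling.
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)

# Placeholder training hyperparameters.
trainingArgs = TrainingArguments(
    output_dir="colab_code_generator_FT_code_gen_UT",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    learning_rate=2e-4,
    logging_steps=10,
    save_strategy="epoch",
)
```

With these objects in place, calling trainer.train() launches the fine-tuning run.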
+This is a crucial step when we perform Reinforcement Learning from Human Feedback (RLHF). Here are six reasons why it is important:
+1. Sample Efficiency
+2. Task Adaptation
+3. Transfer Learning
+4. Human Guidance
+5. Reducing Exploration Challenges
+6. Addressing Distribution Shift
+
+
# Documentation

This model was fine-tuned using LoRA because I wanted the model's weights to be efficient at solving other types of Python problems (ones that were not included in the training data).
@@ -75,3 +121,4 @@ bnb_config = BitsAndBytesConfig(



+
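The header of the last hunk references bnb_config = BitsAndBytesConfig(, which suggests the base model is loaded in quantized form before the LoRA fine-tuning described above. A rough sketch of such a setup follows; the base checkpoint name and all quantization flags are assumptions, not values taken from this README.

```
# Hypothetical 4-bit quantized base-model load for LoRA fine-tuning.
# "base-model-name" is a placeholder; the README's actual bnb_config values are not shown here.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    "base-model-name",  # placeholder for the actual base checkpoint
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("base-model-name")
```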