update model card README.md
README.md CHANGED
```diff
@@ -6,7 +6,6 @@ tags:
 model-index:
 - name: starcoder-cpp2py-newsnippet1
   results: []
-library_name: peft
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -16,7 +15,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [bigcode/starcoder](https://huggingface.co/bigcode/starcoder) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.
+- Loss: 0.1961
 
 ## Model description
 
@@ -36,11 +35,11 @@ More information needed
 
 The following hyperparameters were used during training:
 - learning_rate: 9e-05
-- train_batch_size:
-- eval_batch_size:
+- train_batch_size: 32
+- eval_batch_size: 32
 - seed: 42
-- gradient_accumulation_steps:
-- total_train_batch_size:
+- gradient_accumulation_steps: 8
+- total_train_batch_size: 256
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 15
@@ -50,20 +49,16 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 4.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
+| 4.3812        | 0.17  | 25   | 0.4652          |
+| 0.2923        | 0.33  | 50   | 0.2125          |
+| 0.2148        | 0.5   | 75   | 0.2013          |
+| 0.2051        | 0.67  | 100  | 0.1971          |
+| 0.2003        | 0.83  | 125  | 0.1964          |
+| 0.1882        | 1.05  | 150  | 0.1961          |
 
 
 ### Framework versions
 
-- PEFT 0.5.0.dev0
-- PEFT 0.5.0.dev0
-- PEFT 0.5.0.dev0
-- PEFT 0.5.0.dev0
 - Transformers 4.32.0.dev0
 - Pytorch 2.0.1+cu117
 - Datasets 2.12.0
```
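The filled-in hyperparameters are internally consistent: the total train batch size of 256 is the per-device batch size times the gradient accumulation steps, 32 × 8 = 256. For reference, they map onto a `transformers.TrainingArguments` configuration roughly as sketched below. This is a minimal sketch rather than the actual training script; `output_dir` and anything not listed in the card are assumptions.

```python
from transformers import TrainingArguments

# Minimal sketch of the hyperparameters reported in the card above.
# output_dir is an assumption taken from the model name; the card does
# not state it, nor the number of epochs.
args = TrainingArguments(
    output_dir="starcoder-cpp2py-newsnippet1",  # assumed name
    learning_rate=9e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    gradient_accumulation_steps=8,  # 32 * 8 = 256 total train batch size
    seed=42,
    lr_scheduler_type="cosine",
    warmup_steps=15,
    # Adam with betas=(0.9,0.999) and epsilon=1e-08 matches the Trainer's
    # default optimizer settings, so no explicit optimizer arguments are needed.
)
```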
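The update also drops `library_name: peft` from the metadata and the duplicated `PEFT 0.5.0.dev0` entries from the framework list, but the card was generated with PEFT, so the repository presumably hosts an adapter on top of `bigcode/starcoder` rather than full model weights. A minimal loading sketch under that assumption; the adapter repo id below is assumed from the model name, and the real namespace is not shown here:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "bigcode/starcoder"
ADAPTER = "starcoder-cpp2py-newsnippet1"  # assumed repo id; prefix with the real namespace

tokenizer = AutoTokenizer.from_pretrained(BASE)
base_model = AutoModelForCausalLM.from_pretrained(BASE)

# Attach the fine-tuned PEFT adapter to the frozen base model.
model = PeftModel.from_pretrained(base_model, ADAPTER)
```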