---
base_model: lvwerra/gpt2-imdb
tags:
- generated_from_trainer
model-index:
- name: gpt-imdb-jsd-beta_0.1
  results: []
---


# gpt-imdb-jsd-beta_0.1

This model is a fine-tuned version of [lvwerra/gpt2-imdb](https://huggingface.co/lvwerra/gpt2-imdb); the fine-tuning dataset was not recorded by the Trainer.
It achieves the following results on the evaluation set:
- Step: 7000
- Loss: 0.1422
- Rewards/chosen: -6.6308
- Rewards/rejected: -12.9931
- Rewards/accuracies: 0.9396
- Rewards/margins: 6.3623
- Logps/rejected: -393.6160
- Logps/chosen: -301.5730
- Logits/rejected: -40.9101
- Logits/chosen: -42.7380
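
The reward figures above can be sanity-checked with a line of arithmetic. The sketch below assumes the convention used by preference-optimization trainers such as TRL's `DPOTrainer`, where the margin is the mean chosen reward minus the mean rejected reward; this card does not state which trainer produced the metrics, so treat that as an assumption.

```python
# Final evaluation metrics copied from the list above.
rewards_chosen = -6.6308
rewards_rejected = -12.9931
reported_margin = 6.3623

# Assumption: margin = mean chosen reward - mean rejected reward.
margin = rewards_chosen - rewards_rejected

# Compare with a small tolerance, since the reported values are rounded.
matches = abs(margin - reported_margin) < 1e-4
print(f"computed margin: {margin:.4f}, matches reported value: {matches}")
```

Under the same assumption, `Rewards/accuracies` would be the fraction of evaluation pairs in which the chosen completion receives the higher reward.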

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 24
- eval_batch_size: 24
- seed: 42
- optimizer: Adam with betas=(0.9,0.99) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 150
- num_epochs: 3
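
To illustrate how the learning rate evolves under these settings, here is a minimal sketch of a linear warmup followed by cosine decay to zero. `BASE_LR` and `WARMUP_STEPS` come from the list above; `TOTAL_STEPS` is not stated in this card and is a hypothetical value chosen only for illustration, as is the helper name `cosine_lr`.

```python
import math

BASE_LR = 1e-5       # learning_rate from the list above
WARMUP_STEPS = 150   # lr_scheduler_warmup_steps from the list above
TOTAL_STEPS = 7200   # assumption: not stated in this card, illustrative only


def cosine_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then cosine decay to 0."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    # Fraction of the post-warmup schedule that has elapsed, in [0, 1].
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 75, 150, 3675, 7200):
    print(step, f"{cosine_lr(step):.2e}")
```

The rate reaches its peak of `1e-05` at step 150 and then decays smoothly; the exact tail depends on the true total step count, which may differ from the value assumed here.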

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 0.2783        | 0.21  | 500  | 0.3575          | -1.6510        | -3.6200          | 0.8458             | 1.9690          | -299.8852      | -251.7749    | -34.0335        | -35.2131      |
| 0.3254        | 0.42  | 1000 | 0.2845          | -2.6765        | -5.5357          | 0.8771             | 2.8593          | -319.0428      | -262.0301    | -41.3238        | -42.6399      |
| 0.1870        | 0.63  | 1500 | 0.2520          | -4.2045        | -7.9801          | 0.8875             | 3.7756          | -343.4868      | -277.3105    | -36.4710        | -37.8971      |
| 0.2236        | 0.83  | 2000 | 0.1916          | -3.9591        | -8.0388          | 0.9313             | 4.0797          | -344.0737      | -274.8567    | -35.8180        | -37.3586      |
| 0.1544        | 1.04  | 2500 | 0.1671          | -4.7747        | -9.4384          | 0.9333             | 4.6637          | -358.0689      | -283.0118    | -38.2421        | -39.6906      |
| 0.2850        | 1.25  | 3000 | 0.1728          | -5.7913        | -11.0242         | 0.9271             | 5.2329          | -373.9274      | -293.1786    | -39.8869        | -41.8088      |
| 0.3249        | 1.46  | 3500 | 0.1585          | -5.3924        | -11.0092         | 0.9313             | 5.6168          | -373.7777      | -289.1895    | -41.4103        | -43.3052      |
| 0.2288        | 1.67  | 4000 | 0.1544          | -5.7770        | -11.2642         | 0.9333             | 5.4872          | -376.3274      | -293.0356    | -39.3995        | -41.1619      |
| 0.1367        | 1.88  | 4500 | 0.1463          | -5.6038        | -11.2632         | 0.9312             | 5.6594          | -376.3172      | -291.3033    | -38.0074        | -39.7695      |
| 0.1596        | 2.08  | 5000 | 0.1489          | -6.3796        | -12.4737         | 0.9312             | 6.0941          | -388.4222      | -299.0610    | -39.8571        | -41.5072      |
| 0.0350        | 2.29  | 5500 | 0.1413          | -6.2472        | -12.4489         | 0.9375             | 6.2017          | -388.1746      | -297.7371    | -40.1165        | -41.9028      |
| 0.1528        | 2.50  | 6000 | 0.1452          | -6.7167        | -13.0974         | 0.9354             | 6.3807          | -394.6590      | -302.4318    | -39.9707        | -41.8089      |
| 0.1269        | 2.71  | 6500 | 0.1427          | -6.6508        | -13.0564         | 0.9458             | 6.4056          | -394.2490      | -301.7733    | -40.7866        | -42.6209      |
| 0.2239        | 2.92  | 7000 | 0.1422          | -6.6308        | -12.9931         | 0.9396             | 6.3623          | -393.6160      | -301.5730    | -40.9101        | -42.7380      |
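
The final step is not necessarily the best checkpoint on every metric. As a small sketch, the snippet below copies the step, validation loss, and reward accuracy columns from the table and finds the best row for each; the variable names are illustrative only, and whether intermediate checkpoints were saved is not stated in this card.

```python
# (step, validation loss, rewards/accuracies) copied from the table above.
rows = [
    (500, 0.3575, 0.8458),
    (1000, 0.2845, 0.8771),
    (1500, 0.2520, 0.8875),
    (2000, 0.1916, 0.9313),
    (2500, 0.1671, 0.9333),
    (3000, 0.1728, 0.9271),
    (3500, 0.1585, 0.9313),
    (4000, 0.1544, 0.9333),
    (4500, 0.1463, 0.9312),
    (5000, 0.1489, 0.9312),
    (5500, 0.1413, 0.9375),
    (6000, 0.1452, 0.9354),
    (6500, 0.1427, 0.9458),
    (7000, 0.1422, 0.9396),
]

# Lowest validation loss and highest reward accuracy across logged evaluations.
best_by_loss = min(rows, key=lambda r: r[1])
best_by_accuracy = max(rows, key=lambda r: r[2])

print("lowest validation loss:", best_by_loss)
print("highest reward accuracy:", best_by_accuracy)
```

On these numbers, validation loss is lowest at step 5500 and reward accuracy is highest at step 6500, while the reported model corresponds to step 7000.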


### Framework versions

- Transformers 4.35.2
- PyTorch 2.1.1
- Datasets 2.15.0
- Tokenizers 0.15.0