---
base_model: lvwerra/gpt2-imdb
tags:
- generated_from_trainer
model-index:
- name: gpt-imdb-sigmoid-beta_0.1
  results: []
---

# gpt-imdb-sigmoid-beta_0.1

This model is a fine-tuned version of [lvwerra/gpt2-imdb](https://huggingface.co/lvwerra/gpt2-imdb) on an unknown dataset.
It achieves the following results on the evaluation set:
- Step: 7000
- Loss: 0.1445
- Rewards/chosen: -5.6156
- Rewards/rejected: -11.9139
- Rewards/accuracies: 0.9354
- Rewards/margins: 6.2982
- Logps/rejected: -382.8238
- Logps/chosen: -291.4216
- Logits/rejected: -44.3728
- Logits/chosen: -46.3321

## Model description

The base model, [lvwerra/gpt2-imdb](https://huggingface.co/lvwerra/gpt2-imdb), is GPT-2 fine-tuned on IMDB movie reviews; this checkpoint trains it further with a preference-optimization objective. The card does not state the method explicitly, but the model name (`sigmoid` loss, `beta_0.1`) together with the rewards/chosen, rewards/rejected, and logps metrics logged during training is consistent with DPO-style pairwise preference training (e.g., `trl`'s `DPOTrainer`).

## Intended uses & limitations

More information needed
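
The checkpoint is a standard causal language model, so it should load through the usual `transformers` API. A minimal generation sketch (the repo id is assumed from the card header; adjust it to the actual upload path):

```python
# Minimal inference sketch; the repo id is assumed from the card name.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gpt-imdb-sigmoid-beta_0.1"  # hypothetical path, adjust as needed

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "This movie was"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=50,
    do_sample=True,
    top_k=50,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no dedicated pad token
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```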

## Training and evaluation data

More information needed
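
The dataset is not stated. However, the rewards/chosen and rewards/rejected columns in the training log below imply pairwise preference data: a prompt paired with a preferred and a dispreferred continuation. If that was the setup, each record would look roughly like this (illustrative values only, not from the real data):

```python
# Hypothetical record in the pairwise preference format implied by the
# rewards/chosen vs. rewards/rejected metrics; all values are invented.
example = {
    "prompt": "The acting in this film was",
    "chosen": " superb, carried by a remarkable lead performance.",
    "rejected": " something, I honestly stopped paying attention.",
}
```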

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 24
- eval_batch_size: 24
- seed: 42
- optimizer: Adam with betas=(0.9,0.99) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 150
- num_epochs: 3
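
These settings, together with the `sigmoid`/`beta_0.1` naming, can be mirrored in a `trl` `DPOTrainer` setup. A hypothetical reconstruction under those assumptions (the real dataset, script, and `trl` version are unknown; the trl 0.7-era signature with direct `beta`/`loss_type` arguments is assumed, matching the Transformers 4.35 stack listed below):

```python
# Hypothetical reconstruction of the run from the hyperparameters above;
# the actual dataset and training script are not published with the card.
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

model = AutoModelForCausalLM.from_pretrained("lvwerra/gpt2-imdb")
ref_model = AutoModelForCausalLM.from_pretrained("lvwerra/gpt2-imdb")  # frozen reference
tokenizer = AutoTokenizer.from_pretrained("lvwerra/gpt2-imdb")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token

# Placeholder pairwise data in the format sketched earlier; the real
# preference dataset is unknown.
pairs = Dataset.from_dict({
    "prompt": ["The acting in this film was"],
    "chosen": [" superb, carried by a remarkable lead performance."],
    "rejected": [" something, I honestly stopped paying attention."],
})

args = TrainingArguments(
    output_dir="gpt-imdb-sigmoid-beta_0.1",
    learning_rate=1e-5,
    per_device_train_batch_size=24,
    per_device_eval_batch_size=24,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.99,           # note: 0.99, not the Adam default of 0.999
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=150,
    num_train_epochs=3,
    evaluation_strategy="steps",
    eval_steps=500,            # matches the 500-step eval cadence in the table below
)

trainer = DPOTrainer(
    model=model,
    ref_model=ref_model,
    args=args,
    beta=0.1,                  # the "beta_0.1" suffix in the model name
    loss_type="sigmoid",       # the "sigmoid" tag in the model name
    train_dataset=pairs,
    eval_dataset=pairs,
    tokenizer=tokenizer,
)
trainer.train()
```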

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 0.2741        | 0.21  | 500  | 0.3546          | -0.7644        | -2.6310          | 0.8604             | 1.8666          | -289.9951      | -242.9089    | -34.2705        | -35.4568      |
| 0.3403        | 0.42  | 1000 | 0.2963          | -1.6755        | -4.3008          | 0.8687             | 2.6253          | -306.6930      | -252.0203    | -40.9205        | -42.3105      |
| 0.1939        | 0.63  | 1500 | 0.2596          | -3.1297        | -6.7295          | 0.8771             | 3.5998          | -330.9802      | -266.5624    | -37.6829        | -39.1821      |
| 0.2094        | 0.83  | 2000 | 0.1941          | -2.9414        | -6.9143          | 0.9292             | 3.9728          | -332.8280      | -264.6796    | -38.0792        | -39.7464      |
| 0.1481        | 1.04  | 2500 | 0.1744          | -3.7473        | -8.3469          | 0.9333             | 4.5996          | -347.1542      | -272.7383    | -40.9252        | -42.5164      |
| 0.2862        | 1.25  | 3000 | 0.1750          | -4.5825        | -9.7147          | 0.9292             | 5.1322          | -360.8324      | -281.0905    | -41.9790        | -44.0717      |
| 0.304         | 1.46  | 3500 | 0.1652          | -4.3291        | -9.8200          | 0.9333             | 5.4909          | -361.8853      | -278.5559    | -44.1786        | -46.1418      |
| 0.2167        | 1.67  | 4000 | 0.1580          | -4.6175        | -10.0305         | 0.9354             | 5.4130          | -363.9903      | -281.4398    | -43.6324        | -45.4854      |
| 0.1396        | 1.88  | 4500 | 0.1518          | -4.5940        | -10.1635         | 0.9396             | 5.5696          | -365.3205      | -281.2049    | -41.9461        | -43.8060      |
| 0.1575        | 2.08  | 5000 | 0.1525          | -5.3119        | -11.3685         | 0.9292             | 6.0566          | -377.3703      | -288.3840    | -43.4045        | -45.2127      |
| 0.0338        | 2.29  | 5500 | 0.1472          | -5.2545        | -11.3863         | 0.9333             | 6.1319          | -377.5485      | -287.8099    | -43.2283        | -45.1626      |
| 0.1631        | 2.5   | 6000 | 0.1496          | -5.6862        | -11.9852         | 0.9333             | 6.2991          | -383.5375      | -292.1269    | -43.6007        | -45.5693      |
| 0.1177        | 2.71  | 6500 | 0.1473          | -5.6329        | -11.9588         | 0.9417             | 6.3259          | -383.2729      | -291.5939    | -44.3503        | -46.3168      |
| 0.2342        | 2.92  | 7000 | **0.1445**      | -5.6156        | -11.9139         | 0.9354             | 6.2982          | -382.8238      | -291.4216    | -44.3728        | -46.3321      |


### Framework versions

- Transformers 4.35.2
- Pytorch 2.1.1
- Datasets 2.15.0
- Tokenizers 0.15.0