Update README.md
README.md
CHANGED
@@ -24,8 +24,32 @@ It achieves the following results on the evaluation set:
 
 ## Model description
 
-
-
+Trained using:
+```
+python trl/examples/scripts/rm/rm.py \
+--dataset_name trl-internal-testing/sentiment-trl-style \
+--dataset_train_split train \
+--dataset_eval_split test \
+--model_name_or_path TinyLlama/TinyLlama_v1.1 \
+--chat_template simple_concat \
+--learning_rate 3e-6 \
+--per_device_train_batch_size 32 \
+--per_device_eval_batch_size 32 \
+--gradient_accumulation_steps 1 \
+--logging_steps 1 \
+--eval_strategy steps \
+--max_token_length 1024 \
+--max_prompt_token_lenth 1024 \
+--remove_unused_columns False \
+--num_train_epochs 1 \
+--eval_steps 100 \
+--output_dir models/ppo_torchtune/tinyllama/tinyllama_rm_sentiment_1b \
+--push_to_hub
+```
+
+on the "dataset-processor" branch of trl:
+
+git clone -b "dataset-processor" https://github.com/huggingface/trl
 ## Intended uses & limitations
 
 More information needed
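
Note on the batch and evaluation flags in the added command: the arithmetic below is a rough sketch of what they imply, not something stated in the commit. It assumes a single device; `num_devices` is a hypothetical variable, since the README does not say how many devices were used.

```shell
#!/bin/sh
# Values copied from the training command in the diff.
per_device_train_batch_size=32
gradient_accumulation_steps=1
eval_steps=100

# Assumption: one device. Adjust for a multi-GPU launch.
num_devices=1

# Examples consumed per optimizer step.
effective_batch=$((per_device_train_batch_size * gradient_accumulation_steps * num_devices))

# Training examples seen between two evaluation runs (--eval_strategy steps).
examples_between_evals=$((effective_batch * eval_steps))

echo "effective batch size: $effective_batch"
echo "examples between evals: $examples_between_evals"
```

Under that single-device assumption, each optimizer step covers 32 examples and an evaluation runs every 3200 training examples.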