smohammadi committed on
Commit 8dad0d0 · verified · 1 Parent(s): ae8eb87

Update README.md

Files changed (1)
  1. README.md +26 -2
README.md CHANGED
@@ -24,8 +24,32 @@ It achieves the following results on the evaluation set:
 
 ## Model description
 
-More information needed
-
+Trained using:
+```
+python trl/examples/scripts/rm/rm.py \
+--dataset_name trl-internal-testing/sentiment-trl-style \
+--dataset_train_split train \
+--dataset_eval_split test \
+--model_name_or_path TinyLlama/TinyLlama_v1.1 \
+--chat_template simple_concat \
+--learning_rate 3e-6 \
+--per_device_train_batch_size 32 \
+--per_device_eval_batch_size 32 \
+--gradient_accumulation_steps 1 \
+--logging_steps 1 \
+--eval_strategy steps \
+--max_token_length 1024 \
+--max_prompt_token_lenth 1024 \
+--remove_unused_columns False \
+--num_train_epochs 1 \
+--eval_steps 100 \
+--output_dir models/ppo_torchtune/tinyllama/tinyllama_rm_sentiment_1b \
+--push_to_hub
+```
+
+on the "dataset-processor" branch of trl:
+
+git clone -b "dataset-processor" https://github.com/huggingface/trl
 ## Intended uses & limitations
 
 More information needed