Update README.md
Browse files
README.md
CHANGED
@@ -10,6 +10,7 @@ Finetune of [Mistral-Nemo-Base-2407](https://huggingface.co/mistralai/Mistral-Ne
|
|
10 |
|
11 |
- 2 epochs of SFT on RP data, then about an hour of PPO on 8xH100 with [POLAR-7B RFT](https://github.com/RowitZou/POLAR_RFT)
|
12 |
- Kind of wonky, if you're dealing with longer messages you may need to decrease your temperature
|
|
|
13 |
- Reviews:
|
14 |
|
15 |
> its typically good at writing, v good for 12b, coherent in RP, follows context and starts conversations well
|
|
|
10 |
|
11 |
- 2 epochs of SFT on RP data, then about an hour of PPO on 8xH100 with [POLAR-7B RFT](https://github.com/RowitZou/POLAR_RFT)
|
12 |
- Kind of wonky, if you're dealing with longer messages you may need to decrease your temperature
|
13 |
+
- ChatML chat format
|
14 |
- Reviews:
|
15 |
|
16 |
> its typically good at writing, v good for 12b, coherent in RP, follows context and starts conversations well
|