Generation parameters
#8 · opened by caciolai
Hi team!
Thank you so much for releasing this, great work!
Reading through the tech report at https://github.com/Cohere-Labs/tiny-aya-tech-report/blob/main/tiny_aya_tech_report.pdf, I could not find the list of generation / sampling parameters used for the evaluations behind the reported benchmark numbers.
I see in the model card you do:

```python
gen_tokens = model.generate(
    input_ids,
    max_new_tokens=4096,
    do_sample=True,
    temperature=0.1,
    top_p=0.95,
)
```
Do you suggest keeping these? Have you noticed any improvement or degradation in translation performance when using greedy decoding instead? What about other tasks?
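For context on why I'm asking: here is a toy sketch (my own helper names, not the model's actual code) illustrating that `temperature=0.1` with `top_p=0.95` concentrates almost all probability mass on the argmax token, so sampling with these parameters should behave very close to greedy decoding (`do_sample=False`):

```python
import math

def softmax(logits, temperature):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]

def top_p_filter(probs, top_p):
    """Keep the smallest set of tokens whose cumulative probability reaches top_p,
    renormalized; this is the nucleus-sampling candidate set."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}

# Toy next-token logits (hypothetical values, just for illustration).
logits = [2.0, 1.5, 0.5, -1.0]

greedy_token = max(range(len(logits)), key=lambda i: logits[i])  # do_sample=False

probs = softmax(logits, temperature=0.1)      # temperature=0.1 sharpens strongly
candidates = top_p_filter(probs, top_p=0.95)  # nucleus often collapses to one token
# With this distribution the argmax token holds >99% of the mass, so the
# top_p=0.95 nucleus contains only the greedy token.
```

So in practice these settings may already be near-greedy, which is why I'm curious whether you measured an actual difference.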