Generation parameters

#8
by caciolai - opened

Hi team!

Thank you so much for releasing this, great work!

Reading through the tech report at https://github.com/Cohere-Labs/tiny-aya-tech-report/blob/main/tiny_aya_tech_report.pdf, I could not find the generation / sampling parameters used in the evaluations behind the reported benchmark numbers.

I see in the model card you do

gen_tokens = model.generate(
    input_ids,
    max_new_tokens=4096,
    do_sample=True,
    temperature=0.1,
    top_p=0.95
)

Do you suggest keeping these? Have you noticed any improvement or degradation in translation performance when using greedy decoding instead? What about other tasks?

Cohere Labs org

Hey @caciolai

For generative tasks, we used temperature = 0. For tasks where we used the lm-evaluation-harness, we used the default parameters for all models.
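To see why temperature = 0 amounts to greedy decoding, here is a minimal, library-free sketch of temperature scaling over a toy logit vector (the logit values are illustrative, not from the model): dividing logits by a small temperature sharpens the softmax toward the argmax, and in the limit the distribution collapses onto the greedy choice.

```python
import math

def softmax_with_temperature(logits, temperature):
    # Scale logits by 1/temperature before normalizing; a lower
    # temperature sharpens the distribution toward the argmax token.
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]

# Illustrative logits for a 3-token vocabulary.
logits = [2.0, 1.0, 0.5]

p_mild = softmax_with_temperature(logits, 1.0)   # fairly spread out
p_sharp = softmax_with_temperature(logits, 0.1)  # nearly one-hot

# Greedy decoding simply takes the argmax of the raw logits,
# which is what sampling converges to as temperature -> 0.
greedy_choice = max(range(len(logits)), key=lambda i: logits[i])
```

This is why temperature=0.1 (as in the model card snippet) behaves close to greedy in practice; in HF `transformers`, exact greedy decoding is requested with `do_sample=False`.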
