Generation parameters
#8 · opened by caciolai
Hi team!
Thank you so much for releasing this, great work!
Reading through the tech report at https://github.com/Cohere-Labs/tiny-aya-tech-report/blob/main/tiny_aya_tech_report.pdf, I could not find the list of generation / sampling parameters used for the evaluations behind the reported benchmark numbers.
I see in the model card you do:

```python
gen_tokens = model.generate(
    input_ids,
    max_new_tokens=4096,
    do_sample=True,
    temperature=0.1,
    top_p=0.95,
)
```
Do you suggest keeping these? Have you noticed any improvement or degradation in translation performance when using greedy decoding instead? What about other tasks?
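For context on why I'm asking: here is a toy sketch (my own helper names, not the model's actual code) illustrating that `temperature=0.1` with `top_p=0.95` concentrates almost all probability mass on the argmax token, so sampling with these parameters should behave very close to greedy decoding (`do_sample=False`):

```python
import math

def softmax(logits, temperature):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]

def top_p_filter(probs, top_p):
    """Keep the smallest set of tokens whose cumulative probability reaches top_p,
    renormalized; this is the nucleus-sampling candidate set."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}

# Toy next-token logits (hypothetical values, just for illustration).
logits = [2.0, 1.5, 0.5, -1.0]

greedy_token = max(range(len(logits)), key=lambda i: logits[i])  # do_sample=False

probs = softmax(logits, temperature=0.1)      # temperature=0.1 sharpens strongly
candidates = top_p_filter(probs, top_p=0.95)  # nucleus often collapses to one token
# With this distribution the argmax token holds >99% of the mass, so the
# top_p=0.95 nucleus contains only the greedy token.
```

So in practice these settings may already be near-greedy, which is why I'm curious whether you measured an actual difference.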