Post
3080
Introducing CosmoChat, a multiturn chat dataset based on Cosmopedia that I'm working on in the open on the Hub.
🎯 Goals:
💬 Create multi-turn chats seeded from Cosmopedia
🎓 Customize questions for different audience levels
🔍 Evaluate the model's ability to elaborate and clarify
🤓 (I want to learn more about creating valuable synthetic datasets, and I learn best by doing stuff rather than reading stuff).
Cosmochat is created using the excellent distilabel library.
🔗 Explore the current version of the dataset: davanstrien/cosmochat
📝 Read more: https://huggingface.co/blog/davanstrien/cosmochat
🎯 Goals:
💬 Create multi-turn chats seeded from Cosmopedia
🎓 Customize questions for different audience levels
🔍 Evaluate the model's ability to elaborate and clarify
🤓 (I want to learn more about creating valuable synthetic datasets, and I learn best by doing stuff rather than reading stuff).
Cosmochat is created using the excellent distilabel library.
🔗 Explore the current version of the dataset: davanstrien/cosmochat
📝 Read more: https://huggingface.co/blog/davanstrien/cosmochat