MiniLLM/MiniPLM-Qwen-200M

nielsr (HF Staff) committed on Oct 27, 2024

Commit bb12b6b (verified)

1 Parent(s): 543ee2a

Add link to paper (#1)

Files changed (1)

README.md

@@ -37,4 +37,14 @@ MiniPLM models achieves better performance given the same computation and scales
 ## Citation
-TODO
+```bibtex
+@misc{gu2024miniplmknowledgedistillationpretraining,
+      title={MiniPLM: Knowledge Distillation for Pre-Training Language Models},
+      author={Yuxian Gu and Hao Zhou and Fandong Meng and Jie Zhou and Minlie Huang},
+      year={2024},
+      eprint={2410.17215},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL},
+      url={https://arxiv.org/abs/2410.17215},
+}
+```