---
pipeline_tag: text-generation
library_name: transformers
license: apache-2.0
tags:
- mixtral
- moe
- reasoning
---
# Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks

This repository contains model checkpoints from the paper [Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks](https://huggingface.co/papers/2508.18672).

For more details, including code and evaluation procedures, please refer to the official GitHub repository: [https://github.com/rioyokotalab/optimal-sparsity](https://github.com/rioyokotalab/optimal-sparsity)
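
## How to use

The checkpoints are standard `transformers` causal language models (Mixtral-style MoE, per the tags above), so they should load with the usual `AutoModelForCausalLM` API. Below is a minimal loading sketch; the model ID is a placeholder, since the exact repository name and recommended dtype are not stated in this card.

```python
# Minimal sketch: load a checkpoint from this collection with transformers.
# Assumptions: the repo ID below is a placeholder, and bf16 weights fit on your device.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "<this-checkpoint's-repo-id>"  # placeholder: replace with the actual Hugging Face repo ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: adjust to the checkpoint's actual precision
    device_map="auto",           # requires `accelerate`; remove to load on a single device
)

prompt = "Question: A train travels 60 km in 45 minutes. What is its average speed in km/h?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

See the GitHub repository linked above for the exact evaluation prompts and procedures used in the paper.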
## How to cite

If you find our work helpful, please feel free to cite the paper:
```bibtex
@article{nakamura2025optimalsparsitymixtureofexpertslanguage,
  title={Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks},
  author={Taishi Nakamura and Satoki Ishikawa and Masaki Kawamura and Takumi Okamoto and Daisuke Nohara and Jun Suzuki and Rio Yokota},
  year={2025},
  eprint={2508.18672},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2508.18672},
}
```