---
pipeline_tag: text-generation
library_name: transformers
license: apache-2.0
tags:
- mixtral
- moe
- reasoning
---
# Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks
This repository contains model checkpoints from the paper [Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks](https://huggingface.co/papers/2508.18672).
For more details, including code and evaluation procedures, please refer to the official GitHub repository: [https://github.com/rioyokotalab/optimal-sparsity](https://github.com/rioyokotalab/optimal-sparsity)
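## How to use
The checkpoints are intended to load with the standard `transformers` causal-LM API. Below is a minimal sketch; the repository ID, dtype, and generation settings are placeholders to adjust for the specific checkpoint and your hardware.
```python
# Minimal usage sketch with the Hugging Face transformers API.
# The repository ID below is a placeholder; replace it with this checkpoint's ID.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "rioyokotalab/optimal-sparsity-checkpoint"  # placeholder repository ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumes bf16 weights; adjust to your hardware
    device_map="auto",
)

prompt = "Question: What is 17 * 24? Answer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```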
## How to cite
If you find our work helpful, please cite the paper:
```bibtex
@article{nakamura2025optimalsparsitymixtureofexpertslanguage,
  title={Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks},
  author={Taishi Nakamura and Satoki Ishikawa and Masaki Kawamura and Takumi Okamoto and Daisuke Nohara and Jun Suzuki and Rio Yokota},
  year={2025},
  eprint={2508.18672},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2508.18672},
}
```