---
license: mit
datasets:
- openslr/librispeech_asr
language:
- en
pipeline_tag: automatic-speech-recognition
---
|
|
|
# Splitformer

---
|
|
|
<div align="center" style="line-height: 1;">
  <a href="https://github.com/augustgw/early-exit-transformer" target="_blank" style="margin: 2px;">
    <img alt="GitHub" src="https://img.shields.io/badge/GitHub-Splitformer-181717?logo=github&logoColor=white" style="display: inline-block; vertical-align: middle;"/>
  </a>
  <a href="https://arxiv.org/abs/2501.12948" target="_blank" style="margin: 2px;">
    <img alt="arXiv" src="https://img.shields.io/badge/arXiv-2501.12948-b31b1b?logo=arxiv&logoColor=white" style="display: inline-block; vertical-align: middle;"/>
  </a>
  <a href="LICENSE" style="margin: 2px;">
    <img alt="License" src="https://img.shields.io/badge/License-MIT-f5de53?&color=white" style="display: inline-block; vertical-align: middle;"/>
  </a>
</div>
|
|
|
## Overview
|
|
|
**Splitformer** is a 36.7M-parameter Conformer-based ASR model trained from scratch on 1,000 hours of LibriSpeech with an early-exit objective.
|
|
|
The architecture introduces parallel downsampling layers before the first and last exits, improving recognition performance with minimal additional parameters while retaining inference speed.
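The early-exit idea behind the training objective can be sketched in a few lines: intermediate classifier heads are attached to the encoder stack, and at inference time decoding stops at the first exit whose prediction is confident enough. The sketch below is a minimal, framework-free illustration of that control flow; all function and parameter names (`early_exit_decode`, `threshold`, the toy blocks and heads) are hypothetical, not the repository's actual API.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a plain list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def early_exit_decode(blocks, exit_heads, features, threshold=0.9):
    """Run encoder blocks sequentially; return (exit_index, probs) at the
    first exit head whose top probability reaches `threshold`, otherwise
    fall through to the final exit."""
    hidden = features
    for i, (block, head) in enumerate(zip(blocks, exit_heads)):
        hidden = block(hidden)          # one encoder block
        probs = softmax(head(hidden))   # intermediate classifier head
        if max(probs) >= threshold:     # confident enough: exit early
            return i, probs
    return len(blocks) - 1, probs       # no early exit; use the last head
```

A toy run with three identical blocks shows the mechanism: confidence grows layer by layer, and decoding stops as soon as the threshold is met rather than always paying for the full stack.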
|
|
|
Our training and inference code is available in the [GitHub](https://github.com/augustgw/early-exit-transformer) repository.