Extending Tower to the speech modality. Spire models are multimodal LLMs capable of transcribing and translating English into 9 different languages.
AI & ML interests
Multimodality, Large Language Models, Speech Processing
Organization Card
UTTER - Unified Transcription and Translation for Extended Reality - is a collaborative Research and Innovation Project funded under Horizon Europe that aims to leverage large language models to build the next generation of multimodal extended reality (XR) technologies for translation, summarization and minuting.
The consortium for UTTER consists of:
- Universiteit van Amsterdam,
- Instituto de Telecomunicações,
- University of Edinburgh,
- NAVER LABS Europe,
- Unbabel.
Models (36)
- utter-project/TowerVideo-9B: Video-Text-to-Text, 10B parameters, 4 downloads, 2 likes
- utter-project/TowerVideo-2B: Video-Text-to-Text, 3B parameters, 24 downloads
- utter-project/TowerVision-9B: Image-Text-to-Text, 10B parameters, 127 downloads, 1 like
- utter-project/TowerVision-2B: Image-Text-to-Text, 3B parameters, 73 downloads, 1 like
- utter-project/eurollm-22b-mcore-phase1
- utter-project/eurollm-22b-mcore-phase3-long-context
- utter-project/eurollm-22b-mcore-phase2
- utter-project/eurollm-9b-mcore-phase3
- utter-project/eurollm-9b-mcore-phase2
- utter-project/eurollm-9b-mcore-phase3-long-context