Create README.md
README.md
ADDED
@@ -0,0 +1,48 @@
|
1 |
+
---
|
2 |
+
language:
|
3 |
+
- de
|
4 |
+
license: cc-by-nc-4.0
|
5 |
+
tags:
|
6 |
+
- speech
|
7 |
+
- text-to-speech
|
8 |
+
- F5-TTS
|
9 |
+
datasets:
|
10 |
+
- amphion/Emilia-Dataset
|
11 |
+
- fsicoli/common_voice_19_0
|
12 |
+
library_name: f5_tts
|
13 |
+
base_model:
|
14 |
+
- SWivid/F5-TTS
|
15 |
+
---
|
16 |
+
|
17 |
+
# German Voice Cloning TTS Model using F5-TTS Architecture
|
18 |
+
|
19 |
+
This is an attempted MLX conversion of the model, intended for use with f5-tts-mlx.
|
20 |
+
|
21 |
+
A German Text-to-Speech system capable of cloning voices from a few seconds of reference audio, built on the F5-TTS architecture.
|
22 |
+
|
23 |
+
## Model Details
|
24 |
+
- **Developed by:** Johanna Reiml and team at KI-Servicezentrum, Hasso-Plattner-Institut (HPI)
|
25 |
+
- **Base Model:** [SWivid/F5-TTS](https://huggingface.co/SWivid/F5-TTS)
|
26 |
+
- **Paper:** [F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching](https://arxiv.org/abs/2410.06885)
|
27 |
+
|
28 |
+
## Key Features & Capabilities
|
29 |
+
- Generates natural-sounding German speech from text
|
30 |
+
- Clones voices from minimal reference audio (a few seconds)
|
31 |
+
- Suitable for audiobooks, voice assistants, and accessibility applications
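
Because cloning relies on a short reference clip, it can help to check a clip's length before passing it to the model. The sketch below uses only Python's standard `wave` module. The function names and the 3 to 15 second bounds are illustrative assumptions, not limits stated by the model authors.

```python
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the duration of an uncompressed PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        frames = wav_file.getnframes()
        rate = wav_file.getframerate()
    return frames / float(rate)


def is_usable_reference(path: str, min_s: float = 3.0, max_s: float = 15.0) -> bool:
    """Check that a clip falls inside a hypothetical 'few seconds' window.

    The default bounds are assumptions chosen for illustration only.
    """
    duration = wav_duration_seconds(path)
    return min_s <= duration <= max_s
```

Calling `is_usable_reference("reference.wav")` returns `True` only when the clip length lies within the chosen bounds; adjust them to whatever works for your recordings.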
|
32 |
+
|
33 |
+
## Technical Specifications
|
34 |
+
Download checkpoints from the `F5TTS_Base` directory (Vocos vocoder) or the `F5TTS_Base_bigvgan` directory (BigVGAN vocoder).
|
35 |
+
- **Datasets:** Common Voice (Mozilla) and Emilia_DE
|
36 |
+
- **Process:** Fine-tuned from checkpoints of the [base F5-TTS model](https://huggingface.co/SWivid/F5-TTS)
|
37 |
+
- **Training Hardware:** 8x NVIDIA H100
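
The checkpoint directories named in this section can be fetched selectively instead of downloading the whole repository. This is a minimal sketch, assuming the `huggingface_hub` package is installed. `REPO_ID` is a placeholder because this README does not state the repository ID, and the helper names are hypothetical.

```python
from typing import List

# Placeholder (assumption): replace with the repository that hosts the checkpoints.
REPO_ID = "<namespace>/<repo-name>"

# Directory names taken from this README, keyed by vocoder.
CHECKPOINT_DIRS = {
    "vocos": "F5TTS_Base",
    "bigvgan": "F5TTS_Base_bigvgan",
}


def checkpoint_patterns(vocoder: str) -> List[str]:
    """Build glob patterns that match only one vocoder's checkpoint directory."""
    if vocoder not in CHECKPOINT_DIRS:
        raise ValueError(f"unknown vocoder: {vocoder!r}")
    return [f"{CHECKPOINT_DIRS[vocoder]}/*"]


def download_checkpoints(vocoder: str, repo_id: str = REPO_ID) -> str:
    """Download one checkpoint directory and return the local snapshot path."""
    # Imported lazily so the helpers above work without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id,
        allow_patterns=checkpoint_patterns(vocoder),
    )
```

For example, `download_checkpoints("vocos", repo_id="...")` would fetch only the files under `F5TTS_Base/`, provided that directory exists in the repository you point it at.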
|
38 |
+
|
39 |
+
## Contact
|
40 |
+
- AI Service Center: [email protected]
|
41 |
+
- Johanna Reiml: [email protected]
|
42 |
+
- Enes Suermeli: [email protected]
|
43 |
+
- Kajo Kratzenstein: [email protected]
|
44 |
+
- Carlos Menke: [email protected]
|
45 |
+
|
46 |
+
|
47 |
+
## Acknowledgements
|
48 |
+
The authors acknowledge the financial support by the German Federal Ministry of Education and Research (BMBF) through the project «KI-Servicezentrum Berlin Brandenburg» (01IS22092).
|