Commit 13adbbd (verified) by Kernicterus · Parent: 0ee468a

Update README.md

Files changed (1): README.md (+89 −78)
---
license: mit
base_model:
- openai/whisper-large-v3-turbo
tags:
- whisper
- faster
- int8
- ct2
- turbo
---
# Whisper Large v3 Turbo - CTranslate2

This is a CTranslate2-optimized version of OpenAI's Whisper Large v3 Turbo model for automatic speech recognition (ASR).

## Model Description

This model is a converted version of the original Whisper Large v3 Turbo model, optimized for inference using CTranslate2. CTranslate2 is a C++ and Python library for efficient inference with Transformer models, providing:

- **Faster inference**: Optimized implementations of attention mechanisms and feed-forward networks
- **Lower memory usage**: Quantization support and memory-efficient attention
- **Better throughput**: Batching and parallel processing optimizations
- **Cross-platform compatibility**: Support for CPU and GPU inference

## Conversion

This model was converted using the following command:

```bash
ct2-transformers-converter --model openai/whisper-large-v3-turbo --output_dir whisper-large-v3-turbo-ct2-int8 --quantization int8 --copy_files tokenizer.json preprocessor_config.json
```

The conversion applies **int8 quantization**, which provides several benefits:

- **Reduced disk space**: Significantly smaller model size compared to the original float32 version
- **Lower memory consumption**: Requires less RAM during inference
- **Maintained accuracy**: Minimal quality loss while providing substantial efficiency gains
- **Faster loading**: Reduced time to load the model from disk
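To illustrate the idea behind int8 quantization, here is a simplified sketch: each float32 weight tensor is mapped to 8-bit integers plus a scale factor. This is an illustration only, not CTranslate2's actual implementation:

```python
# Simplified per-tensor symmetric int8 quantization. This only illustrates
# the concept; CTranslate2's real implementation is more sophisticated.

def quantize_int8(weights):
    """Map float weights to int8 values in [-127, 127] plus a scale factor."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float weights from int8 values and the scale."""
    return [v * scale for v in q]

weights = [0.12, -0.5, 0.33, 1.27, -1.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

# Storage drops from 4 bytes to 1 byte per weight; the reconstruction
# error stays within half a quantization step (scale / 2).
assert all(abs(w - r) <= scale / 2 + 1e-9 for w, r in zip(weights, restored))
```

This is why int8 roughly quarters the model's disk and memory footprint while keeping transcription quality close to the float32 original.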

## Original Model

This model is based on OpenAI's Whisper Large v3 Turbo, which is a state-of-the-art automatic speech recognition model that:

- Supports 99 languages
- Provides high-quality transcription and translation
- Runs significantly faster than previous Whisper versions, with only a minor accuracy trade-off
- Handles various audio conditions and accents

## Usage

To use this model, install CTranslate2 and the faster-whisper integration:

```bash
pip install ctranslate2 faster-whisper
```

```python
from faster_whisper import WhisperModel

# Path to the converted CTranslate2 model directory
model_path = "path/to/whisper-large-v3-turbo-ct2"
model = WhisperModel(model_path, device="cpu", compute_type="int8")

# Transcribe an audio file; beam_size controls the beam search width
segments, info = model.transcribe("audio.wav", beam_size=5)

for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))
```
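Each segment carries start and end times in seconds. As an example of post-processing, a small helper (hypothetical, not part of faster-whisper) can turn segments into SRT subtitle entries:

```python
# Hypothetical post-processing helpers for turning (start, end, text)
# segments into SRT subtitle blocks; faster-whisper does not ship these.

def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def to_srt(segments):
    """Render an iterable of (start, end, text) tuples as SRT text."""
    blocks = []
    for i, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    return "\n".join(blocks)

print(to_srt([(0.0, 2.5, "Hello world."), (2.5, 5.0, "Second line.")]))
```

With the transcription example above, you would call it as `to_srt((s.start, s.end, s.text) for s in segments)`.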

## Performance

This CTranslate2 version provides significant performance improvements over the original PyTorch implementation:

- Up to 4x faster inference, depending on hardware, batch size, and quantization settings
- Reduced memory consumption
- Support for quantization
- Optimized for both CPU and GPU inference

## Supported Languages

Same as the original Whisper Large v3 Turbo (99 languages), including:
Afrikaans, Arabic, Armenian, Azerbaijani, Belarusian, Bosnian, Bulgarian, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, Galician, German, Greek, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Italian, Japanese, Kannada, Kazakh, Korean, Latvian, Lithuanian, Macedonian, Malay, Marathi, Maori, Nepali, Norwegian, Persian, Polish, Portuguese, Romanian, Russian, Serbian, Slovak, Slovenian, Spanish, Swahili, Swedish, Tagalog, Tamil, Thai, Turkish, Ukrainian, Urdu, Vietnamese, Welsh.

## Model Card

- **Developed by**: OpenAI (original model), converted to CTranslate2 format
- **Model type**: Automatic Speech Recognition
- **Language(s)**: Multilingual (99 languages)
- **License**: MIT
- **Model size**: Large (809M parameters)