- 2.47k
  Whisper
  📉 Transcribe audio or YouTube videos into text
- Robust Speech Recognition via Large-Scale Weak Supervision
  Paper • 2212.04356 • Published • 37
- openai/whisper-large-v2
  Automatic Speech Recognition • 2B • Updated • 140k • 1.76k
- openai/whisper-large
  Automatic Speech Recognition • 2B • Updated • 44k • 528
Collections including paper arxiv:2212.04356
- DeepSeek-R1 Thoughtology: Let's <think> about LLM Reasoning
  Paper • 2504.07128 • Published • 87
- Byte Latent Transformer: Patches Scale Better Than Tokens
  Paper • 2412.09871 • Published • 109
- BitNet b1.58 2B4T Technical Report
  Paper • 2504.12285 • Published • 74
- FAST: Efficient Action Tokenization for Vision-Language-Action Models
  Paper • 2501.09747 • Published • 25