Models

18

Active filters: trl-internal-testing/descriptiveness-sentiment-trl-style

bikalnetomi/RLHF-PPO-PPOModel-LLama3-1B-v1.0

Text Generation • 1B • Updated Dec 2, 2024 • 6

bikalnetomi/RLHF-PPO-PPOModel-LLama3-1B-v1.1

Text Generation • 1B • Updated Dec 2, 2024 • 11

bikalnetomi/RLHF-PPO-PPOModel-LLama3-1B-v1.3

Text Generation • 1B • Updated Dec 2, 2024 • 7

bikalnetomi/RLHF-PPO-PPOModel-LLama3-1B-v1.4

Text Generation • 1B • Updated Dec 2, 2024 • 7

mradermacher/FedPPO-Collaborative-Pythia-70M-a0-GGUF

70.4M • Updated Dec 13, 2024 • 182

mradermacher/FedPPO-Confused-Pythia-70M-a1-GGUF

70.4M • Updated Dec 13, 2024 • 106

mradermacher/FedPPO-Collaborative-Pythia-70M-a1-GGUF

70.4M • Updated Dec 13, 2024 • 206

mradermacher/FedPPO-Isolated-Pythia-70M-a0-GGUF

70.4M • Updated Dec 13, 2024 • 195

mradermacher/FedPPO-Isolated-Pythia-70M-a1-GGUF

70.4M • Updated Dec 13, 2024 • 147

mradermacher/FedPPO-Pythia-70M-a1-GGUF

70.4M • Updated Dec 13, 2024 • 188

mradermacher/FedPPO-Confused-Pythia-70M-a0-GGUF

70.4M • Updated Dec 13, 2024 • 240

mradermacher/FedPPO-Pythia-70M-a0-GGUF

70.4M • Updated Dec 13, 2024 • 64

nologin/ppo

Text Generation • 0.2B • Updated Dec 13, 2024 • 9

nileshmalpeddi/ppo

Text Generation • 0.3B • Updated Mar 15 • 16

AMindToThink/ppo

Text Generation • 0.2B • Updated Apr 15 • 3

AMindToThink/ppo_push_main_13

Text Generation • 0.2B • Updated Apr 16 • 3

AMindToThink/ppo_with_value14

AMindToThink/ppo_with_value15