
feat: Add CPU support

#18

Description

This PR adds support to modeling_nemotron.py for running inference on CPU. It is a cleaned-up version of the edits I made while working on support in llama.cpp.

Changes

  • Handle failed imports of rmsnorm_fn (see the import-guard sketch below)
  • Add an unoptimized implementation of MambaRMSNormGated.forward as a CPU fallback (sketched below)
  • Fix NemotronHMamba2Mixer.torch_forward to use repeat_interleave for B and C (see discussion here, and the shape example below)
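
For context, a minimal sketch of the kind of import guard involved, assuming the fused kernel comes from mamba_ssm as in the upstream modeling file (the exact import path may differ):

```python
try:
    # CUDA-only fused Triton kernel; this import fails on CPU-only installs.
    from mamba_ssm.ops.triton.layernorm_gated import rmsnorm_fn
except ImportError:
    # Signal that the pure-PyTorch fallback must be used instead.
    rmsnorm_fn = None
```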
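
A sketch of what an unoptimized pure-PyTorch MambaRMSNormGated.forward can look like; this follows the common Mamba-2 formulation (SiLU-gate, then RMS-normalize) and omits the grouped-norm variant for brevity, so it is illustrative rather than the exact code in this PR:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MambaRMSNormGated(nn.Module):
    def __init__(self, hidden_size: int, eps: float = 1e-5):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(hidden_size))
        self.variance_epsilon = eps

    def forward(self, hidden_states, gate=None):
        # Compute in float32 for numerical stability, then cast back.
        input_dtype = hidden_states.dtype
        hidden_states = hidden_states.to(torch.float32)
        if gate is not None:
            # Mamba-2 style gating: modulate the activations with
            # SiLU(gate) before normalizing.
            hidden_states = hidden_states * F.silu(gate.to(torch.float32))
        variance = hidden_states.pow(2).mean(-1, keepdim=True)
        hidden_states = hidden_states * torch.rsqrt(variance + self.variance_epsilon)
        return self.weight * hidden_states.to(input_dtype)
```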
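
To illustrate the repeat_interleave fix: B and C are projected once per group and must be expanded to one copy per head. The shapes below are hypothetical; the point is that repeat_interleave keeps all heads of a group adjacent, whereas a plain .repeat() would interleave the groups and mismatch the fused-kernel layout:

```python
import torch

# Hypothetical shapes for illustration: B/C come out of the projection as
# (batch, seq_len, n_groups, ssm_state_size).
batch, seq_len, n_groups, num_heads, state = 2, 8, 2, 8, 16
B = torch.randn(batch, seq_len, n_groups, state)
C = torch.randn(batch, seq_len, n_groups, state)

# repeat_interleave yields head order [g0, g0, ..., g1, g1, ...], so every
# head sees the B/C of its own group.
B = B.repeat_interleave(num_heads // n_groups, dim=2)
C = C.repeat_interleave(num_heads // n_groups, dim=2)
assert B.shape == (batch, seq_len, num_heads, state)
```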
