Llama3.2-3B

Run Llama3.2-3B, optimized for Qualcomm NPUs, with nexaSDK.

Quickstart

  1. Install nexaSDK and create a free account at sdk.nexa.ai
  2. Activate your device with your access token:
    nexa config set license '<access_token>'
    
  3. Run the model on Qualcomm NPU in one line:
    nexa infer NexaAI/Llama3.2-3B-NPU-Turbo
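
The two commands above can also be driven from a script. The following is a minimal sketch, assuming the `nexa` CLI is already installed and on `PATH`; the helper names `build_commands` and `run_quickstart` are hypothetical and not part of nexaSDK. It only reuses the two commands shown in the steps above and adds no extra flags.

```python
import subprocess
from typing import List

# Model identifier taken from step 3 above.
MODEL_ID = "NexaAI/Llama3.2-3B-NPU-Turbo"


def build_commands(access_token: str, model_id: str = MODEL_ID) -> List[List[str]]:
    """Return the quickstart commands (steps 2 and 3) as argument lists.

    Hypothetical helper: it only assembles the commands from the steps above.
    """
    if not access_token:
        raise ValueError("access_token must not be empty")
    return [
        ["nexa", "config", "set", "license", access_token],
        ["nexa", "infer", model_id],
    ]


def run_quickstart(access_token: str, model_id: str = MODEL_ID) -> None:
    """Run both commands in order, stopping at the first failure."""
    for cmd in build_commands(access_token, model_id):
        # Argument lists avoid shell quoting problems with the token.
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(cmd, check=True)
```

Calling `run_quickstart("<access_token>")` with a real token would activate the device and then start inference. Reading the token from an environment variable rather than hard-coding it keeps it out of version control.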
    

Model Description

Llama3.2-3B is a 3-billion-parameter language model from Meta’s Llama 3.2 series.
It balances efficiency and capability, which makes it suitable for deployment on a wide range of devices while retaining strong performance on core language understanding and generation tasks.

Trained on diverse, high-quality datasets, Llama3.2-3B supports multiple languages and is optimized for scalability, fine-tuning, and real-world applications.

Features

  • Lightweight yet capable: delivers strong performance with a smaller memory footprint than larger models in the Llama family.
  • Conversational AI: context-aware dialogue for assistants and agents.
  • Content generation: text completion, summarization, code comments, and more.
  • Reasoning & analysis: step-by-step problem solving and explanation.
  • Multilingual: supports understanding and generation in multiple languages.
  • Customizable: can be fine-tuned for domain-specific or enterprise use.

Use Cases

  • Personal and enterprise chatbots
  • On-device AI applications
  • Document and report summarization
  • Education and tutoring tools
  • Specialized models in verticals (e.g., healthcare, finance, legal)

Inputs and Outputs

Input:

  • Text prompts or conversation history (tokenized input sequences).

Output:

  • Generated text: responses, explanations, or creative content.
  • Optionally: raw logits/probabilities for advanced downstream tasks.
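
For the optional logits output, a common downstream step is converting raw logits into probabilities with a softmax. The sketch below is illustrative only: this card does not say how nexaSDK exposes logits, so the input here is assumed to be a plain Python list of floats, and the function names `softmax` and `top_k` are hypothetical.

```python
import math
from typing import List, Tuple


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Convert raw logits into probabilities that sum to 1."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probs: List[float], k: int) -> List[Tuple[int, float]]:
    """Return the k most likely (token_index, probability) pairs."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Made-up logits for a four-token vocabulary, for illustration only.
example_probs = softmax([2.0, 1.0, 0.1, -1.0])
best_index, best_prob = top_k(example_probs, 1)[0]
```

A lower `temperature` sharpens the distribution toward the highest logit, while a higher one flattens it.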

License

References
