MiniCPM4-0.5B — RKLLM build for RK3588 boards

Author: @jamescallander
Source model: openbmb/MiniCPM4-0.5B · Hugging Face

Target: Rockchip RK3588 NPU via RKNN-LLM Runtime

This repository hosts a conversion of MiniCPM4-0.5B for use on Rockchip RK3588 single-board computers (Orange Pi 5 Plus, Radxa Rock 5B+, Banana Pi M7, etc.). Conversion was performed using the RKNN-LLM toolkit.

Conversion details

RKLLM-Toolkit version: v1.2.1
NPU driver: v0.9.8
Python: 3.12
Quantization: w8a8_g128
Output: single-file .rkllm artifact
Modifications: quantization (w8a8_g128), export to .rkllm format for RK3588 SBCs
Tokenizer: not required at runtime (UI handles prompt I/O)

Intended use

On-device lightweight inference on RK3588 SBCs.
MiniCPM4-0.5B is a compact general-purpose model designed for efficiency, testing, and resource-constrained scenarios. It is best suited to experimentation where low memory usage and fast responses matter more than deep reasoning.

Limitations

Requires 700 MB of free memory.
As a 0.5B parameter model, it has limited reasoning ability compared to larger LLMs (e.g., 7B/8B).
Tested on a Radxa Rock 5B+ and an Orange Pi 5 Plus; other devices may require different driver or toolkit versions.
Quantization (w8a8_g128) may further reduce output fidelity.
Best suited for basic Q&A, toy chat, or edge demos rather than production-level tasks.
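
The memory requirement listed above can be checked before loading the model. The following is a minimal sketch, assuming a Linux system that reports MemAvailable in /proc/meminfo; the helper names and the REQUIRED_MB constant are illustrative and are not part of the RKLLM toolkit.

```python
REQUIRED_MB = 700  # figure taken from the Limitations list above


def available_mb(meminfo_text: str) -> float:
    """Return MemAvailable in MB, parsed from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            # Line format: "MemAvailable:    1234567 kB"
            return int(line.split()[1]) / 1024
    raise ValueError("MemAvailable not found")


def has_enough_memory(meminfo_text: str, required_mb: float = REQUIRED_MB) -> bool:
    """True if the reported available memory meets the requirement."""
    return available_mb(meminfo_text) >= required_mb
```

On the board, pass in the contents of /proc/meminfo, for example has_enough_memory(open("/proc/meminfo").read()).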

Quick start (RK3588)

1) Install runtime

The RKNN-LLM toolkit and its instructions can be found on your development board manufacturer's website or on airockchip's GitHub page.

Download and install the required packages as per the toolkit's instructions.

2) Simple Flask server deployment

The simplest way to deploy the converted .rkllm model is to use the example script provided in the toolkit under rknn-llm/examples/rkllm_server_demo:

python3 <TOOLKIT_PATH>/rknn-llm/examples/rkllm_server_demo/flask_server.py \
  --rkllm_model_path <MODEL_PATH>/MiniCPM4-0.5B_w8a8_g128_rk3588.rkllm \
  --target_platform rk3588

3) Sending a request

A basic message request has the following format:

{
    "model":"MiniCPM4-0.5B",
    "messages":[{
        "role":"user",
        "content":"<YOUR_PROMPT_HERE>"}],
    "stream":false
}

Example request using curl:

curl -s -X POST <SERVER_IP_ADDRESS>:8080/rkllm_chat \
    -H 'Content-Type: application/json' \
    -d '{"model":"MiniCPM4-0.5B","messages":[{"role":"user","content":"Explain who Napoleon Bonaparte is in two or three sentences."}],"stream":false}'
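
The same request can be sent from Python. This is a minimal sketch using only the standard library; the function names and the timeout value are assumptions made for illustration, the port 8080 and the /rkllm_chat path are taken from the curl example, and the host argument stands in for <SERVER_IP_ADDRESS>.

```python
import json
import urllib.request


def build_payload(prompt: str, model: str = "MiniCPM4-0.5B") -> dict:
    """Build the request body in the format shown above."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }


def build_request(host: str, prompt: str, port: int = 8080) -> urllib.request.Request:
    """Prepare a POST request for the /rkllm_chat endpoint (nothing is sent yet)."""
    body = json.dumps(build_payload(prompt)).encode("utf-8")
    return urllib.request.Request(
        f"http://{host}:{port}/rkllm_chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(host: str, prompt: str, timeout: float = 120.0) -> dict:
    """Send the prompt and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(host, prompt), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling send("<SERVER_IP_ADDRESS>", "Hello!") with the board's address filled in returns the parsed response as a dictionary.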

The response is formatted as follows:

{
    "choices":[{
        "finish_reason":"stop",
        "index":0,
        "logprobs":null,
        "message":{
            "content":"<MODEL_REPLY_HERE>",
            "role":"assistant"}}],
    "created":null,
    "id":"rkllm_chat",
    "object":"rkllm_chat",
    "usage":{
        "completion_tokens":null,
        "prompt_tokens":null,
        "total_tokens":null}
}

Example response:

{"choices":[{"finish_reason":"stop","index":0,"logprobs":null,"message":{"content":"Napoleon Bonaparte was a French military leader and statesman who rose to prominence during the French Revolution. He played a pivotal role in shaping modern Europe through his military campaigns, administrative reforms, and the establishment of new political institutions.","role":"assistant"}}],"created":null,"id":"rkllm_chat","object":"rkllm_chat","usage":{"completion_tokens":null,"prompt_tokens":null,"total_tokens":null}}
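
In the response format shown above, the reply text sits at choices[0].message.content. A small sketch of extracting it in Python follows; extract_reply is a hypothetical helper name, and the sample string is a shortened copy of the example response used only for illustration.

```python
import json


def extract_reply(response: dict) -> str:
    """Return the assistant's text from a response in the format shown above."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


# Shortened copy of the example response, for illustration only.
sample = json.loads(
    '{"choices":[{"finish_reason":"stop","index":0,"logprobs":null,'
    '"message":{"content":"Napoleon Bonaparte was a French military leader.",'
    '"role":"assistant"}}],"created":null,"id":"rkllm_chat","object":"rkllm_chat",'
    '"usage":{"completion_tokens":null,"prompt_tokens":null,"total_tokens":null}}'
)
print(extract_reply(sample))
```

Note that the usage fields are null in the example response, so token counts should not be relied on when reading it.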

4) UI compatibility

This server exposes an OpenAI-compatible Chat Completions API.

You can connect it to any OpenAI-compatible client or UI (for example, Open WebUI).

Configure your client with the API base http://<SERVER_IP_ADDRESS>:8080 and the endpoint /rkllm_chat.
Make sure the model field matches the converted model’s name, for example:

{
 "model": "MiniCPM4-0.5B",
 "messages": [{"role":"user","content":"Hello!"}],
 "stream": false
}

License

This conversion follows the license of the source model: Apache-2.0.

Attribution: Built with MiniCPM4 (OpenBMB)
Required notice: see NOTICE
Modifications: quantization (w8a8_g128), export to .rkllm format for RK3588 SBCs
