Failed in vllm inference
#139 opened by Mqleet
Hello, thanks for your great work! I ran into the failure below:
(EngineCore_0 pid=3481) ERROR 09-08 23:05:16 [backend_xgrammar.py:163] Failed to advance FSM for request chatcmpl-c9d0e134de19488db8c73c97a2f23bae for tokens 4050. Please file an issue.
(EngineCore_0 pid=3481) ERROR 09-08 23:05:16 [backend_xgrammar.py:163] Failed to advance FSM for request chatcmpl-105ce5912c5e48b8b417cca36e8bd3b0 for tokens 2601. Please file an issue.
(EngineCore_0 pid=3481) ERROR 09-08 23:05:16 [backend_xgrammar.py:163] Failed to advance FSM for request chatcmpl-105ce5912c5e48b8b417cca36e8bd3b0 for tokens 200012. Please file an issue.
(APIServer pid=3042) INFO: 127.0.0.1:36242 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(EngineCore_0 pid=3481) ERROR 09-08 23:05:16 [backend_xgrammar.py:163] Failed to advance FSM for request chatcmpl-d7489a05165e43d8aa1aff25866de331 for tokens 256. Please file an issue.
(APIServer pid=3042) INFO: 127.0.0.1:36208 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(EngineCore_0 pid=3481) ERROR 09-08 23:05:17 [backend_xgrammar.py:163] Failed to advance FSM for request chatcmpl-5ea7e4163e574c34990d1b390d6971d7 for tokens 1202. Please file an issue.
(APIServer pid=3042) INFO: 127.0.0.1:36152 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(EngineCore_0 pid=3481) ERROR 09-08 23:05:17 [backend_xgrammar.py:163] Failed to advance FSM for request chatcmpl-6016d82b12154fe385620e843185458b for tokens 3369. Please file an issue.
(EngineCore_0 pid=3481) ERROR 09-08 23:05:17 [backend_xgrammar.py:163] Failed to advance FSM for request chatcmpl-6016d82b12154fe385620e843185458b for tokens 175728. Please file an issue.
(EngineCore_0 pid=3481) ERROR 09-08 23:05:17 [backend_xgrammar.py:163] Failed to advance FSM for request chatcmpl-6016d82b12154fe385620e843185458b for tokens 18583. Please file an issue.
(APIServer pid=3042) INFO: 127.0.0.1:36220 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(EngineCore_0 pid=3481) ERROR 09-08 23:05:17 [backend_xgrammar.py:163] Failed to advance FSM for request chatcmpl-5abbcb233a90485f96016ee2fa5c43c7 for tokens 3369. Please file an issue.
(EngineCore_0 pid=3481) ERROR 09-08 23:05:17 [backend_xgrammar.py:163] Failed to advance FSM for request chatcmpl-c9d0e134de19488db8c73c97a2f23bae for tokens 279. Please file an issue.
(EngineCore_0 pid=3481) ERROR 09-08 23:05:18 [backend_xgrammar.py:163] Failed to advance FSM for request chatcmpl-5abbcb233a90485f96016ee2fa5c43c7 for tokens 388. Please file an issue.
and the vLLM command is:
vllm serve openai/gpt-oss-120b \
--async-scheduling \
--port 11451 \
--host 0.0.0.0 \
--tensor-parallel-size $NUM_GPUS \
--max-model-len 131072 \
--gpu-memory-utilization 0.8
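For context, the errors come from vLLM's xgrammar structured-outputs backend (`backend_xgrammar.py`), which is hit when requests use structured/guided decoding. One thing that might be worth trying is pinning a different backend; this is only a sketch, assuming the `--guided-decoding-backend` option and the `guidance` backend are available in this vLLM build (check `vllm serve --help` for your version):

```shell
# Same serve command, but explicitly selecting a non-xgrammar
# structured-output backend (flag availability depends on vLLM version).
vllm serve openai/gpt-oss-120b \
  --async-scheduling \
  --port 11451 \
  --host 0.0.0.0 \
  --tensor-parallel-size "$NUM_GPUS" \
  --max-model-len 131072 \
  --gpu-memory-utilization 0.8 \
  --guided-decoding-backend guidance
```

If the failures disappear with another backend, that would point at an xgrammar/tokenizer mismatch for this model rather than a serving misconfiguration.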
How can I solve this problem?