Why doesn't vllm-omni support tensor parallelism? #338

@wangaocheng

Passing --tensor-parallel-size 2 fails immediately with an error.
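For readers skimming the log that follows: the root failure appears to be the pair of assertions raised in vllm_omni/worker/gpu_ar_worker.py. Stage 0 is assigned runtime devices "0" (one device), while tensor_parallel_size=2 starts two workers. The snippet below is only a minimal sketch of that kind of check, not the vllm-omni implementation; the function names `parse_device_spec` and `check_world_size` are hypothetical, and the message text is copied from the log.

```python
from typing import List, Optional


def parse_device_spec(spec: str) -> List[str]:
    """Split a comma-separated device list such as "0" or "0,1" (hypothetical helper)."""
    return [d.strip() for d in spec.split(",") if d.strip()]


def check_world_size(local_world_size: int, device_spec: str) -> Optional[str]:
    """Return an error message if there are fewer devices than workers, else None."""
    visible = len(parse_device_spec(device_spec))
    if local_world_size > visible:
        return (
            f"local_world_size ({local_world_size}) must be less than or equal "
            f"to the number of visible devices ({visible})."
        )
    return None


# Two tensor-parallel workers but a single runtime device, as in the log.
print(check_world_size(2, "0"))
# Two workers with two devices passes the check.
print(check_world_size(2, "0,1"))
```

Under this reading, the question is less whether tensor parallelism is supported at all and more why the stage was given only one device.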

(vllm) william@ubuntu:~$ vllm serve /models/openbmb/VoxCPM2 --omni --port 8002 --served-model-name VoxCPM2 --trust-remote-code --tensor-parallel-size 2
INFO 06-09 16:28:15 [main.py:51] Delegating entrypoint handling to vllm-omni
INFO 06-09 16:28:15 [logo.py:52] █ █ █▄ ▄█ ▄▀▀▀▀▄ █▄ ▄█ █▄ █ ▀█▀
INFO 06-09 16:28:15 [logo.py:52] ▄▄ ▄█ █ █ █ ▀▄▀ █ ▄▄▄ █ █ █ ▀▄▀ █ █ ▀▄ █ █
INFO 06-09 16:28:15 [logo.py:52] █▄█▀ █ █ █ █ █ █ █ █ █ ▀▄█ █
INFO 06-09 16:28:15 [logo.py:52] ▀▀ ▀▀▀▀▀ ▀▀▀▀▀ ▀ ▀ ▀▀▀▀ ▀ ▀ ▀ ▀ ▀▀▀
INFO 06-09 16:28:15 [logo.py:52]
(APIServer pid=6880) INFO 06-09 16:28:15 [utils.py:306] vLLM server version 0.21.0, serving model /models/openbmb/VoxCPM2
(APIServer pid=6880) INFO 06-09 16:28:15 [utils.py:240] non-default args: {'model_tag': '/models/openbmb/VoxCPM2', 'port': 8002, 'model': '/models/openbmb/VoxCPM2', 'tokenizer_mode': None, 'trust_remote_code': True, 'dtype': None, 'enforce_eager': None, 'served_model_name': ['VoxCPM2'], 'config_format': None, 'load_format': None, 'pipeline_parallel_size': None, 'tensor_parallel_size': 2, 'data_parallel_size': None, 'gpu_memory_utilization': None, 'mm_processor_cache_gb': None, 'skip_mm_profiling': None, 'compilation_config': None, 'profiler_config': None}
(APIServer pid=6880) INFO 06-09 16:28:15 [omni_base.py:172] [AsyncOmni] Initializing with model /models/openbmb/VoxCPM2
(APIServer pid=6880) INFO 06-09 16:28:15 [async_omni_engine.py:269] [AsyncOmniEngine] Initializing with model /models/openbmb/VoxCPM2
(APIServer pid=6880) WARNING 06-09 16:28:15 [utils.py:191] Filtered out 1 callable object(s) from base_engine_args that are not compatible with OmegaConf: ['dispatch_function'].
(APIServer pid=6880) INFO 06-09 16:28:15 [async_omni_engine.py:331] [AsyncOmniEngine] Launching Orchestrator thread with 1 stages
(APIServer pid=6880) INFO 06-09 16:28:15 [initialization.py:352] Loaded OmniTransferConfig with 0 connector configurations
(APIServer pid=6880) INFO 06-09 16:28:15 [arg_utils.py:234] Patched empty HF config with model_type=voxcpm2 at /tmp/omni_hf_config_siys6us3
(APIServer pid=6880) /home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/transformers/modeling_rope_utils.py:1034: FutureWarning: rope_config_validation is deprecated and has been removed. Its functionality has been moved to RotaryEmbeddingConfigMixin.validate_rope method. PreTrainedConfig inherits this class, so please call self.validate_rope() instead. Also, make sure to use the new rope_parameters syntax. You can call self.standardize_rope_params() in the meantime.
(APIServer pid=6880) warnings.warn(
(APIServer pid=6880) INFO 06-09 16:28:21 [model.py:568] Resolved architecture: VoxCPM2TalkerForConditionalGeneration
(APIServer pid=6880) WARNING 06-09 16:28:21 [model.py:2035] Casting bfloat16 to torch.bfloat16.
(APIServer pid=6880) INFO 06-09 16:28:21 [model.py:1697] Using max model len 4096
(APIServer pid=6880) INFO 06-09 16:28:21 [scheduler.py:239] Chunked prefill is enabled with max_num_batched_tokens=4096.
(APIServer pid=6880) INFO 06-09 16:28:21 [vllm.py:886] Asynchronous scheduling is enabled.
(APIServer pid=6880) WARNING 06-09 16:28:21 [vllm.py:942] Enforce eager set, disabling torch.compile and CUDAGraphs. This is equivalent to setting -cc.mode=none -cc.cudagraph_mode=none
(APIServer pid=6880) WARNING 06-09 16:28:21 [vllm.py:960] Inductor compilation was disabled by user settings, optimizations settings that are only active during inductor compilation will be ignored.
(APIServer pid=6880) INFO 06-09 16:28:21 [kernel.py:212] Final IR op priority after setting platform defaults: IrOpPriorityConfig(rms_norm=['vllm_c', 'native'], fused_add_rms_norm=['vllm_c', 'native'])
(APIServer pid=6880) INFO 06-09 16:28:21 [vllm.py:1135] Cudagraph is disabled under eager mode
(APIServer pid=6880) INFO 06-09 16:28:21 [compilation.py:303] Enabled custom fusions: norm_quant, act_quant
(APIServer pid=6880) INFO 06-09 16:28:21 [stage_init_utils.py:535] [stage_init] Stage-0 set runtime devices: 0
(APIServer pid=6880) INFO 06-09 16:28:21 [async_omni_engine.py:706] [AsyncOmniEngine] Stage 0 engine launch started
(StageEngineCoreProc pid=7366) INFO 06-09 16:28:24 [core.py:109] Initializing a V1 LLM engine (v0.21.0) with config: model='/models/openbmb/VoxCPM2', speculative_config=None, tokenizer='/models/openbmb/VoxCPM2', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, tokenizer_revision=None, trust_remote_code=True, dtype=torch.bfloat16, max_seq_len=4096, download_dir=None, load_format=auto, tensor_parallel_size=2, pipeline_parallel_size=1, data_parallel_size=1, decode_context_parallel_size=1, dcp_comm_backend=ag_rs, disable_custom_all_reduce=False, quantization=None, quantization_config=None, enforce_eager=True, enable_return_routed_experts=False, kv_cache_dtype=auto, device_config=cuda, structured_outputs_config=StructuredOutputsConfig(backend='auto', disable_any_whitespace=False, disable_additional_properties=False, reasoning_parser='', reasoning_parser_plugin='', enable_in_reasoning=False), observability_config=ObservabilityConfig(show_hidden_metrics_for_version=None, otlp_traces_endpoint=None, collect_detailed_traces=None, kv_cache_metrics=False, kv_cache_metrics_sample=0.01, cudagraph_metrics=False, enable_layerwise_nvtx_tracing=False, enable_mfu_metrics=False, enable_mm_processor_stats=False, enable_logging_iteration_details=False), seed=0, served_model_name=VoxCPM2, enable_prefix_caching=False, enable_chunked_prefill=True, pooler_config=None, compilation_config={'mode': <CompilationMode.NONE: 0>, 'debug_dump_path': None, 'cache_dir': '', 'compile_cache_save_format': 'binary', 'backend': 'inductor', 'custom_ops': ['all'], 'ir_enable_torch_wrap': False, 'splitting_ops': [], 'compile_mm_encoder': False, 'cudagraph_mm_encoder': False, 'encoder_cudagraph_token_budgets': [], 'encoder_cudagraph_max_vision_items_per_batch': 0, 'encoder_cudagraph_max_frames_per_batch': None, 'compile_sizes': [], 'compile_ranges_endpoints': [4096], 'inductor_compile_config': {'enable_auto_functionalized_v2': False, 'size_asserts': False, 'alignment_asserts': False, 
'scalar_asserts': False, 'combo_kernels': True, 'benchmark_combo_kernel': True}, 'inductor_passes': {}, 'cudagraph_mode': <CUDAGraphMode.NONE: 0>, 'cudagraph_num_of_warmups': 0, 'cudagraph_capture_sizes': [], 'cudagraph_copy_inputs': False, 'cudagraph_specialize_lora': True, 'use_inductor_graph_partition': False, 'pass_config': {'fuse_norm_quant': True, 'fuse_act_quant': True, 'fuse_attn_quant': False, 'enable_sp': False, 'fuse_gemm_comms': False, 'fuse_allreduce_rms': False, 'fuse_act_padding': False}, 'max_cudagraph_capture_size': 0, 'dynamic_shapes_config': {'type': <DynamicShapesType.BACKED: 'backed'>, 'evaluate_guards': False, 'assume_32_bit_indexing': False}, 'local_cache_dir': None, 'fast_moe_cold_start': False, 'static_all_moe_layers': []}, kernel_config=KernelConfig(ir_op_priority=IrOpPriorityConfig(rms_norm=['vllm_c', 'native'], fused_add_rms_norm=['vllm_c', 'native']), enable_flashinfer_autotune=False, moe_backend='auto')
(StageEngineCoreProc pid=7366) WARNING 06-09 16:28:24 [multiproc_executor.py:1029] Reducing Torch parallelism from 16 threads to 1 to avoid unnecessary CPU contention. Set OMP_NUM_THREADS in the external environment to tune this value as needed.
(StageEngineCoreProc pid=7366) INFO 06-09 16:28:24 [multiproc_executor.py:139] DP group leader: node_rank=0, node_rank_within_dp=0, master_addr=127.0.0.1, mq_connect_ip=192.168.0.132 (local), world_size=2, local_world_size=2
(Worker pid=7553) ERROR 06-09 16:28:30 [multiproc_executor.py:870] WorkerProc failed to start.
(Worker pid=7553) ERROR 06-09 16:28:30 [multiproc_executor.py:870] Traceback (most recent call last):
(Worker pid=7553) ERROR 06-09 16:28:30 [multiproc_executor.py:870] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 837, in worker_main
(Worker pid=7553) ERROR 06-09 16:28:30 [multiproc_executor.py:870] worker = WorkerProc(*args, **kwargs)
(Worker pid=7553) ERROR 06-09 16:28:30 [multiproc_executor.py:870] ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker pid=7553) ERROR 06-09 16:28:30 [multiproc_executor.py:870] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(Worker pid=7553) ERROR 06-09 16:28:30 [multiproc_executor.py:870] return func(*args, **kwargs)
(Worker pid=7553) ERROR 06-09 16:28:30 [multiproc_executor.py:870] ^^^^^^^^^^^^^^^^^^^^^
(Worker pid=7553) ERROR 06-09 16:28:30 [multiproc_executor.py:870] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 611, in __init__
(Worker pid=7553) ERROR 06-09 16:28:30 [multiproc_executor.py:870] self.worker.init_device()
(Worker pid=7553) ERROR 06-09 16:28:30 [multiproc_executor.py:870] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm/v1/worker/worker_base.py", line 317, in init_device
(Worker pid=7553) ERROR 06-09 16:28:30 [multiproc_executor.py:870] self.worker.init_device() # type: ignore
(Worker pid=7553) ERROR 06-09 16:28:30 [multiproc_executor.py:870] ^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker pid=7553) ERROR 06-09 16:28:30 [multiproc_executor.py:870] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(Worker pid=7553) ERROR 06-09 16:28:30 [multiproc_executor.py:870] return func(*args, **kwargs)
(Worker pid=7553) ERROR 06-09 16:28:30 [multiproc_executor.py:870] ^^^^^^^^^^^^^^^^^^^^^
(Worker pid=7553) ERROR 06-09 16:28:30 [multiproc_executor.py:870] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/worker/gpu_ar_worker.py", line 52, in init_device
(Worker pid=7553) ERROR 06-09 16:28:30 [multiproc_executor.py:870] assert self.local_rank < torch.accelerator.device_count(), (
(Worker pid=7553) ERROR 06-09 16:28:30 [multiproc_executor.py:870] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker pid=7553) ERROR 06-09 16:28:30 [multiproc_executor.py:870] AssertionError: DP adjusted local rank 1 is out of bounds.
(Worker pid=7552) ERROR 06-09 16:28:30 [multiproc_executor.py:870] WorkerProc failed to start.
(Worker pid=7552) ERROR 06-09 16:28:30 [multiproc_executor.py:870] Traceback (most recent call last):
(Worker pid=7552) ERROR 06-09 16:28:30 [multiproc_executor.py:870] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 837, in worker_main
(Worker pid=7552) ERROR 06-09 16:28:30 [multiproc_executor.py:870] worker = WorkerProc(*args, **kwargs)
(Worker pid=7552) ERROR 06-09 16:28:30 [multiproc_executor.py:870] ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker pid=7552) ERROR 06-09 16:28:30 [multiproc_executor.py:870] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(Worker pid=7552) ERROR 06-09 16:28:30 [multiproc_executor.py:870] return func(*args, **kwargs)
(Worker pid=7552) ERROR 06-09 16:28:30 [multiproc_executor.py:870] ^^^^^^^^^^^^^^^^^^^^^
(Worker pid=7552) ERROR 06-09 16:28:30 [multiproc_executor.py:870] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 611, in __init__
(Worker pid=7552) ERROR 06-09 16:28:30 [multiproc_executor.py:870] self.worker.init_device()
(Worker pid=7552) ERROR 06-09 16:28:30 [multiproc_executor.py:870] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm/v1/worker/worker_base.py", line 317, in init_device
(Worker pid=7552) ERROR 06-09 16:28:30 [multiproc_executor.py:870] self.worker.init_device() # type: ignore
(Worker pid=7552) ERROR 06-09 16:28:30 [multiproc_executor.py:870] ^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker pid=7552) ERROR 06-09 16:28:30 [multiproc_executor.py:870] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(Worker pid=7552) ERROR 06-09 16:28:30 [multiproc_executor.py:870] return func(*args, **kwargs)
(Worker pid=7552) ERROR 06-09 16:28:30 [multiproc_executor.py:870] ^^^^^^^^^^^^^^^^^^^^^
(Worker pid=7552) ERROR 06-09 16:28:30 [multiproc_executor.py:870] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/worker/gpu_ar_worker.py", line 56, in init_device
(Worker pid=7552) ERROR 06-09 16:28:30 [multiproc_executor.py:870] assert self.parallel_config.local_world_size <= visible_device_count, (
(Worker pid=7552) ERROR 06-09 16:28:30 [multiproc_executor.py:870] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker pid=7552) ERROR 06-09 16:28:30 [multiproc_executor.py:870] AssertionError: local_world_size (2) must be less than or equal to the number of visible devices (1).
(StageEngineCoreProc pid=7366) ERROR 06-09 16:28:31 [stage_engine_core_proc.py:104] StageEngineCoreProc failed to start.
(StageEngineCoreProc pid=7366) ERROR 06-09 16:28:31 [stage_engine_core_proc.py:104] Traceback (most recent call last):
(StageEngineCoreProc pid=7366) ERROR 06-09 16:28:31 [stage_engine_core_proc.py:104] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/engine/stage_engine_core_proc.py", line 79, in run_stage_core
(StageEngineCoreProc pid=7366) ERROR 06-09 16:28:31 [stage_engine_core_proc.py:104] engine_core = StageEngineCoreProc(
(StageEngineCoreProc pid=7366) ERROR 06-09 16:28:31 [stage_engine_core_proc.py:104] ^^^^^^^^^^^^^^^^^^^^
(StageEngineCoreProc pid=7366) ERROR 06-09 16:28:31 [stage_engine_core_proc.py:104] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(StageEngineCoreProc pid=7366) ERROR 06-09 16:28:31 [stage_engine_core_proc.py:104] return func(*args, **kwargs)
(StageEngineCoreProc pid=7366) ERROR 06-09 16:28:31 [stage_engine_core_proc.py:104] ^^^^^^^^^^^^^^^^^^^^^
(StageEngineCoreProc pid=7366) ERROR 06-09 16:28:31 [stage_engine_core_proc.py:104] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 880, in __init__
(StageEngineCoreProc pid=7366) ERROR 06-09 16:28:31 [stage_engine_core_proc.py:104] super().__init__(
(StageEngineCoreProc pid=7366) ERROR 06-09 16:28:31 [stage_engine_core_proc.py:104] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 118, in __init__
(StageEngineCoreProc pid=7366) ERROR 06-09 16:28:31 [stage_engine_core_proc.py:104] self.model_executor = executor_class(vllm_config)
(StageEngineCoreProc pid=7366) ERROR 06-09 16:28:31 [stage_engine_core_proc.py:104] ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(StageEngineCoreProc pid=7366) ERROR 06-09 16:28:31 [stage_engine_core_proc.py:104] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 107, in __init__
(StageEngineCoreProc pid=7366) ERROR 06-09 16:28:31 [stage_engine_core_proc.py:104] super().__init__(vllm_config)
(StageEngineCoreProc pid=7366) ERROR 06-09 16:28:31 [stage_engine_core_proc.py:104] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(StageEngineCoreProc pid=7366) ERROR 06-09 16:28:31 [stage_engine_core_proc.py:104] return func(*args, **kwargs)
(StageEngineCoreProc pid=7366) ERROR 06-09 16:28:31 [stage_engine_core_proc.py:104] ^^^^^^^^^^^^^^^^^^^^^
(StageEngineCoreProc pid=7366) ERROR 06-09 16:28:31 [stage_engine_core_proc.py:104] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm/v1/executor/abstract.py", line 109, in __init__
(StageEngineCoreProc pid=7366) ERROR 06-09 16:28:31 [stage_engine_core_proc.py:104] self._init_executor()
(StageEngineCoreProc pid=7366) ERROR 06-09 16:28:31 [stage_engine_core_proc.py:104] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 200, in _init_executor
(StageEngineCoreProc pid=7366) ERROR 06-09 16:28:31 [stage_engine_core_proc.py:104] self.workers = WorkerProc.wait_for_ready(unready_workers)
(StageEngineCoreProc pid=7366) ERROR 06-09 16:28:31 [stage_engine_core_proc.py:104] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(StageEngineCoreProc pid=7366) ERROR 06-09 16:28:31 [stage_engine_core_proc.py:104] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 747, in wait_for_ready
(StageEngineCoreProc pid=7366) ERROR 06-09 16:28:31 [stage_engine_core_proc.py:104] raise e from None
(StageEngineCoreProc pid=7366) ERROR 06-09 16:28:31 [stage_engine_core_proc.py:104] Exception: WorkerProc initialization failed due to an exception in a background process. See stack trace for root cause.
(StageEngineCoreProc pid=7366) Process StageEngineCoreProc:
(StageEngineCoreProc pid=7366) Traceback (most recent call last):
(StageEngineCoreProc pid=7366) File "/home/william/miniconda3/envs/vllm/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
(StageEngineCoreProc pid=7366) self.run()
(StageEngineCoreProc pid=7366) File "/home/william/miniconda3/envs/vllm/lib/python3.12/multiprocessing/process.py", line 108, in run
(StageEngineCoreProc pid=7366) self._target(*self._args, **self._kwargs)
(StageEngineCoreProc pid=7366) File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/engine/stage_engine_core_proc.py", line 79, in run_stage_core
(StageEngineCoreProc pid=7366) engine_core = StageEngineCoreProc(
(StageEngineCoreProc pid=7366) ^^^^^^^^^^^^^^^^^^^^
(StageEngineCoreProc pid=7366) File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(StageEngineCoreProc pid=7366) return func(*args, **kwargs)
(StageEngineCoreProc pid=7366) ^^^^^^^^^^^^^^^^^^^^^
(StageEngineCoreProc pid=7366) File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 880, in __init__
(StageEngineCoreProc pid=7366) super().__init__(
(StageEngineCoreProc pid=7366) File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 118, in __init__
(StageEngineCoreProc pid=7366) self.model_executor = executor_class(vllm_config)
(StageEngineCoreProc pid=7366) ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(StageEngineCoreProc pid=7366) File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 107, in __init__
(StageEngineCoreProc pid=7366) super().__init__(vllm_config)
(StageEngineCoreProc pid=7366) File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
(StageEngineCoreProc pid=7366) return func(*args, **kwargs)
(StageEngineCoreProc pid=7366) ^^^^^^^^^^^^^^^^^^^^^
(StageEngineCoreProc pid=7366) File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm/v1/executor/abstract.py", line 109, in __init__
(StageEngineCoreProc pid=7366) self._init_executor()
(StageEngineCoreProc pid=7366) File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 200, in _init_executor
(StageEngineCoreProc pid=7366) self.workers = WorkerProc.wait_for_ready(unready_workers)
(StageEngineCoreProc pid=7366) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(StageEngineCoreProc pid=7366) File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 747, in wait_for_ready
(StageEngineCoreProc pid=7366) raise e from None
(StageEngineCoreProc pid=7366) Exception: WorkerProc initialization failed due to an exception in a background process. See stack trace for root cause.
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1078] [AsyncOmniEngine] Stage initialization failed; shutting down 0 initialized client(s)
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1078] Traceback (most recent call last):
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1078] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/engine/async_omni_engine.py", line 1062, in _initialize_stages
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1078] initialized_clients_by_stage = self._initialize_stage_replicas(stage_plans, stage_init_timeout)
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1078] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1078] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/engine/async_omni_engine.py", line 976, in _initialize_stage_replicas
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1078] raise primary_exc
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1078] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/engine/async_omni_engine.py", line 963, in _initialize_stage_replicas
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1078] initialized_clients_by_stage[stage_idx][replica_id] = future.result()
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1078] ^^^^^^^^^^^^^^^
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1078] File "/home/william/miniconda3/envs/vllm/lib/python3.12/concurrent/futures/_base.py", line 449, in result
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1078] return self.__get_result()
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1078] ^^^^^^^^^^^^^^^^^^^
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1078] File "/home/william/miniconda3/envs/vllm/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1078] raise self._exception
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1078] File "/home/william/miniconda3/envs/vllm/lib/python3.12/concurrent/futures/thread.py", line 59, in run
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1078] result = self.fn(*self.args, **self.kwargs)
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1078] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1078] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/engine/async_omni_engine.py", line 894, in _initialize_replica
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1078] return self._initialize_llm_replica(plan, stage_init_timeout, stage_launch_lock)
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1078] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1078] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/engine/async_omni_engine.py", line 721, in _initialize_llm_replica
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1078] complete_stage_handshake(proc, handshake_address, addresses, vllm_config, stage_init_timeout)
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1078] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/engine/stage_engine_core_proc.py", line 163, in complete_stage_handshake
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1078] _perform_handshake(proc, handshake_address, addresses, vllm_config, handshake_timeout)
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1078] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/engine/stage_engine_core_proc.py", line 192, in _perform_handshake
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1078] identity, msg = _recv(poller, handshake_socket, proc, "READY", handshake_timeout)
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1078] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1078] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/engine/stage_engine_core_proc.py", line 221, in _recv
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1078] raise RuntimeError(f"StageEngineCoreProc died during {expected} (exit code {proc.exitcode})")
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1078] RuntimeError: StageEngineCoreProc died during READY (exit code 1)
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1148] [AsyncOmniEngine] Orchestrator thread crashed
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1148] Traceback (most recent call last):
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1148] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/engine/async_omni_engine.py", line 1142, in _bootstrap_orchestrator
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1148] loop.run_until_complete(_run_orchestrator())
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1148] File "/home/william/miniconda3/envs/vllm/lib/python3.12/asyncio/base_events.py", line 691, in run_until_complete
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1148] return future.result()
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1148] ^^^^^^^^^^^^^^^
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1148] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/engine/async_omni_engine.py", line 1127, in _run_orchestrator
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1148] self._initialize_stages(stage_init_timeout)
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1148] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/engine/async_omni_engine.py", line 1062, in _initialize_stages
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1148] initialized_clients_by_stage = self._initialize_stage_replicas(stage_plans, stage_init_timeout)
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1148] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1148] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/engine/async_omni_engine.py", line 976, in _initialize_stage_replicas
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1148] raise primary_exc
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1148] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/engine/async_omni_engine.py", line 963, in _initialize_stage_replicas
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1148] initialized_clients_by_stage[stage_idx][replica_id] = future.result()
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1148] ^^^^^^^^^^^^^^^
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1148] File "/home/william/miniconda3/envs/vllm/lib/python3.12/concurrent/futures/_base.py", line 449, in result
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1148] return self.__get_result()
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1148] ^^^^^^^^^^^^^^^^^^^
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1148] File "/home/william/miniconda3/envs/vllm/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1148] raise self._exception
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1148] File "/home/william/miniconda3/envs/vllm/lib/python3.12/concurrent/futures/thread.py", line 59, in run
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1148] result = self.fn(*self.args, **self.kwargs)
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1148] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1148] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/engine/async_omni_engine.py", line 894, in _initialize_replica
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1148] return self._initialize_llm_replica(plan, stage_init_timeout, stage_launch_lock)
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1148] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1148] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/engine/async_omni_engine.py", line 721, in _initialize_llm_replica
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1148] complete_stage_handshake(proc, handshake_address, addresses, vllm_config, stage_init_timeout)
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1148] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/engine/stage_engine_core_proc.py", line 163, in complete_stage_handshake
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1148] _perform_handshake(proc, handshake_address, addresses, vllm_config, handshake_timeout)
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1148] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/engine/stage_engine_core_proc.py", line 192, in _perform_handshake
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1148] identity, msg = _recv(poller, handshake_socket, proc, "READY", handshake_timeout)
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1148] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1148] File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/engine/stage_engine_core_proc.py", line 221, in _recv
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1148] raise RuntimeError(f"StageEngineCoreProc died during {expected} (exit code {proc.exitcode})")
(APIServer pid=6880) ERROR 06-09 16:28:32 [async_omni_engine.py:1148] RuntimeError: StageEngineCoreProc died during READY (exit code 1)
(APIServer pid=6880) INFO 06-09 16:28:32 [async_omni_engine.py:2174] [AsyncOmniEngine] Shutting down Orchestrator
(APIServer pid=6880) Exception in thread orchestrator:
(APIServer pid=6880) Traceback (most recent call last):
(APIServer pid=6880) File "/home/william/miniconda3/envs/vllm/lib/python3.12/threading.py", line 1075, in _bootstrap_inner
(APIServer pid=6880) self.run()
(APIServer pid=6880) File "/home/william/miniconda3/envs/vllm/lib/python3.12/threading.py", line 1012, in run
(APIServer pid=6880) self._target(*self._args, **self._kwargs)
(APIServer pid=6880) File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/engine/async_omni_engine.py", line 1142, in _bootstrap_orchestrator
(APIServer pid=6880) loop.run_until_complete(_run_orchestrator())
(APIServer pid=6880) File "/home/william/miniconda3/envs/vllm/lib/python3.12/asyncio/base_events.py", line 691, in run_until_complete
(APIServer pid=6880) return future.result()
(APIServer pid=6880) ^^^^^^^^^^^^^^^
(APIServer pid=6880) File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/engine/async_omni_engine.py", line 1127, in _run_orchestrator
(APIServer pid=6880) self._initialize_stages(stage_init_timeout)
(APIServer pid=6880) File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/engine/async_omni_engine.py", line 1062, in _initialize_stages
(APIServer pid=6880) initialized_clients_by_stage = self._initialize_stage_replicas(stage_plans, stage_init_timeout)
(APIServer pid=6880) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=6880) File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/engine/async_omni_engine.py", line 976, in _initialize_stage_replicas
(APIServer pid=6880) raise primary_exc
(APIServer pid=6880) File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/engine/async_omni_engine.py", line 963, in _initialize_stage_replicas
(APIServer pid=6880) initialized_clients_by_stage[stage_idx][replica_id] = future.result()
(APIServer pid=6880) ^^^^^^^^^^^^^^^
(APIServer pid=6880) File "/home/william/miniconda3/envs/vllm/lib/python3.12/concurrent/futures/_base.py", line 449, in result
(APIServer pid=6880) return self.__get_result()
(APIServer pid=6880) ^^^^^^^^^^^^^^^^^^^
(APIServer pid=6880) File "/home/william/miniconda3/envs/vllm/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
(APIServer pid=6880) raise self._exception
(APIServer pid=6880) File "/home/william/miniconda3/envs/vllm/lib/python3.12/concurrent/futures/thread.py", line 59, in run
(APIServer pid=6880) result = self.fn(*self.args, **self.kwargs)
(APIServer pid=6880) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=6880) File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/engine/async_omni_engine.py", line 894, in _initialize_replica
(APIServer pid=6880) return self._initialize_llm_replica(plan, stage_init_timeout, stage_launch_lock)
(APIServer pid=6880) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=6880) File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/engine/async_omni_engine.py", line 721, in _initialize_llm_replica
(APIServer pid=6880) complete_stage_handshake(proc, handshake_address, addresses, vllm_config, stage_init_timeout)
(APIServer pid=6880) File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/engine/stage_engine_core_proc.py", line 163, in complete_stage_handshake
(APIServer pid=6880) _perform_handshake(proc, handshake_address, addresses, vllm_config, handshake_timeout)
(APIServer pid=6880) File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/engine/stage_engine_core_proc.py", line 192, in _perform_handshake
(APIServer pid=6880) identity, msg = _recv(poller, handshake_socket, proc, "READY", handshake_timeout)
(APIServer pid=6880) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=6880) File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/engine/stage_engine_core_proc.py", line 221, in _recv
(APIServer pid=6880) raise RuntimeError(f"StageEngineCoreProc died during {expected} (exit code {proc.exitcode})")
(APIServer pid=6880) RuntimeError: StageEngineCoreProc died during READY (exit code 1)
(APIServer pid=6880)
(APIServer pid=6880) The above exception was the direct cause of the following exception:
(APIServer pid=6880)
(APIServer pid=6880) Traceback (most recent call last):
(APIServer pid=6880) File "/home/william/miniconda3/envs/vllm/bin/vllm", line 6, in <module>
(APIServer pid=6880) sys.exit(main())
(APIServer pid=6880) ^^^^^^
(APIServer pid=6880) File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm/entrypoints/cli/main.py", line 52, in main
(APIServer pid=6880) omni_main()
(APIServer pid=6880) File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/entrypoints/cli/main.py", line 65, in main
(APIServer pid=6880) args.dispatch_function(args)
(APIServer pid=6880) File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/entrypoints/cli/serve.py", line 100, in cmd
(APIServer pid=6880) uvloop.run(omni_run_server(args))
(APIServer pid=6880) File "/home/william/.local/lib/python3.12/site-packages/uvloop/__init__.py", line 96, in run
(APIServer pid=6880) return __asyncio.run(
(APIServer pid=6880) ^^^^^^^^^^^^^^
(APIServer pid=6880) File "/home/william/miniconda3/envs/vllm/lib/python3.12/asyncio/runners.py", line 195, in run
(APIServer pid=6880) return runner.run(main)
(APIServer pid=6880) ^^^^^^^^^^^^^^^^
(APIServer pid=6880) File "/home/william/miniconda3/envs/vllm/lib/python3.12/asyncio/runners.py", line 118, in run
(APIServer pid=6880) return self._loop.run_until_complete(task)
(APIServer pid=6880) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=6880) File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
(APIServer pid=6880) File "/home/william/.local/lib/python3.12/site-packages/uvloop/__init__.py", line 48, in wrapper
(APIServer pid=6880) return await main
(APIServer pid=6880) ^^^^^^^^^^
(APIServer pid=6880) File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/entrypoints/openai/api_server.py", line 362, in omni_run_server
(APIServer pid=6880) await omni_run_server_worker(listen_address, sock, args, **uvicorn_kwargs)
(APIServer pid=6880) File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/entrypoints/openai/api_server.py", line 380, in omni_run_server_worker
(APIServer pid=6880) async with build_async_omni(
(APIServer pid=6880) ^^^^^^^^^^^^^^^^^
(APIServer pid=6880) File "/home/william/miniconda3/envs/vllm/lib/python3.12/contextlib.py", line 210, in __aenter__
(APIServer pid=6880) return await anext(self.gen)
(APIServer pid=6880) ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=6880) File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/entrypoints/openai/api_server.py", line 494, in build_async_omni
(APIServer pid=6880) async with build_async_omni_from_stage_config(
(APIServer pid=6880) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=6880) File "/home/william/miniconda3/envs/vllm/lib/python3.12/contextlib.py", line 210, in __aenter__
(APIServer pid=6880) return await anext(self.gen)
(APIServer pid=6880) ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=6880) File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/entrypoints/openai/api_server.py", line 534, in build_async_omni_from_stage_config
(APIServer pid=6880) async_omni = AsyncOmni(model=args.model, **kwargs)
(APIServer pid=6880) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=6880) File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/entrypoints/async_omni.py", line 139, in __init__
(APIServer pid=6880) OmniBase.__init__(self, model=model, **kwargs)
(APIServer pid=6880) File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/entrypoints/omni_base.py", line 174, in __init__
(APIServer pid=6880) self.engine = AsyncOmniEngine(
(APIServer pid=6880) ^^^^^^^^^^^^^^^^
(APIServer pid=6880) File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/engine/async_omni_engine.py", line 346, in __init__
(APIServer pid=6880) self._wait_for_orchestrator_init(startup_future, startup_timeout)
(APIServer pid=6880) File "/home/william/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm_omni/engine/async_omni_engine.py", line 1186, in _wait_for_orchestrator_init
(APIServer pid=6880) startup_future.result(
(APIServer pid=6880) File "/home/william/miniconda3/envs/vllm/lib/python3.12/concurrent/futures/_base.py", line 456, in result
(APIServer pid=6880) return self.__get_result()
(APIServer pid=6880) ^^^^^^^^^^^^^^^^^^^
(APIServer pid=6880) File "/home/william/miniconda3/envs/vllm/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
(APIServer pid=6880) raise self._exception
(APIServer pid=6880) RuntimeError: Orchestrator initialization failed: StageEngineCoreProc died during READY (exit code 1)
