How does one run GLM-4.6V using the vLLM docker image?
#19
by
kldzj
- opened
If we need to manually upgrade to transformers v5, is there currently no way to run this model using vLLM's v0.12.0 docker image?
Create a new file called GLM-4.6V.Dockerfile with the following content:
FROM vllm/vllm-openai:nightly
# GLM-4.6V requires the transformers v5 pre-release; --no-deps avoids pulling in
# dependency changes that could conflict with the packages vLLM already pins
RUN uv pip install transformers==5.0.0rc0 --upgrade --no-deps --system
RUN uv pip install huggingface-hub --upgrade --no-deps --system
Then create a custom docker image by running:
docker build . -f GLM-4.6V.Dockerfile -t vllm/vllm-openai:glm46v
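To confirm the upgrade took before starting the server, you can override the image's entrypoint and print the installed transformers version (a quick sanity check, assuming python3 is on PATH inside the image, which it is in the official vLLM images):

```shell
# Run a throwaway container from the custom image and check the transformers version
docker run --rm --entrypoint python3 vllm/vllm-openai:glm46v \
  -c "import transformers; print(transformers.__version__)"
# expect 5.0.0rc0 if the upgrade in the Dockerfile succeeded
```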
Then start a container from this custom vLLM image:
docker run -it \
--gpus all \
--ipc host \
-p 8000:8000 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
vllm/vllm-openai:glm46v \
--model zai-org/GLM-4.6V \
--tool-call-parser glm45 \
--reasoning-parser glm45 \
--enable-auto-tool-choice \
--enable-expert-parallel \
--allowed-local-media-path / \
--mm-encoder-tp-mode data \
--mm-processor-cache-type shm
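Once the container reports the server is ready on port 8000, you can exercise the OpenAI-compatible endpoint with a multimodal chat request. This is a sketch: the image URL is a placeholder, and the request shape follows vLLM's standard /v1/chat/completions API:

```shell
# Send a text+image chat completion request to the locally running server;
# replace the image_url with a real, reachable image
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "zai-org/GLM-4.6V",
    "messages": [
      {"role": "user", "content": [
        {"type": "text", "text": "Describe this image."},
        {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}}
      ]}
    ]
  }'
```

Because the server was started with --enable-auto-tool-choice and the glm45 parsers, tool-calling requests against the same endpoint should also work.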