How does one run GLM 4.6V using the vLLM docker image?

#19
by kldzj - opened

If we need to manually upgrade to transformers v5, is there currently no way to run this model using vLLM's v0.12.0 docker image?

Create a new file called GLM-4.6V.Dockerfile with the following content:

FROM vllm/vllm-openai:nightly
# Pull in the transformers v5 release candidate that GLM-4.6V needs.
# --no-deps keeps pip from touching vLLM's pinned dependencies.
RUN uv pip install transformers==5.0.0rc0 --upgrade --no-deps --system
RUN uv pip install huggingface-hub --upgrade --no-deps --system

Then build a custom Docker image by running:

docker build . -f GLM-4.6V.Dockerfile -t vllm/vllm-openai:glm46v

Then start a container from this custom vLLM image:

docker run -it \
  --gpus all \
  --ipc host \
  -p 8000:8000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  vllm/vllm-openai:glm46v \
     --model zai-org/GLM-4.6V \
     --tool-call-parser glm45 \
     --reasoning-parser glm45 \
     --enable-auto-tool-choice \
     --enable-expert-parallel \
     --allowed-local-media-path / \
     --mm-encoder-tp-mode data \
     --mm-processor-cache-type shm
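Once the container is up, it serves the standard OpenAI-compatible API on port 8000. A minimal client sketch follows; the endpoint, image URL, and prompt are placeholder assumptions, not from this thread, and only the model name comes from the command above:

```python
import json
import urllib.request

# Assumed endpoint for the container started above; adjust host/port if needed.
API_URL = "http://localhost:8000/v1/chat/completions"


def build_chat_request(image_url: str, prompt: str) -> dict:
    """Build an OpenAI-style multimodal chat payload for the GLM-4.6V server."""
    return {
        "model": "zai-org/GLM-4.6V",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": image_url}},
                    {"type": "text", "text": prompt},
                ],
            }
        ],
    }


def send(payload: dict) -> dict:
    """POST the payload to the running vLLM server and return the parsed reply."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


payload = build_chat_request("https://example.com/cat.png", "Describe this image.")
# send(payload)  # uncomment once the container is running
```

The same payload works with the official `openai` Python client by pointing its `base_url` at the container.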
