How does one run GLM 4.6V using the vLLM docker image?

#19
by kldzj - opened

If we need to manually upgrade to transformers v5, is there currently no way to run this model using vLLM's v0.12.0 docker image?

Create a new file called GLM-4.6V.Dockerfile with the following content:

FROM vllm/vllm-openai:nightly
# Pull in the transformers v5 release candidate that GLM-4.6V needs.
# --no-deps keeps pip from touching vLLM's pinned dependencies.
RUN uv pip install transformers==5.0.0rc0 --upgrade --no-deps --system
RUN uv pip install huggingface-hub --upgrade --no-deps --system

Then build a custom Docker image by running:

docker build . -f GLM-4.6V.Dockerfile -t vllm/vllm-openai:glm46v

Then start a container from this custom vLLM image:

docker run -it \
  --gpus all \
  --ipc host \
  -p 8000:8000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  vllm/vllm-openai:glm46v \
     --model zai-org/GLM-4.6V \
     --tool-call-parser glm45 \
     --reasoning-parser glm45 \
     --enable-auto-tool-choice \
     --enable-expert-parallel \
     --allowed-local-media-path / \
     --mm-encoder-tp-mode data \
     --mm-processor-cache-type shm
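Once the container is up, it serves the standard OpenAI-compatible API on port 8000. A minimal client sketch follows; the endpoint, image URL, and prompt are placeholder assumptions, not from this thread, and only the model name comes from the command above:

```python
import json
import urllib.request

# Assumed endpoint for the container started above; adjust host/port if needed.
API_URL = "http://localhost:8000/v1/chat/completions"


def build_chat_request(image_url: str, prompt: str) -> dict:
    """Build an OpenAI-style multimodal chat payload for the GLM-4.6V server."""
    return {
        "model": "zai-org/GLM-4.6V",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": image_url}},
                    {"type": "text", "text": prompt},
                ],
            }
        ],
    }


def send(payload: dict) -> dict:
    """POST the payload to the running vLLM server and return the parsed reply."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


payload = build_chat_request("https://example.com/cat.png", "Describe this image.")
# send(payload)  # uncomment once the container is running
```

The same payload works with the official `openai` Python client by pointing its `base_url` at the container.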
