Tested with Ollama Official Image v0.1.27
Ollama is a super easy tool for running LLM/GenAI models locally, with elegant installation options, a huge community ecosystem, and official Docker images.
A regular installation puts an executable binary into the system, but we can also run Ollama as a Docker container. It's a two-step process:
- Execute `docker run -itd --name ollama ollama/ollama` to launch the Ollama backend. The default Entrypoint is `/bin/ollama` and the Command is `serve`, which means the container executes `ollama serve` and is ready to accept incoming requests.
- Execute `docker exec -it ollama ollama run <Model Name:Version>` to enter the interactive interface and start injecting prompts.
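For example, with a concrete model (the `gemma:2b` tag here is just an illustrative choice):

```bash
# Step 1: launch the Ollama backend in a detached container
docker run -itd --name ollama ollama/ollama

# Step 2: open the interactive prompt interface for a specific model
docker exec -it ollama ollama run gemma:2b
```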
Sounds good, but not enough. If we are going to host Ollama on a cloud service and turn it into a serverless function, it should:
- Launch the backend immediately
- Accept a prompt argument, then output the answer
- Shut down
So we can make inference like the following pseudo code: `docker run -it --rm --name ollama ollama/ollama run <Model Name:Version> "<Prompt>"`
If you need the same behavior, this repository gives you a better way to use Ollama as a Docker container.
To build the image with the `gemma:2b-instruct-q4_0` model, for example:
./build.sh
# Which actually does this:
docker build -t gemma .
This will take some time, since one of the build steps downloads the model artifact.
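For context, the pull-during-build variant of the Dockerfile plausibly looks like the sketch below; it is reconstructed from the `RUN` line quoted later in this document, and the repository's actual file may differ:

```dockerfile
FROM ollama/ollama:0.1.27

# Start the backend just long enough to pull the model into an image layer
RUN /bin/bash -c "/bin/ollama serve & sleep 1 && ollama pull gemma:2b-instruct-q4_0"

# Install the one-shot inference script
COPY ["serve.sh", "/serve.sh"]
RUN ["chmod", "+x", "/serve.sh"]

# Run arbitrary command strings passed at `docker run` time
ENTRYPOINT ["/bin/bash", "-c"]
```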
Inference Example:
./run.sh
# Which actually does this:
docker run -it --rm --name gemma gemma '/serve.sh "<Prompt>"'
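A minimal sketch of what `run.sh` presumably does, forwarding the first shell argument as the prompt (the repository's actual script may differ):

```bash
#!/bin/bash
# Forward the first argument to the container as the prompt.
# "gemma" is the image tag produced by build.sh above.
docker run -it --rm --name gemma gemma "/serve.sh \"$1\""
```

It would be invoked as, e.g., `./run.sh "Why is the sky blue?"`.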
The magic is in the `serve.sh` script, which:
- Launches the Ollama backend in the background: `nohup ollama serve &`
- Sleeps 1 second to make sure the backend is awake: `sleep 1`
- Makes inference with the `gemma:2b-instruct-q4_0` model, taking the argument as the prompt: `ollama run gemma:2b-instruct-q4_0 "<Prompt>"`
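Putting those three steps together, `serve.sh` plausibly looks like this sketch (the actual script in the repository may differ slightly):

```bash
#!/bin/bash
# Start the Ollama backend and detach it from the terminal
nohup ollama serve &

# Give the backend a moment to start listening
sleep 1

# Run the model once, using the first argument as the prompt
ollama run gemma:2b-instruct-q4_0 "$1"
```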
Sometimes it's annoying to download the model artifact every time the Docker image is built. Another approach is to pre-download it and copy it into the Docker image during the build process.
- Create a directory to store the artifacts:
mkdir ollama
- Launch Ollama Container and mount the directory:
docker run -it --rm -v $(pwd)/ollama/:/root/.ollama/ --name ollama ollama/ollama
- Pull the Model artifact:
docker exec -it ollama ollama pull <Model Name:Version>
- Edit the Dockerfile: remove `RUN /bin/bash -c "/bin/ollama serve & sleep 1 && ollama pull <Model Name:Version>"` and replace it with `COPY ["ollama/", "/root/.ollama/"]`
Example Dockerfile:
FROM ollama/ollama:0.1.27
COPY ["ollama/", "/root/.ollama/"]
COPY ["serve.sh", "/serve.sh"]
RUN ["chmod", "+x", "/serve.sh"]
ENTRYPOINT ["/bin/bash", "-c"]
Edit Line # of the Dockerfile, changing the name and version of the model. Take `gemma:7b` for example:
RUN /bin/bash -c "/bin/ollama serve & sleep 1 && ollama pull gemma:7b"
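Note that `serve.sh` hardcodes the model name as well, so it presumably needs the matching edit to stay in sync with the Dockerfile:

```bash
# In serve.sh, keep the model reference in line with the Dockerfile:
ollama run gemma:7b "$1"
```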