Urvesh

Reputation: 435

Ollama Installation on Docker Container and Docker Compose file

I have created a local chatbot in Python 3.10.10 and want to run it on a server, so I have created a Docker container. In my source (chatbot code), I am using the dependencies below:

chromadb==0.5.3
streamlit==1.36.0
langchain_core==0.2.9
langchain_community==0.2.5
PyPDF2
pypdf==4.2.0

Apart from these, my chatbot application has some additional dependencies:

Python 3.10.10
Ollama
LLM_model = Mistral:latest
embeddings_model = nomic-embed-text:latest

Since I intend to deploy the code to a server where no dependencies are pre-installed, I have written commands in Docker Compose to pull the Ollama Docker image as well as the embeddings model and the LLM model.

I have created the Dockerfile as well and set up the environment variables as below:

ENV BASE_URL=http://ollama:11434

Run the application

CMD ["streamlit", "run", "chatbot.py", "--server.port=8501", "--server.address=0.0.0.0"]

I have the docker compose file as below:

version: '3.8'

services:
  ollama:
    image: ollama/ollama:latest  # Use the official Ollama image
    container_name: ollama
    ports:
      - "11434:11434"
    command: >
      ollama pull nomic-embed-text:latest &&
      ollama pull mistral:latest &&
      ollama serve
    # command: serve  # Simplify the command to just serve models available
    environment:
      - MODELS=nomic-embed-text:latest,mistral:latest

  chatbot:
    build: .
    container_name: chatbot
    environment:
      BASE_URL: http://ollama:11434
    ports:
      - "8501:8501"
    depends_on:
      - ollama

When I build and run everything via docker-compose up --build and then open http://localhost:8501/, it shows me the error below:

ValueError: Error raised by inference endpoint: HTTPConnectionPool(host='ollama', port=11434): Max retries exceeded with url: /api/embeddings (Caused by NameResolutionError("<urllib3.connection.HTTPConnection object at 0x7f9a613b0d60>: Failed to resolve 'ollama' ([Errno -3] Temporary failure in name resolution)"))

I do not understand why this error comes up. Whenever I access port 11434 outside the container via http://localhost:11434, it says Ollama is running. Secondly, whenever I run the chatbot application without the Docker container, it runs fine.

Could you please explain the source of the error and the solution?

Upvotes: 0

Views: 4202

Answers (3)

Kelo

Reputation: 169

I just had the same issue you mentioned. In order to fix it, you must ensure that both the ollama and chatbot services are running on the same network. Otherwise, Docker doesn't know how to resolve the service names.

Here is the fixed version of the docker-compose.yml file:

services:
  ollama:
    image: ollama/ollama:latest  # Use the official Ollama image
    container_name: ollama
    ports:
      - "11434:11434"
    command: >
      ollama pull nomic-embed-text:latest &&
      ollama pull mistral:latest &&
      ollama serve
    networks:
      - ollama_network
    environment:
      - MODELS=nomic-embed-text:latest,mistral:latest

  chatbot:
    build: .
    container_name: chatbot
    environment:
      BASE_URL: http://ollama:11434
    ports:
      - "8501:8501"
    depends_on:
      - ollama
    networks:
      - ollama_network

networks:
  ollama_network:
    driver: bridge

Besides modifying your example, I am also including a version that I use myself and that works perfectly, including downloading and serving the models:

name: ollama_project
services:

  ollama:
    container_name: ollama
    restart: unless-stopped
    image: ollama/ollama:latest
    runtime: nvidia
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
    volumes:
      - "./ollamadata:/root/.ollama"
    ports:
      - 11434:11434
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    healthcheck:
      test: ollama list || exit 1
      interval: 10s
      timeout: 30s
      retries: 5
      start_period: 10s
    networks:
      - ollama_network

  ollama-models-pull:
    container_name: ollama-models-pull
    image: curlimages/curl:latest
    command: >
      http://ollama:11434/api/pull -d '{"name":"llama3.1"}'
    depends_on:
      ollama:
        condition: service_healthy
    networks:
      - ollama_network

networks:
  ollama_network:
    driver: bridge
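
After bringing the stack up, one quick way to check that the pull job worked is to ask the Ollama API which models it has available locally (the port mapping comes from the compose file above):

# start everything in the background
docker compose up -d

# list the models the ollama container currently has available
curl http://localhost:11434/api/tags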

Upvotes: 1

Pantelis Kaniouras

Reputation: 41

I am having the same issue. Here is some context:

The error I am seeing:

{
  "error": "[Errno 111] Connection refused"
}

My docker-compose.yaml file:

services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    ports:
      - "11434:11434"
    command: >
      ollama pull llama3-groq-tool-use:latest &&
      ollama serve
    volumes:
      - ./data/ollama:/root/.ollama
    networks:
      - langchain-network
    environment:
      - MODELS=llama3-groq-tool-use:latest

  langchain_app:
    build: .
    image: langchain_app
    container_name: langchain_app
    environment:
      BASE_URL: http://ollama:11434
    ports:
      - "8501:8501"
    depends_on:
      - ollama
    networks:
      - langchain-network

volumes:
  ollama: {}

networks:
  langchain-network:
    driver: bridge
    name: langchain-network

My endpoint:

# Endpoint for the pure LLM model
@app.post("/pure_llm")
def query(request: str):
    logger.info(f"Received request: {request}")
    user_message = request
    messages = [
        ("system", "You are a helpful assistant."),
        ("human", user_message),
    ]
    try:
        llm = ChatOllama(
                model='llama3-groq-tool-use',  # or 'llama3.1'
                base_url='http://ollama:11434',
                temperature=0,
                verbose=True
            )
        response = llm.invoke(messages)
        logger.info(f"Returned response: {response}")
        return {"response": response}
    except Exception as e:
        # Handle exceptions appropriately (log errors, return specific error messages)
        return {"error": str(e)}

The app still sends the request to the wrong address, as if it is ignoring the ollama:11434 address that I am giving it:

langchain_app  | INFO [2024-08-30 10:56:25] main - Received request: hello
langchain_app  | DEBUG [2024-08-30 10:56:25] main - Using base_url: http://172.19.0.3:11434
langchain_app  | DEBUG [2024-08-30 10:56:25] httpcore.connection - connect_tcp.started host='127.0.0.1' port=11434 local_address=None timeout=None socket_options=None
langchain_app  | DEBUG [2024-08-30 10:56:25] httpcore.connection - connect_tcp.failed exception=ConnectError(ConnectionRefusedError(111, 'Connection refused'))
langchain_app  | INFO:     172.25.0.1:38928 - "POST /pure_llm?request=hello HTTP/1.1" 200 OK
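
If it helps anyone reproducing this, below is a minimal sketch of reading the base URL from the BASE_URL environment variable that docker-compose sets, rather than hardcoding it in the endpoint (the import path and the localhost fallback are assumptions about the rest of the app):

import os

from langchain_community.chat_models import ChatOllama

# Use the endpoint injected by docker-compose; the fallback is only for local runs.
base_url = os.environ.get("BASE_URL", "http://localhost:11434")

llm = ChatOllama(
    model="llama3-groq-tool-use",
    base_url=base_url,
    temperature=0,
    verbose=True,
)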

Upvotes: 0

Liang HE

Reputation: 135

Not sure if it is just a typo from copying and pasting, but you have an extra " after the serve command. This may have caused the service to fail to start and consequently led to the error you see.

Upvotes: 0
