s123

Reputation: 29

How to locally access an Ollama model remotely hosted on a Google Colab for a Python script?

I am able to successfully run Ollama on Google Colab using this code (below), and am able to access it from my local terminal using export OLLAMA_HOST=https://{url}.ngrok-free.app/ and ollama run llama2.

from google.colab import userdata
NGROK_AUTH_TOKEN = userdata.get('NGROK_AUTH_TOKEN')

# Download and install ollama to the system
!curl https://ollama.ai/install.sh | sh

!pip install aiohttp pyngrok

import os
import asyncio

# Set LD_LIBRARY_PATH so ollama can find the system NVIDIA libraries
os.environ.update({'LD_LIBRARY_PATH': '/usr/lib64-nvidia'})

async def run_process(cmd):
  print('>>> starting', *cmd)
  p = await asyncio.subprocess.create_subprocess_exec(
      *cmd,
      stdout=asyncio.subprocess.PIPE,
      stderr=asyncio.subprocess.PIPE,
  )

  async def pipe(lines):
    async for line in lines:
      print(line.strip().decode('utf-8'))

  await asyncio.gather(
      pipe(p.stdout),
      pipe(p.stderr),
  )

# Register an account at ngrok.com, create an authtoken, and store it in the Colab secret NGROK_AUTH_TOKEN
await asyncio.gather(
    run_process(['ngrok', 'config', 'add-authtoken', NGROK_AUTH_TOKEN])
)

await asyncio.gather(
    run_process(['ollama', 'serve']),
    run_process(['ngrok', 'http', '--log', 'stderr', '11434']),
)

However, I'm not able to access the Ollama instance running on Google Colab from a locally executed Python script. Here is the code I have tried:

import requests

# Ngrok tunnel URL
ngrok_tunnel_url = "https://{url}.ngrok-free.app/"

# Define the request payload
payload = {
    "model": "llama2",
    "prompt": "Why is the sky blue?"
}

try:
    # Send the request to the ngrok tunnel URL
    response = requests.post(ngrok_tunnel_url, json=payload)
    
    # Check the response status code
    if response.status_code == 200:
        print("Request successful:")
        print("Response content:")
        print(response.text)
    else:
        print("Error:", response.status_code)
except requests.exceptions.RequestException as e:
    print("Error:", e)

which returns Error: 403

Does anyone know how to fix this, or how to properly access a remote instance of Ollama?

Upvotes: 2

Views: 4669

Answers (1)

Kaarel Kaarelson

Reputation: 31

1. Set the Host Header to localhost:11434

I had the same issue in both the terminal and Python. Setting the flag --host-header="localhost:11434" on the ngrok command fixed both for me.

I think the 403 occurs because the tunnel forwards the ngrok hostname as the Host header, which Ollama rejects; rewriting the header to localhost:11434 makes the requests look local again. This workaround was recently added to the Ollama FAQ.

Here's the updated server code:

await asyncio.gather(
    run_process(['ollama', 'serve']),
    # Rewrite the Host header; no embedded quotes needed since the argument is passed directly, not through a shell
    run_process(['ngrok', 'http', '--log', 'stderr', '11434', '--host-header=localhost:11434']),
)
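
On the client side, once the host header is rewritten, a locally run script can reach Ollama through the tunnel by posting to the generate endpoint rather than the root URL. Here is a minimal sketch, assuming you substitute your actual ngrok subdomain for the placeholder and want a single non-streaming response:

import requests

# Your ngrok tunnel URL (replace the placeholder with the real subdomain)
ngrok_tunnel_url = "https://{url}.ngrok-free.app"

# Ollama's /api/generate endpoint takes the model name and prompt;
# "stream": False returns one JSON object instead of a stream of chunks
payload = {
    "model": "llama2",
    "prompt": "Why is the sky blue?",
    "stream": False,
}

response = requests.post(f"{ngrok_tunnel_url}/api/generate", json=payload, timeout=120)
response.raise_for_status()
print(response.json()["response"])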

Upvotes: 3
