Reputation: 29
I am able to successfully run Ollama on Google Colab using the code below, and I can access it from my local terminal by running export OLLAMA_HOST=https://{url}.ngrok-free.app/ followed by ollama run llama2.
from google.colab import userdata
NGROK_AUTH_TOKEN = userdata.get('NGROK_AUTH_TOKEN')

# Download and install ollama on the system
!curl https://ollama.ai/install.sh | sh
!pip install aiohttp pyngrok

import os
import asyncio

# Set LD_LIBRARY_PATH so the system NVIDIA libraries are found
os.environ.update({'LD_LIBRARY_PATH': '/usr/lib64-nvidia'})

async def run_process(cmd):
    print('>>> starting', *cmd)
    p = await asyncio.subprocess.create_subprocess_exec(
        *cmd,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )

    async def pipe(lines):
        async for line in lines:
            print(line.strip().decode('utf-8'))

    await asyncio.gather(
        pipe(p.stdout),
        pipe(p.stderr),
    )

# Register an account at ngrok.com, create an authtoken, and store it
# as NGROK_AUTH_TOKEN in the Colab secrets (userdata)
await asyncio.gather(
    run_process(['ngrok', 'config', 'add-authtoken', NGROK_AUTH_TOKEN])
)

await asyncio.gather(
    run_process(['ollama', 'serve']),
    run_process(['ngrok', 'http', '--log', 'stderr', '11434']),
)
However, I'm not able to access the Ollama instance running on Google Colab from a locally executed Python script. Here is the code I have tried:
import requests

# Ngrok tunnel URL
ngrok_tunnel_url = "https://{url}.ngrok-free.app/"

# Define the request payload
payload = {
    "model": "llama2",
    "prompt": "Why is the sky blue?"
}

try:
    # Send the request to the ngrok tunnel URL
    response = requests.post(ngrok_tunnel_url, json=payload)

    # Check the response status code
    if response.status_code == 200:
        print("Request successful:")
        print("Response content:")
        print(response.text)
    else:
        print("Error:", response.status_code)
except requests.exceptions.RequestException as e:
    print("Error:", e)
which returns Error: 403
Does anyone know how to fix this, or how to properly access a remote instance of Ollama?
Upvotes: 2
Views: 4669
Reputation: 31
I had the same issue in both the terminal and Python. Setting the flag --host-header="localhost:11434" on the ngrok command fixed both for me.
I think the 403 occurs because Ollama rejects requests whose Host header doesn't match the local host it is listening on, and the tunnel forwards the ngrok hostname unless it is told to rewrite it. This workaround was recently added to the Ollama FAQ.
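If you run ngrok yourself from a terminal instead of from the Colab script, the equivalent standalone command would be along these lines (assuming Ollama is listening on its default port 11434):

ngrok http 11434 --host-header="localhost:11434"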
Here's the updated server code:
await asyncio.gather(
    run_process(['ollama', 'serve']),
    # Note: no embedded quotes around the value, since the argument is passed
    # directly to the process rather than through a shell
    run_process(['ngrok', 'http', '--log', 'stderr', '11434', '--host-header=localhost:11434']),  # Set host header
)
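With the host header forwarded, a minimal client sketch for the Python side might look like this (assuming Ollama's /api/generate endpoint and a non-streaming request; the URL placeholder is the one from the question):

import requests

# Ngrok forwarding URL printed by the Colab cell (placeholder from the question)
ngrok_tunnel_url = "https://{url}.ngrok-free.app"

# POST to Ollama's generate endpoint rather than the tunnel root
response = requests.post(
    f"{ngrok_tunnel_url}/api/generate",
    json={
        "model": "llama2",
        "prompt": "Why is the sky blue?",
        "stream": False,  # return a single JSON object instead of a stream
    },
)
response.raise_for_status()
print(response.json()["response"])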
Upvotes: 3