Setting Up a REST API Service for AI Using a Local LLM with Ollama

Setting up a REST API service for AI on top of a local LLM served by Ollama is a practical approach. Here is a simple workflow.

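1. Install Ollama and an LLM

Start by installing Ollama and a local LLM on your machine. Ollama handles local deployment of LLMs, which makes them easier to manage and to use for a variety of tasks.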

Ollama

Download Ollama

Install the application file

Install an LLM for Ollama

ollama pull llama3
ollama run llama3

Download and run llama3

Chat with llama3 locally

Ollama commands

Available Commands:
  /set         Set session variables
  /show        Show model information
  /bye         Exit
  /?, /help    Help for a command

Use """ to begin a multi-line message

Test Ollama

curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?",
  "stream": true
}'

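With stream set to true, the reply arrives as a sequence of JSON objects. If stream is set to false, the response is a single JSON object: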

curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
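The difference between the two modes matters on the client side. A minimal sketch, assuming the streamed reply is line-delimited JSON in which each object carries a fragment of the answer in its `response` field and the last object has `done` set to true. `join_stream` is a hypothetical helper name, and the sample lines are made up for illustration rather than captured output.

```python
import json
from typing import Iterable


def join_stream(lines: Iterable[str]) -> str:
    """Concatenate the `response` fragments of a streamed reply."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between chunks
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break  # the final object marks the end of the stream
    return "".join(parts)


# Illustrative input only (not real model output).
sample = [
    '{"model": "llama3", "response": "Because ", "done": false}',
    '{"model": "llama3", "response": "of scattering.", "done": false}',
    '{"model": "llama3", "response": "", "done": true}',
]
print(join_stream(sample))
```

With stream set to false no joining is needed, because the whole answer sits in the `response` field of the single object.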

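2. Set up FastAPI

Set up a Python FastAPI application. FastAPI is a modern, fast (high-performance) web framework for building APIs with Python 3.7+ based on standard Python type hints. It is a good choice for building robust and efficient APIs.

Develop FastAPI routes and endpoints that interact with the Ollama server. This means sending requests to Ollama for tasks such as text generation, language understanding, or any other AI-related task your LLM supports. The following code is a simple example. (You can also use the Ollama Python library to improve on it.)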


from typing import Union
from fastapi import FastAPI
from pydantic import BaseModel
import json
import requests

app = FastAPI(debug=True)

class Itemexample(BaseModel):
    name: str
    prompt: str
    instruction: str
    is_offer: Union[bool, None] = None

class Item(BaseModel):
    model: str
    prompt: str

urls =["http://localhost:11434/api/generate"]

headers = {
    "Content-Type": "application/json"
}


@app.get("/")
def read_root():
    return {"Hello": "World"}

@app.post("/chat/{llms_name}")
def chat(llms_name: str, item: Item):
    # Only llama3 is wired up in this example; other names fall through below.
    if llms_name == "llama3":
        url = urls[0]
        # Forward the caller's prompt instead of a hard-coded one.
        payload = {
            "model": "llama3",
            "prompt": item.prompt,
            "stream": False
        }
        response = requests.post(url, headers=headers, data=json.dumps(payload))
        if response.status_code == 200:
            return {"data": response.text, "llms_name": llms_name}
        else:
            print("error:", response.status_code, response.text)
            return {"item_name": item.model, "error": response.status_code, "data": response.text}
    # No Ollama call was made for this model name.
    return {"item_name": item.model, "llms_name": llms_name, "error": "unsupported model"}
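The endpoint above returns Ollama's raw JSON body as a string in `data`. One possible refinement is to parse that body and return only the generated text. The sketch below assumes a non-streaming reply from /api/generate is a JSON object with a `response` field, and that failures are reported in an `error` field. `build_payload` and `extract_text` are hypothetical helper names, not part of the original code.

```python
import json


def build_payload(model: str, prompt: str) -> dict:
    """Build a non-streaming request body for Ollama's /api/generate."""
    return {"model": model, "prompt": prompt, "stream": False}


def extract_text(body: str) -> str:
    """Return the generated text from a non-streaming reply body."""
    data = json.loads(body)
    if "error" in data:
        # Assumption: failures are reported in an `error` field.
        raise ValueError(data["error"])
    return data.get("response", "")


# Illustrative body only (not real model output).
fake_body = json.dumps(
    {"model": "llama3", "response": "Rayleigh scattering.", "done": True}
)
print(build_payload("llama3", "Why is the sky blue?"))
print(extract_text(fake_body))
```

Inside the route, `extract_text(response.text)` could then replace `response.text` in the success branch, so callers get plain text rather than JSON nested inside JSON.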

Test the REST API service

curl --location 'http://127.0.0.1:8000/chat/llama3' \
--header 'Content-Type: application/json' \
--data '{
  "model": "llama3",
  "prompt": "Why is the sky blue?"
}
'
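The same request can be made from Python. A minimal sketch using only the standard library, assuming the service listens on 127.0.0.1:8000 as in the curl call. `build_chat_request` is a hypothetical helper that builds the request without sending it, so the snippet runs even when the server is down.

```python
import json
import urllib.request


def build_chat_request(base_url: str, llm_name: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for /chat/{llms_name}."""
    body = json.dumps({"model": llm_name, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/chat/{llm_name}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_chat_request("http://127.0.0.1:8000", "llama3", "Why is the sky blue?")
print(req.get_method(), req.full_url)

# To actually send it (requires the FastAPI app and Ollama to be running):
# with urllib.request.urlopen(req, timeout=120) as resp:
#     print(json.loads(resp.read()))
```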

Curl request through the API

API logs

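3. Deploy

Once you are satisfied with the functionality and performance of the REST API, you can deploy the service to production as needed. This might mean deploying it to a cloud platform, containerizing it with Docker, or running it on a server.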
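Whichever target you pick, the addresses hard-coded in the example usually need to become configurable. A minimal sketch, assuming the settings are passed in as environment variables. The variable names `OLLAMA_URL`, `API_HOST`, and `API_PORT` are hypothetical choices for illustration, and the defaults are the local values used earlier in this post.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    ollama_url: str
    host: str
    port: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read deployment settings, falling back to the local defaults."""
    return Settings(
        ollama_url=env.get("OLLAMA_URL", "http://localhost:11434/api/generate"),
        host=env.get("API_HOST", "127.0.0.1"),
        port=int(env.get("API_PORT", "8000")),
    )


# Example: values a container setup might inject (hypothetical).
demo = load_settings({"OLLAMA_URL": "http://ollama:11434/api/generate", "API_PORT": "9000"})
print(demo)
```

The FastAPI app could then read `ollama_url` instead of the hard-coded entry in `urls`, and the host and port could be passed to the server at startup, for example via `uvicorn.run(app, host=..., port=...)`.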

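In this simple example, using Ollama for local LLM deployment and integrating it with FastAPI to build a REST API server gives you a free AI service solution. The model can also be fine-tuned on your own training data for custom purposes (to be discussed in a future post).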
