[LLM Engineering Practice] A Hands-On Guide to Building a Local Personal AI Assistant with Ollama

1. Introduction to Ollama

Ollama is a tool for running and customizing large language models in a local environment. It offers a simple, efficient interface for creating, running, and managing these models, along with a rich library of pre-built models that can easily be integrated into a variety of applications. Ollama runs on macOS, Windows, and Linux, and can also be run in a Docker container, so users on different platforms can get started easily.

Installing Ollama is straightforward and supported on multiple platforms. On macOS, for example, it can be installed via Homebrew, while on Windows you download the installer package and run it manually. Linux users can install it through a package manager, build it from source, or run it via Docker.

With Ollama you can quickly deploy and run large models. After installation, start Ollama from the command line, pull a model, and then use it for text generation and other tasks. Ollama provides a rich set of features and APIs, including model management, model execution, and custom models.

Ollama also ships Python and JavaScript libraries, a command-line interface (CLI), and a REST API, making it easy for developers to integrate it into their applications. In addition, it supports importing models from several formats, such as GGUF, PyTorch, and Safetensors, and maintains a model library from which users can pick and pull models.
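
As a quick illustration of the Python side, a minimal chat call with the official ollama package (installed separately with pip install ollama) might look like the sketch below; it assumes the Ollama service is already running locally and that the qwen2 model has been pulled:

import ollama  # official Python client: pip install ollama

# Send one chat message to a locally pulled model and print the reply
response = ollama.chat(
    model='qwen2',
    messages=[{'role': 'user', 'content': 'Introduce yourself in one sentence.'}],
)
print(response['message']['content'])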

Ollama is also designed with privacy in mind. By default the service binds to localhost (127.0.0.1:11434), so prompts and responses stay on your own machine, and inference does not send your data to any external server. If you choose to expose the service beyond your machine, transport encryption and access control are typically added in front of it, for example with a reverse proxy.

Ollama also has plenty of room for future development, including model optimization and expansion, broader application integration, richer customization, growth of its community and ecosystem, and further improvements in security and privacy.

2. Building a Local Personal AI Assistant with Ollama

To build a local personal AI assistant on top of Ollama and interact with it from Python, first make sure Ollama is installed correctly on your machine and at least one model has been deployed.

2.1 Installing Ollama Locally

First, download the appropriate Ollama build from the official site. Using macOS as the example, download the installer from https://ollama.com/download.


Next, unzip the downloaded file to get the Ollama application. Click the icon and confirm that you want to open it to start the setup flow:

Installation is straightforward: when prompted to move the application to the Applications folder, just confirm:

On the next screen, click Next.

Then click Install to finish the installation:

Finally, a window pops up with a sample command to try; save it, then click Finish to complete the installation.

2.2 Testing the Local AI - Console Mode

Open your usual terminal and run the sample command shown above. It downloads the llama3.1 model weights, after which you can chat with the model:

ollama run llama3.1

If you want to test a Chinese-language LLM, you can use qwen2 instead; the install command is:

ollama run qwen2
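
Beyond run, a few other standard Ollama CLI commands are handy for managing local models:

ollama pull qwen2    # download a model without starting a chat session
ollama list          # list the models already downloaded locally
ollama rm qwen2      # remove a local model to free disk space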

Here, suppose we want our personal AI assistant to plan a trip for us. The test run looks like this:

(base) fyfMBP:library fyf$ ollama run qwen2
>>> I want to visit Nanjing over the National Day holiday. Please plan a 5-day itinerary for me, with recommendations for special local sights and food.
As the ancient capital of six dynasties, Nanjing has a rich historical and cultural heritage and distinctive natural scenery, blending the gentle charm of the Jiangnan water towns with the energy of a modern city. Here is a suggested five-day itinerary designed for you:
**Day 1:**
- **Morning**: Visit the Ming Xiaoling Mausoleum and the Sun Yat-sen Mausoleum, which together showcase the essence of imperial garden design and the spirit of a modern national leader.
- **Afternoon**: Tour the Nanjing Museum, whose rich collection, from prehistoric stone tools to Ming and Qing porcelain, traces the full arc of Chinese history and culture.
- **Evening**: Stroll along the Qinhuai River and take in its blend of classical and modern charm. Near the Confucius Temple (Fuzimiao) you can sample authentic Nanjing snacks such as duck blood and vermicelli soup and salted duck.
**Day 2:**
- **All day**: Visit the Memorial Hall of the Victims in the Nanjing Massacre by Japanese Invaders to learn about the history and honor the victims.
Afterwards, a visit to Yuhuatai is recommended to experience its mix of natural beauty and revolutionary heritage.
**Day 3:**
- **Morning**: Head to the Purple Mountain Observatory for a quiet morning with views over the whole city and the nearby hills.
- **Afternoon**: Explore the Laomendong historic district, a great place to experience traditional Nanjing culture, where you can stroll, shop, and try local snacks.
**Day 4:**
- **All day**: Visit the Yuhuatai Martyrs' Cemetery and the museums of modern Chinese history to commemorate the revolutionary martyrs and learn about China's modern era.
In the evening, wander around the Xinjiekou area, which is full of malls, bookstores, and cafés and is a good place to experience modern Nanjing life.
**Day 5:**
- **Morning**: Take a boat tour of Xuanwu Lake and enjoy the calm waters and the surrounding gardens.
- **Afternoon**: Visit Jiming Temple and experience its ancient Buddhist culture. Afterwards you can rest over tea at a nearby teahouse, or head to the Nanjing Library to soak up its rich cultural atmosphere.
**Food recommendations**:
1. Duck blood and vermicelli soup: a traditional Nanjing snack, rich and savory.
2. Salted duck: one of Nanjing's signature dishes, with firm, flavorful meat.
3. Nanjing roast duck buns: thin wrappers and generous, juicy filling, absolutely delicious.
This is only a rough framework; feel free to adjust it to your own interests and schedule. Enjoy your trip to Nanjing!

Here is another example. Suppose we want the model to help us look up some recent papers; we can ask it like this:

>>> please help me to retrieval some papers about CTR predictions in recent two years.
To retrieve recent research papers on Click-Through Rate (CTR) prediction, you can use academic databases such as Google Scholar, PubMed, IEEE Xplore, and the ACM Digital Library. Here's a step-by-step guide on how to search these platforms effectively:
### Google Scholar
1. **Search for Keywords**: Enter keywords like "CTR Prediction", "Machine Learning CTR", "Deep Learning for CTR", etc., in the search bar.
2. **Filter by Year**: After selecting your search results, apply filters to display only papers published within the last two years.
3. **Check Citations and Recommendations**: Google Scholar often shows highly cited articles at the top, which are influential studies that you might want to explore.
### PubMed
If your research involves biological or medical applications of CTR predictions:
1. **Search Terms**: Use terms like "Machine Learning", "Predictive Modeling", "User Behavior", along with specific disease/condition names.
2. **Date Range**: Set the date range to filter results from the last two years.
### IEEE Xplore
For papers related to engineering or computer science:
1. **Topic Navigation**: Navigate through specific journals and conferences like Transactions on Knowledge Discovery from Data (TKDD), ICDE, CIKM, WWW.
2. **Keyword Search**: Use terms such as "CTR Prediction", "Recommender Systems", "Data Mining".
### ACM Digital Library
Similar to IEEE Xplore, but focused on computing:
1. **Advanced Search**: Utilize the advanced search feature to include specific publication types (e.g., journal articles, conference papers).
2. **Keywords and Filters**: Include relevant keywords and filter by date.
### Additional Tips:
- **Author Alerts**: If you find authors whose work is particularly interesting or impactful, set up alerts for their publications.
- **Conferences and Workshops**: Focus on those related to data mining, machine learning, AI, and information retrieval as they often feature the latest CTR prediction research.
- **Networking**: Engage with researchers via social media platforms like LinkedIn or Twitter. Scholarly networks can provide insights into emerging trends and opportunities for collaboration.
### Example Search Terms:
- "CTR Prediction" AND "Deep Learning"
- "Machine Learning" AND "Recommender Systems"
- "Clickstream Analysis" AND "User Behavior"
These steps should help you effectively identify and access relevant papers on CTR prediction. Remember to read abstracts carefully, as not all will directly address your specific interest, but this process should yield a comprehensive list of pertinent research for
your needs.

Judging from this result, the assistant only gave us advice on how to search; it has no retrieval capability of its own and did not return any actual results.

We can push further and ask the assistant to list some specific papers, bearing in mind that titles generated by the model itself are not retrieved from a database and should be verified before use:

>>> please list some papers about CTR prediction.
Given the vast amount of research in the field of Click-Through Rate (CTR) prediction, here are several influential and recent studies that might be of interest:
1. **"Deep Learning Based Context-aware Recommendations: An Experiment on Douban Movie Platform"**
   - Authors: Rui Li, Qijun Jiang, Jie Tang, et al.
   - *Journal*: IEEE Transactions on Knowledge and Data Engineering
   - *Year*: 2020
   This paper discusses the effectiveness of deep learning models in context-aware recommendations, which includes CTR prediction as a key component.
2. **"CTR Prediction for Click-Streaming Data with Time-series Neural Networks"**
   - Authors: Yash M. Desai, Chien-Ju Ho
   - *Conference*: ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD)
   - *Year*: 2019
   This work introduces the use of time-series neural networks for CTR prediction in click-stream data.
3. **"Learning to Rank: From Pairwise Approach to Listwise Approach"**
   - Authors: Zhi-Ming Ma, Guo-Qiang Tian
   - *Conference*: IEEE International Conference on Data Mining (ICDM)
   - *Year*: 2018
   This paper discusses advancements in learning to rank systems which are crucial for CTR prediction in recommendation systems.
4. **"CTR Prediction Models in Online Advertising: A Survey"**
   - Authors: Xiaoxi Jiang, Qian Liang, Shih-Fu Chang
   - *Journal*: ACM Computing Surveys (CSUR)
   - *Year*: 2019
   This survey paper covers various models for CTR prediction specifically within the context of online advertising.
5. **"A Review on Deep Learning and Neural Networks in Information Retrieval"**
   - Authors: Aman P. Verma, Anuj Srivastava
   - *Journal*: Journal of Information Science
   - *Year*: 2019
   This review discusses the application of deep learning techniques including neural networks for tasks like information retrieval that often includes CTR prediction.
6. **"Deep Neural Networks for Large-Scale CTR Prediction in Display Advertising"**
   - Authors: Chien-Ju Ho, Yen-Ting Lin
   - *Conference*: International Conference on Data Mining (ICDM)
   - *Year*: 2017
   This paper explores the implementation of deep neural networks to address large-scale challenges in CTR prediction.
These papers represent a mix of different methodologies and applications related to CTR prediction, offering both theoretical advancements and practical insights into modern techniques. For deeper understanding or more specific research directions, considering
accessing these papers through academic databases like Google Scholar, IEEE Xplore, or ACM Digital Library would be highly beneficial.

2.3 Testing the Local AI - API Mode

Besides the local console mode, Ollama also exposes a REST API. Once Ollama is installed and running, it listens on port 11434, so you can call it with a simple curl request:

curl http://localhost:11434/api/generate -d '{
  "model": "qwen2",
  "prompt":"Why the sun is so hot?"
}'

Running it looks like this:

(base) fyfMBP:~ fyf$ curl http://localhost:11434/api/generate -d '{
>   "model": "qwen2",
>   "prompt":"Why the sun is so hot?"
> }'
{"model":"qwen2","created_at":"2024-10-04T01:04:11.147267Z","response":"The","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:11.42658Z","response":" Sun","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:11.722676Z","response":" appears","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:12.00233Z","response":" hot","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:12.276914Z","response":" because","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:12.559582Z","response":" it","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:12.835295Z","response":" is","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:13.13865Z","response":" an","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:13.414892Z","response":" active","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:13.684645Z","response":" star","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:13.956783Z","response":",","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:14.230772Z","response":" constantly","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:14.503081Z","response":" undergoing","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:14.77767Z","response":" nuclear","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:15.047265Z","response":" fusion","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:15.314927Z","response":" reactions","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:15.589948Z","response":" in","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:15.861257Z","response":" its","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:16.134838Z","response":" core","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:16.408961Z","response":".","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:16.68125Z","response":" The","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:16.954901Z","response":" internal","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:17.227297Z","response":" pressure","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:17.502385Z","response":" and","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:17.783161Z","response":" temperature","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:18.068877Z","response":" are","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:18.340786Z","response":" high","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:18.612701Z","response":" enough","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:18.891774Z","response":" to","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:19.164983Z","response":" fuse","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:19.440172Z","response":" hydrogen","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:19.713839Z","response":" atoms","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:19.986504Z","response":" together","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:20.257295Z","response":" into","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:20.532114Z","response":" helium","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:20.812023Z","response":" under","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:21.088888Z","response":" extreme","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:21.360339Z","response":" conditions","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:21.631833Z","response":".\n\n","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:21.9113Z","response":"Here","done":false}
{"model":"qwen2","created_at":"2024-10-04T01:04:22.187385Z","response":"'s","done":false}

You can also use the chat endpoint:

curl http://localhost:11434/api/chat -d '{
  "model": "qwen2",
  "messages": [
    { "role": "user", "content": "Why the sun is so hot?" }
  ]
}'

With both calls above, however, the reply is streamed back token by token as separate JSON objects, which is a bit hard to read. We can set the advanced parameter stream to false so that the full reply comes back in a single response (for /api/chat, the complete text then appears in the message.content field of one JSON object):

curl http://localhost:11434/api/chat -d '{
  "model": "qwen2",
  "stream": false,
  "messages": [
    { "role": "user", "content": "Why the sun is so hot?" }
  ]
}'

Going a step further, here is a simple Python example that uses the requests library to send HTTP requests to Ollama's REST API and interact with the model you deployed.

First, make sure the requests library is installed; if it is not, install it with:

pip install requests

Then you can use the following Python code to create a simple AI assistant:

import requests

# URL of the local Ollama service (it listens on port 11434 by default)
ollama_url = 'http://localhost:11434'
# Name of a model that has already been pulled, e.g. 'qwen2'
model_name = 'qwen2'

# Send a question to Ollama and return the generated answer
def ask_ollama(question):
    # Build the JSON payload; stream=False returns the whole answer in one response
    payload = {
        'model': model_name,
        'prompt': question,
        'stream': False
    }

    # Send a POST request to Ollama's generate endpoint
    response = requests.post(f'{ollama_url}/api/generate', json=payload)

    # Check whether the request succeeded
    if response.status_code == 200:
        # The generated text is in the 'response' field of the returned JSON
        return response.json()['response']
    else:
        # Print the error code if the request failed
        print('Error:', response.status_code)
        return None

# Try out the AI assistant
if __name__ == '__main__':
    question = "Hello, who are you?"
    answer = ask_ollama(question)
    if answer:
        print(f"AI Assistant: {answer}")

This code defines an ask_ollama function that takes a question, builds a request containing the model name, the prompt, and stream set to false, and sends it as a POST request to Ollama's /api/generate endpoint. If the request succeeds, the function parses the JSON response and returns the assistant's answer from its response field.

Note that this example assumes the Ollama service is running on port 11434 of your local machine and that a model named 'qwen2' has already been pulled. If your setup differs, adjust the ollama_url and model_name variables accordingly.
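
If you prefer the token-by-token output shown in the earlier curl example, a minimal streaming sketch along the following lines should also work; the stream_ollama helper is an illustrative name, not part of Ollama's libraries, and it assumes the same local endpoint while reading the newline-delimited JSON stream with requests:

import json
import requests

# Stream a reply token by token from the local Ollama /api/generate endpoint
def stream_ollama(question, model='qwen2', url='http://localhost:11434'):
    payload = {'model': model, 'prompt': question, 'stream': True}
    # stream=True tells requests not to buffer the whole response body
    with requests.post(f'{url}/api/generate', json=payload, stream=True) as resp:
        resp.raise_for_status()
        # Ollama sends one JSON object per line; print each text fragment as it arrives
        for line in resp.iter_lines():
            if not line:
                continue
            chunk = json.loads(line)
            print(chunk.get('response', ''), end='', flush=True)
            if chunk.get('done'):
                print()
                break

if __name__ == '__main__':
    stream_ollama("Why is the sun so hot?")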

3. One-Click Deployment Script

from urllib.request import urlopen
from io import BytesIO
from zipfile import ZipFile

# Download a zip archive from the given URL and extract it into extract_to
def download_and_unzip(url, extract_to='.'):
    http_response = urlopen(url)
    zipfile = ZipFile(BytesIO(http_response.read()))
    zipfile.extractall(path=extract_to)

def run():
    # step1: download the macOS build of Ollama (v0.3.12) from the GitHub releases page
    zip_file_url = 'https://github.com/ollama/ollama/releases/download/v0.3.12/Ollama-darwin.zip'
    download_and_unzip(zip_file_url)

if __name__ == '__main__':
    run()
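
The script above only covers the download step. As a rough sketch of what the remaining steps might look like, assuming the unzipped Ollama.app has been moved to /Applications (or the ollama binary is otherwise on your PATH), you could pull a model and run a quick smoke test through subprocess:

import subprocess

def pull_and_test(model='qwen2'):
    # step2: download the model weights through the Ollama CLI
    subprocess.run(['ollama', 'pull', model], check=True)
    # step3: run a one-off prompt to confirm that everything works
    result = subprocess.run(
        ['ollama', 'run', model, 'Say hello in one sentence.'],
        capture_output=True, text=True, check=True)
    print(result.stdout)

if __name__ == '__main__':
    pull_and_test()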

If you find my blog posts useful, please like, follow, and comment. I will keep sharing the latest academic papers and engineering practice on cutting-edge AI topics (especially large language models, deep learning, and computer vision) to help you stay up to date with the state of the art faster and more accurately.
