LLaMA, developed by Meta AI (formerly Facebook AI), is a family of large language models intended to democratize access to cutting-edge NLP. Compared with GPT models, it is optimized for smaller-scale setups, making it easier for researchers and developers to use.
Ollama is a framework designed to simplify deploying and using LLaMA models. It lets developers run LLaMA models locally or in cloud environments, making them easier to integrate into real applications.
Step 1: install Ollama by running curl -fsSL https://ollama.com/install.sh | sh
$ curl -fsSL https://ollama.com/install.sh | sh
# >>> Installing ollama to /usr/local
# [sudo] password for wy:
# >>> Downloading Linux amd64 bundle
# ######################################################################## 100.0%
# >>> Creating ollama user...
# >>> Adding ollama user to render group...
# >>> Adding ollama user to video group...
# >>> Adding current user to ollama group...
# >>> Creating ollama systemd service...
# >>> Enabling and starting ollama service...
# Created symlink /etc/systemd/system/default.target.wants/ollama.service → /etc/systemd/system/ollama.service.
# >>> The Ollama API is now available at 127.0.0.1:11434.
# >>> Install complete. Run "ollama" from the command line.
# WARNING: No NVIDIA/AMD GPU detected. Ollama will run in CPU-only mode.
Quickly run Meta's llama3.2:1b model:
$ ollama pull llama3.2:1b
# pulling manifest
# pulling 74701a8c35f6...  12% ▕████████████              ▏ 156 MB/1.3 GB   11 MB/s   1m42s
Network issues may occur at this step; if the download stalls, see "How do I use Ollama behind a proxy?" below for one way to resolve them.
List the models that have already been downloaded:
$ ollama ls
# NAME           ID              SIZE      MODIFIED
# llama3.2:1b    baf6a787fdff    1.3 GB    39 minutes ago
Run the model:
$ ollama run llama3.2:1b
# >>> hi
# Hello. How can I assist you today?
Call the model through the API:
$ curl http://localhost:11434/api/generate -d '{
>   "model": "llama3.2:1b",
>   "prompt":"Why is the sky blue?"
> }'
# {"model":"llama3.2:1b","created_at":"2025-02-24T06:18:38.472602481Z","response":"The","done":false}
# {"model":"llama3.2:1b","created_at":"2025-02-24T06:18:38.695488158Z","response":" sky","done":false}
$ curl http://localhost:11434/api/chat -d '{
>   "model": "llama3.2:1b",
>   "messages": [
>     { "role": "user", "content": "why is the sky blue?" }
>   ]
> }'
# {"model":"llama3.2:1b","created_at":"2025-02-24T06:19:04.859558101Z","message":{"role":"assistant","content":"The"},"done":false}
# {"model":"llama3.2:1b","created_at":"2025-02-24T06:19:05.097138857Z","message":{"role":"assistant","content":" sky"},"done":false}
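Both endpoints stream the answer as one JSON object per token by default. If a single complete response is preferred, the request can include "stream": false; a minimal sketch using the same model and prompt as above:

$ curl http://localhost:11434/api/generate -d '{
>   "model": "llama3.2:1b",
>   "prompt": "Why is the sky blue?",
>   "stream": false
> }'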
List the models that are currently running:
$ ollama ps
# NAME           ID              SIZE      PROCESSOR    UNTIL
# llama3.2:1b    baf6a787fdff    2.2 GB    100% CPU     About a minute from now
$ lspci | grep -i vga
# 03:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 41)
# 25:00.0 VGA compatible controller: NVIDIA Corporation Device 2204 (rev a1)
$ nvidia-smi
# +---------------------------------------------------------------------------------------+
# | NVIDIA-SMI 535.183.01             Driver Version: 535.183.01   CUDA Version: 12.2     |
# |-----------------------------------------+----------------------+----------------------+
# | GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
# | Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
# |                                         |                      |               MIG M. |
# |=========================================+======================+======================|
# |   0  NVIDIA GeForce RTX 3090        Off | 00000000:25:00.0 Off |                  N/A |
# |  0%   30C    P8              12W / 350W |     16MiB / 24576MiB |      0%      Default |
# |                                         |                      |                  N/A |
# +-----------------------------------------+----------------------+----------------------+
#
# +---------------------------------------------------------------------------------------+
# | Processes:                                                                             |
# |  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
# |        ID   ID                                                             Usage      |
# |=======================================================================================|
# |    0   N/A  N/A      4292      G   /usr/lib/xorg/Xorg                            4MiB |
# |    0   N/A  N/A      6635      G   /usr/lib/xorg/Xorg                            4MiB |
# +---------------------------------------------------------------------------------------+
This machine has an NVIDIA GeForce RTX 3090 with 24 GB of VRAM, so repeat the install and run steps on it:
$ curl -fsSL https://ollama.com/install.sh | sh
$ ollama pull llama3.2:1b
$ ollama run llama3.2:1b "hi"
Hello. Is there something I can help you with or would you like to chat?
Check whether the model was successfully loaded onto the GPU:
$ ollama ps
NAME           ID              SIZE      PROCESSOR    UNTIL
llama3.2:1b    baf6a787fdff    2.7 GB    100% GPU     3 minutes from now
How do I use Ollama behind a proxy?
$ sudo mkdir -p /etc/systemd/system/ollama.service.d
$ cd /etc/systemd/system/ollama.service.d
$ sudo vim http-proxy.conf
Write the following into http-proxy.conf:
[Service]
Environment="HTTPS_PROXY=http://127.0.0.1:7890"
This assumes a proxy server is listening on port 7890.
$ sudo systemctl daemon-reload
$ sudo systemctl restart ollama.service
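To confirm the override took effect, you can inspect the service's environment and retry the pull; a quick sanity check, not part of the original walkthrough:

$ systemctl show ollama --property=Environment   # should include HTTPS_PROXY=http://127.0.0.1:7890
$ ollama pull llama3.2:1b                        # the download should now go through the proxy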
The OLLAMA_MODELS environment variable changes the default location where models are stored.
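For the systemd-managed install used here, one way to set it is another drop-in file, mirroring the proxy configuration above. A sketch assuming /data/ollama/models as the target directory (any path writable by the ollama user works; the file name models.conf is arbitrary):

$ sudo mkdir -p /data/ollama/models
$ sudo chown -R ollama:ollama /data/ollama/models   # the service runs as the ollama user
$ sudo tee /etc/systemd/system/ollama.service.d/models.conf <<'EOF'
> [Service]
> Environment="OLLAMA_MODELS=/data/ollama/models"
> EOF
$ sudo systemctl daemon-reload
$ sudo systemctl restart ollama.service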
How do I manage the maximum number of requests the Ollama server will queue?
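The queue length is controlled by the OLLAMA_MAX_QUEUE environment variable; once the queue is full, additional requests are rejected until it drains. A sketch using the same drop-in mechanism (256 is an arbitrary example value):

$ sudo tee /etc/systemd/system/ollama.service.d/queue.conf <<'EOF'
> [Service]
> Environment="OLLAMA_MAX_QUEUE=256"
> EOF
$ sudo systemctl daemon-reload
$ sudo systemctl restart ollama.service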
How does Ollama handle concurrent requests? (See the Ollama FAQ.)
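Briefly: the server can keep more than one model loaded and serve several requests per model in parallel, subject to available memory; the relevant environment variables are OLLAMA_NUM_PARALLEL and OLLAMA_MAX_LOADED_MODELS (see the FAQ for exact semantics and defaults). A sketch with arbitrary example values:

$ sudo tee /etc/systemd/system/ollama.service.d/concurrency.conf <<'EOF'
> [Service]
> Environment="OLLAMA_NUM_PARALLEL=4"
> Environment="OLLAMA_MAX_LOADED_MODELS=2"
> EOF
$ sudo systemctl daemon-reload
$ sudo systemctl restart ollama.service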
What about OpenAI compatibility? (See Ollama's OpenAI compatibility documentation.)
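Ollama also exposes an OpenAI-compatible API under /v1 on the same port, so existing OpenAI SDK clients can point their base URL at http://localhost:11434/v1 (the API key can be any placeholder; it is not checked by the server). A minimal curl sketch against the chat completions endpoint:

$ curl http://localhost:11434/v1/chat/completions \
>   -H "Content-Type: application/json" \
>   -d '{
>     "model": "llama3.2:1b",
>     "messages": [
>       { "role": "user", "content": "why is the sky blue?" }
>     ]
>   }'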