Llama2 Local Deployment (Big Data / AI)


https://github.com/FlagAlpha/Llama2-Chinese

https://huggingface.co/FlagAlpha

https://hub.docker.com/r/longerhuya/llama2-chinese-7b

https://hub.docker.com/r/ninthkat/jupyterlab-pytorch-cuda


# Installing the NVIDIA Container Toolkit

https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html

curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo | \
  sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo

yum install -y nvidia-container-toolkit

-Configuring Docker

nvidia-ctk runtime configure --runtime=docker

systemctl restart docker
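Before launching GPU containers, it helps to confirm the host driver is actually visible. A minimal check from Python, assuming only that `nvidia-smi` ships with the NVIDIA driver (the helper name is my own):

```python
import shutil
import subprocess

def gpu_driver_available() -> bool:
    """Return True if nvidia-smi is installed and runs successfully."""
    path = shutil.which("nvidia-smi")
    if path is None:
        return False
    try:
        # --list-gpus prints one line per visible GPU
        result = subprocess.run([path, "--list-gpus"], capture_output=True, timeout=10)
        return result.returncode == 0
    except (OSError, subprocess.TimeoutExpired):
        return False

print(gpu_driver_available())
```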


docker run -d --restart=always --name llama2 \
  -p 9999:8888 -p 15550:15550 -p 15551:15551 \
  -v /data/site/docker/data/llama2:/home/jovyan \
  -v /etc/localtime:/etc/localtime:ro \
  -e TZ='Asia/Shanghai' \
  --shm-size 12G \
  ninthkat/jupyterlab-pytorch-cuda:latest
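Once the container is up, a quick way to confirm the published ports (9999 for JupyterLab, 15550/15551 for the app) are actually listening — a small sketch using only the standard library; host and port numbers come from the `docker run` line above:

```python
import socket

def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Attempt a TCP connection; True means something is listening on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

for port in (9999, 15550, 15551):
    print(port, is_port_open("127.0.0.1", port))
```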


http://g.htmltoo.com:9999


-Alternative images and option notes:

-p 15550:15550  -p 15551:15551 

bitnami/pytorch

--shm-size 16G  

nvcr.io/nvidia/pytorch:21.08-py3


docker exec -it llama2 /bin/bash

-Install conda

cd /notebooks

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh

bash Miniconda3-latest-Linux-x86_64.sh

export PATH="/root/miniconda3/bin:$PATH"

-Create a conda environment:

python --version

conda --version

conda create -n llama2 python=3.10.11

-Install dependencies

conda init

conda activate llama2


docker exec -it llama2 env LANG=C.UTF-8 /bin/bash

cd /home/jovyan

conda activate llama2

git clone https://github.com/facebookresearch/llama.git

cd  llama

-Install dependencies:

pip install -e .

pip install -r requirements.txt  -i https://pypi.tuna.tsinghua.edu.cn/simple


-Pull the code and model weights

git clone https://github.com/FlagAlpha/Llama2-Chinese.git

-Pull the Llama2-Chinese-13b-Chat model weights and code

cd Llama2-Chinese

git clone https://huggingface.co/FlagAlpha/Llama2-Chinese-13b-Chat

-Check the download size:

du -sh Llama2-Chinese-13b-Chat

-Output:

25G    Llama2-Chinese-13b-Chat
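The same size check can be done from inside a notebook without shelling out to `du` — a small sketch using only the standard library (the helper names are my own):

```python
import os

def dir_size_bytes(path: str) -> int:
    """Sum the sizes of all regular files under path, like `du -sb` (apparent size)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            fp = os.path.join(root, name)
            if os.path.isfile(fp):
                total += os.path.getsize(fp)
    return total

def human(n: float) -> str:
    """Format a byte count with binary units, similar to `du -h`."""
    for unit in ("B", "K", "M", "G", "T"):
        if n < 1024 or unit == "T":
            return f"{n:.1f}{unit}" if unit != "B" else f"{int(n)}B"
        n /= 1024

print(human(dir_size_bytes("Llama2-Chinese-13b-Chat")))
```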


---Terminal test

-Enter the Python interpreter:

python3

-Run the following code:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# load_in_8bit=True requires the bitsandbytes package (pip install bitsandbytes)
model = AutoModelForCausalLM.from_pretrained('Llama2-Chinese-13b-Chat', device_map='auto', torch_dtype=torch.float16, load_in_8bit=True)
model = model.eval()
tokenizer = AutoTokenizer.from_pretrained('Llama2-Chinese-13b-Chat', use_fast=False)
tokenizer.pad_token = tokenizer.eos_token
# Prompt follows the Llama2-Chinese chat template: <s>Human: ...\n</s><s>Assistant:
input_ids = tokenizer(['<s>Human: 介绍一下深圳\n</s><s>Assistant: '], return_tensors="pt", add_special_tokens=False).input_ids.to('cuda')
generate_input = {
    "input_ids": input_ids,
    "max_new_tokens": 512,
    "do_sample": True,
    "top_k": 50,
    "top_p": 0.95,
    "temperature": 0.3,
    "repetition_penalty": 1.3,
    "eos_token_id": tokenizer.eos_token_id,
    "bos_token_id": tokenizer.bos_token_id,
    "pad_token_id": tokenizer.pad_token_id
}
generate_ids = model.generate(**generate_input)
text = tokenizer.decode(generate_ids[0])
print(text)


---Web page test

-Build a page with Gradio

pip3 install gradio -i https://pypi.tuna.tsinghua.edu.cn/simple

---Load the model and start the service

vi /notebooks/Llama2-Chinese/examples/chat_gradio.py

Go to line 94:
    demo.queue().launch(share=False, debug=True, server_name="0.0.0.0")
Change it to:
    demo.queue().launch(share=False, debug=True, server_name="0.0.0.0", server_port=15550)

Start the service:

python3 examples/chat_gradio.py --model_name_or_path Llama2-Chinese-13b-Chat


-If the following error appears:

  File "/notebooks/Llama2-Chinese/examples/chat_gradio.py", line 94
    demo.queue().launch(share=False, debug=True, server_name="0.0.0.0")
                                               ^
SyntaxError: invalid character ',' (U+FF0C)

-fix the code as follows:

vi /notebooks/Llama2-Chinese/examples/chat_gradio.py
:94
Replace the fullwidth Chinese commas (,) with ASCII commas (,):
94    demo.queue().launch(share=False,debug=True,server_name="0.0.0.0")
=>
94    demo.queue().launch(share=False, debug=True, server_name="0.0.0.0")

---Test

http://g.htmltoo.com:15550



Downloading the official Llama2 weights requires an application on Meta's website. Submitting the application may require a proxy; the download itself does not.

Application URL:

https://ai.meta.com/resources/models-and-libraries/llama-downloads/


cd  /opt

git clone https://github.com/facebookresearch/llama.git

cd llama

pip3 install -e .

pip3 install --upgrade torch torchvision fastNLP -i https://pypi.tuna.tsinghua.edu.cn/simple


Llama2 comes in three sizes: 7B, 13B and 70B (7, 13 and 70 billion parameters); more parameters demand a higher hardware configuration.

Each size ships as a base (fine-tunable) model and a chat model; here I pick the first one, the 7B version.

Llama-2-7b

Llama-2-7b-chat

Llama-2-13b

Llama-2-13b-chat

Llama-2-70b

Llama-2-70b-chat
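A rough rule of thumb for sizing hardware against the list above: weight memory ≈ parameter count × bytes per parameter (2 for fp16, 1 for int8, 4 for fp32), plus extra for activations and the KV cache. A sketch of that arithmetic (weights only; the helper name is my own):

```python
def weight_memory_gb(params_billions: float, bytes_per_param: int) -> float:
    """Approximate memory for model weights alone, in GiB."""
    return params_billions * 1e9 * bytes_per_param / 1024**3

# ~13 GiB fp16 for 7B matches the ~13 GB download size mentioned below
for size in (7, 13, 70):
    fp16 = weight_memory_gb(size, 2)
    int8 = weight_memory_gb(size, 1)
    print(f"{size}B: ~{fp16:.0f} GiB fp16, ~{int8:.0f} GiB int8 (weights only)")
```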


bash download.sh

Follow the prompts: paste the download URL provided in the email, choose the versions to download, and wait for the download to finish. The 7B files are about 13 GB and take a long time; the other versions are larger.



https://blog.csdn.net/cecere/article/details/132120423

https://blog.csdn.net/zengNLP/article/details/131965453

https://zhuanlan.zhihu.com/p/647067870

