
PyTorch Basics: TorchServe Model Deployment and Inference

Author: 山顶夕景

TorchServe is implemented on top of the Netty network framework: under the hood it uses EpollServerSocketChannel for network communication and relies on epoll multiplexing to handle large numbers of concurrent connections. This article walks through deploying a PyTorch model with TorchServe and running inference against it.


1. The torchserve and archiver modules


Install the three components with pip:

pip install torchserve torch-model-archiver torch-workflow-archiver

torchserve runs the model server, torch-model-archiver packages a trained model into a .mar archive that the server can load, and torch-workflow-archiver does the same for multi-model workflows.
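To confirm the installation, the installed versions can be checked with the standard library; a minimal sketch:

from importlib.metadata import version

# The distribution names match the pip package names above.
for pkg in ("torchserve", "torch-model-archiver", "torch-workflow-archiver"):
    print(pkg, version(pkg))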

2. Deploying a Speech2Text Wav2Vec2 model

2.1 Preparing the model and a custom handler

# 1. Download the Hugging Face model and save it locally
from transformers import AutoModelForCTC, AutoProcessor
import os
modelname = "facebook/wav2vec2-base-960h"
model = AutoModelForCTC.from_pretrained(modelname)
processor = AutoProcessor.from_pretrained(modelname)
modelpath = "model"
os.makedirs(modelpath, exist_ok=True)
model.save_pretrained(modelpath)
processor.save_pretrained(modelpath)
# 2. Custom handler (save this class as handler.py; it is referenced when packaging below)
import torch
import torchaudio
from transformers import AutoProcessor, AutoModelForCTC
import io
class Wav2VecHandler(object):
    def __init__(self):
        self._context = None
        self.initialized = False
        self.model = None
        self.processor = None
        self.device = None
        # Sampling rate for Wav2Vec model must be 16k
        self.expected_sampling_rate = 16_000
    def initialize(self, context):
        """Initialize properties and load model"""
        self._context = context
        self.initialized = True
        properties = context.system_properties
        # See https://pytorch.org/serve/custom_service.html#handling-model-execution-on-multiple-gpus
        self.device = torch.device("cuda:" + str(properties.get("gpu_id")) if torch.cuda.is_available() else "cpu")
        model_dir = properties.get("model_dir")
        self.processor = AutoProcessor.from_pretrained(model_dir)
        self.model = AutoModelForCTC.from_pretrained(model_dir)
    def handle(self, data, context):
        """Transform input to tensor, resample, run model and return transcribed text."""
        input = data[0].get("data")
        if input is None:
            input = data[0].get("body")
        # torchaudio.load accepts file like object, here `input` is bytes
        model_input, sample_rate = torchaudio.load(io.BytesIO(input), format="WAV")
        # Ensure sampling rate is the same as the trained model
        if sample_rate != self.expected_sampling_rate:
            model_input = torchaudio.functional.resample(model_input, sample_rate, self.expected_sampling_rate)
        model_input = self.processor(model_input, sampling_rate = self.expected_sampling_rate, return_tensors="pt").input_values[0]
        logits = self.model(model_input)[0]
        pred_ids = torch.argmax(logits, axis=-1)[0]
        output = self.processor.decode(pred_ids)
        return [output]

A custom handler must implement two methods: initialize(context), which reads the runtime properties and loads the model and processor from the model directory, and handle(data, context), which turns the request payload into a tensor, resamples it if needed, runs the model, and returns the transcribed text.
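To see how TorchServe drives these two methods, the handler can be smoke-tested locally before packaging. This is a sketch: MockContext is a hypothetical stand-in for the context object TorchServe passes in, and the handler only reads its system_properties dict; sample.wav is assumed to be a WAV file on disk.

# Minimal local smoke test for the handler, outside TorchServe.
class MockContext:
    # Only the keys the handler actually reads need to be present.
    system_properties = {"model_dir": "model", "gpu_id": 0}

handler = Wav2VecHandler()
handler.initialize(MockContext())

# TorchServe delivers each request as a list of dicts keyed "data" or "body".
with open("sample.wav", "rb") as f:
    print(handler.handle([{"data": f.read()}], MockContext()))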

2.2 Packaging the model and starting the model API service

# Package the deployable model files into a .mar archive for TorchServe
torch-model-archiver --model-name Wav2Vec2 --version 1.0 --serialized-file model/pytorch_model.bin --handler ./handler.py --extra-files "model/config.json,model/special_tokens_map.json,model/tokenizer_config.json,model/vocab.json,model/preprocessor_config.json" -f
mkdir -p model_store && mv Wav2Vec2.mar model_store
# Start the model server, load the packaged model, and expose gRPC and HTTP inference endpoints
torchserve --start --model-store model_store --models Wav2Vec2=Wav2Vec2.mar --ncs
# Once the server is running, try it with:
curl -X POST http://127.0.0.1:8080/predictions/Wav2Vec2 --data-binary '@./sample.wav' -H "Content-Type: audio/basic"
# Stop TorchServe serving
torchserve --stop
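The same inference request can be issued from Python; a sketch equivalent to the curl call above, assuming the third-party requests package is installed and TorchServe is running locally with the Wav2Vec2 model loaded:

import requests

# POST the raw WAV bytes to the inference endpoint (default port 8080).
with open("sample.wav", "rb") as f:
    resp = requests.post(
        "http://127.0.0.1:8080/predictions/Wav2Vec2",
        data=f.read(),
        headers={"Content-Type": "audio/basic"},
    )
print(resp.status_code, resp.text)  # expect 200 and the transcribed text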

2.3 Notes on the relevant parameters

torch-model-archiver: used to package the model:
usage: torch-model-archiver [-h] --model-name MODEL_NAME
                            [--serialized-file SERIALIZED_FILE]
                            [--model-file MODEL_FILE] --handler HANDLER
                            [--extra-files EXTRA_FILES]
                            [--runtime {python,python2,python3}]
                            [--export-path EXPORT_PATH]
                            [--archive-format {tgz,no-archive,default}] [-f]
                            -v VERSION [-r REQUIREMENTS_FILE]

torchserve: this component loads the model packaged above and exposes inference services to the outside over interfaces such as gRPC and HTTP.
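Besides the inference endpoint, TorchServe also runs a management API (on port 8081 by default), which can be used to confirm that the model was registered; a sketch assuming a default local configuration and the third-party requests package:

import requests

# List all registered models, then describe the Wav2Vec2 model.
print(requests.get("http://127.0.0.1:8081/models").json())
print(requests.get("http://127.0.0.1:8081/models/Wav2Vec2").json())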


This concludes this introduction to TorchServe model deployment and inference with PyTorch.
