Deploying whisperX Locally

Open-source project: https://github.com/m-bain/whisperX

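1. Fast transcription (faster than real time)

2. Timestamp alignment

3. Speaker diarization

4. Voice activity detection (VAD)

5. Improved efficiency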

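Environment setup:

1. Install Python 3.10. During installation, remember to tick "Add python to PATH". Afterwards, run python -V in cmd; if it prints the Python version, the install worked.

2. Install CUDA 12.1. Afterwards, run nvidia-smi in cmd; if it prints the NVIDIA driver and CUDA version information, the driver is working (nvcc --version reports the installed toolkit version).

3. Install cuDNN for CUDA 12.x (registration is required first): copy the 3 folders from the cuDNN archive into the CUDA installation directory.

4. Install PyTorch: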

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

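Turn off your proxy tool before running this command (it fails with the proxy on). A VPN is still recommended, since the download is much faster with one.

After installation, run the following in Python: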

import torch 
torch.cuda.is_available()

If it returns True, Torch can use the GPU.
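The one-line check above can be extended into a small report, which helps later if the result unexpectedly changes. This is a minimal sketch, not part of the original setup: the function name `torch_cuda_report` is hypothetical, and it only uses `torch.__version__`, `torch.version.cuda`, `torch.cuda.is_available()` and `torch.cuda.get_device_name()`.

```python
def torch_cuda_report():
    """Collect the facts that decide whether PyTorch can use the GPU."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in the active environment.
        return {"torch_installed": False}

    available = torch.cuda.is_available()
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # None here means a CPU-only build, whatever the driver supports.
        "built_with_cuda": torch.version.cuda,
        "cuda_available": available,
        "gpu_name": torch.cuda.get_device_name(0) if available else None,
    }


if __name__ == "__main__":
    for key, value in torch_cuda_report().items():
        print(f"{key}: {value}")
```

If `built_with_cuda` prints None, the installed wheel is a CPU-only build, and no driver or CUDA toolkit change will make `torch.cuda.is_available()` return True.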

5. Install whisperX: pip install git+https://github.com/m-bain/whisperx.git

6. Install ffmpeg
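Several of the steps so far are verified by typing a command into cmd. As a rough sketch, the same check can be done in one go by looking each tool up on PATH. The helper name `find_tools` and the tool list are assumptions for illustration; `nvcc` is only present if the CUDA toolkit's bin directory is on PATH.

```python
import shutil

# Command-line tools the setup steps rely on (assumed list, adjust as needed).
REQUIRED_TOOLS = ("python", "nvidia-smi", "nvcc", "ffmpeg")


def find_tools(names=REQUIRED_TOOLS):
    """Map each tool name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path if path else 'NOT FOUND on PATH'}")
```

A missing entry only shows that the tool is not on PATH for the current shell. It does not show that the tool is not installed, so reopening cmd after an install is worth trying first.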

7. ~~Test with https://github.com/Pikurrot/whisper-gui~~ (this did not succeed in the end)

pip install gradio
git clone https://github.com/Pikurrot/whisper-gui
cd whisper-gui
python main.py --autolaunch

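After you select a model, it is downloaded on first use, so expect a wait.

CUDA could not be used. Others have reported the same problem in the project's Issues, with no resolution. From a look at the code, gpu_support in the environment configuration appears to have been set to None.

Attempted fix:

1. Install Miniconda and set up an environment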

conda create --name whisperx python=3.10
conda activate whisperx

2. Install PyTorch

conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia

3. Install whisperX

pip install git+https://github.com/m-bain/whisperx.git

4. Install gradio

pip install gradio

5. ~~Install whisper-gui~~ (this did not succeed in the end)

git clone https://github.com/Pikurrot/whisper-gui

No luck: CUDA still could not be used.

Trying to locate the problem:

import time

import torch
import whisperx


def whisperx_test():
    # device = "cpu"  # switch to this line to run on the CPU
    device = "cuda"
    model_size = "tiny"  # "large-v3" is more accurate but much larger
    audio_file = "16k16bit.wav"
    batch_size = 16
    compute_type = "int8"

    # Load the model on the selected device (tested on Windows)
    model = whisperx.load_model(model_size, device, compute_type=compute_type)

    audio = whisperx.load_audio(audio_file)
    result = model.transcribe(audio, batch_size=batch_size)

    print(result["segments"])


if __name__ == "__main__":
    # Show up front whether this Torch build can see the GPU
    print("CUDA available:", torch.cuda.is_available())
    start_time = time.time()  # start time
    print("start time:", start_time)
    # whisper_test()
    # faster_whisper_test()
    whisperx_test()
    end_time = time.time()  # end time
    print("Execution time: ", end_time - start_time, "seconds")

Error: AssertionError: Torch not compiled with CUDA enabled

This points to a problem with the Torch installation.

import torch 
torch.cuda.is_available()

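The result had become False.

Reinstalling Torch fixed the problem, and the Python test script then ran normally. whisper-gui still did not work, though; it may have a bug.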
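Since the Torch build in an environment can change without notice, as it did here, a test script can choose its device at run time instead of hard-coding "cuda". The following is a minimal sketch under that assumption; `pick_device` is a hypothetical helper, not part of whisperX or PyTorch.

```python
def pick_device():
    """Return "cuda" when this Torch build can use the GPU, else "cpu"."""
    try:
        import torch
    except ImportError:
        print("warning: torch is not installed, falling back to cpu")
        return "cpu"

    if torch.cuda.is_available():
        return "cuda"

    # torch.version.cuda is None for CPU-only builds of PyTorch.
    print(f"warning: CUDA unavailable (torch built with CUDA: "
          f"{torch.version.cuda}), falling back to cpu")
    return "cpu"


if __name__ == "__main__":
    print("selected device:", pick_device())
```

In the test script, `device = pick_device()` would replace the hard-coded `device = "cuda"`. The script then runs slowly on the CPU and prints a warning, rather than stopping with the AssertionError.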

Reference: https://www.cnblogs.com/aehyok/p/18183895
