Building OpenCV with CUDA on Linux

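This article describes how to build OpenCV with its CUDA modules, so that OpenCV operations can run on the GPU and process data faster. It records the process of building OpenCV on Linux and the problems encountered along the way.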

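Build environment

  • OS: Ubuntu 16.04

  • GPU: RTX 3090

  • cmake: 3.19.3

  • gcc: 6.5.0

Note: the versions of cmake and gcc that ship with the system are not 3.19.3 and 6.5.0. I upgraded cmake to 3.19.3 hoping to fix file-download failures during the cmake step, but it did not seem to help. gcc was originally 5.x.x; upgrading it did eliminate some errors.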

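Software dependencies

  • CUDA 11.1
  • cuDNN 8.0.5

With the environment above, this article builds OpenCV 4.4.0, including the C++ and Python 3 interfaces. (Before 4.4.0 I tried building 4.2.0, and cmake could never find cuDNN. The CUDA and cuDNN versions were probably too new for the older OpenCV release, so I recommend building OpenCV 4.4.0 or later. A similar experience is described in [1].)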

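Downloading the source

Download OpenCV 4.4.0 and opencv_contrib-4.4.0 from GitHub.

After extracting both, move opencv_contrib-4.4.0 into the opencv-4.4.0 directory and create a build directory (build) there as well, so that opencv-4.4.0 contains both opencv_contrib-4.4.0 and build.

[Figure: directory layout]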

Installing dependencies

The following is based on: How to install OpenCV 4.2.0 with CUDA 10.1 on Ubuntu 20.04 LTS (Focal Fossa)

# Update the system
sudo apt update
sudo apt upgrade

# Build tools
sudo apt install build-essential cmake pkg-config unzip yasm git checkinstall

#Image I/O libs
sudo apt install libjpeg-dev libpng-dev libtiff-dev

#Video/Audio Libs — FFMPEG, GSTREAMER, x264 and so on.
sudo apt install libavcodec-dev libavformat-dev libswscale-dev libavresample-dev
sudo apt install libgstreamer1.0-dev libgstreamer-plugins-base1.0-dev
sudo apt install libxvidcore-dev x264 libx264-dev libfaac-dev libmp3lame-dev libtheora-dev
sudo apt install libvorbis-dev

#OpenCore — Adaptive Multi Rate Narrow Band (AMRNB) and Wide Band (AMRWB) speech codec
sudo apt install libopencore-amrnb-dev libopencore-amrwb-dev

#Cameras programming interface libs
sudo apt-get install libdc1394-22 libdc1394-22-dev libxine2-dev libv4l-dev v4l-utils
cd /usr/include/linux
sudo ln -s -f ../libv4l1-videodev.h videodev.h

#GTK lib for the graphical user functionalities coming from OpenCV highgui module
sudo apt-get install libgtk-3-dev

#Python libraries for python3 (skipped in this article, because OpenCV is installed into a conda environment)
sudo apt-get install python3-dev python3-pip
sudo -H pip3 install -U pip numpy
sudo apt install python3-testresources

#Parallelism library C++ for CPU
sudo apt-get install libtbb-dev

#Optimization libraries for OpenCV
sudo apt-get install libatlas-base-dev gfortran

#Optional libraries
sudo apt-get install libprotobuf-dev protobuf-compiler
sudo apt-get install libgoogle-glog-dev libgflags-dev
sudo apt-get install libgphoto2-dev libeigen3-dev libhdf5-dev doxygen

Generating the build files with cmake

Change into the build directory and run the following command:

cmake -D CMAKE_BUILD_TYPE=RELEASE \
-D CMAKE_INSTALL_PREFIX=/usr/local \
-D CMAKE_C_COMPILER=/usr/bin/gcc-6 \
-D INSTALL_PYTHON_EXAMPLES=ON \
-D INSTALL_C_EXAMPLES=ON \
-D OPENCV_ENABLE_NONFREE=ON \
-D BUILD_opencv_python3=ON \
-D WITH_CUDA=ON \
-D WITH_CUDNN=ON \
-D WITH_TBB=ON \
-D OPENCV_DNN_CUDA=ON \
-D ENABLE_FAST_MATH=1 \
-D CUDA_FAST_MATH=1 \
-D CUDA_ARCH_BIN=8.6 \
-D WITH_CUBLAS=1 \
-D OPENCV_GENERATE_PKGCONFIG=ON \
-D OPENCV_EXTRA_MODULES_PATH=/home/xx/soft/opencv_gpu/opencv-4.4.0/opencv_contrib-4.4.0/modules \
-D PYTHON3_EXECUTABLE=/home/xx/anaconda3/envs/py37/bin/python3.7m \
-D PYTHON3_INCLUDE_DIR=/home/xx/anaconda3/envs/py37/include/python3.7m \
-D PYTHON3_LIBRARY=/home/xx/anaconda3/envs/py37/lib/libpython3.7m.so \
-D PYTHON3_NUMPY_INCLUDE_DIRS=/home/xx/anaconda3/envs/py37/lib/python3.7/site-packages/numpy/core/include \
-D PYTHON3_PACKAGES_PATH=/home/xx/anaconda3/envs/py37/lib/python3.7/site-packages \
-D PYTHON_DEFAULT_EXECUTABLE=/home/xx/anaconda3/envs/py37/bin/python3.7m \
-D CUDNN_LIBRARY=/usr/local/cuda/lib64/libcudnn.so.8.0.5 \
-D CUDNN_INCLUDE_DIR=/usr/local/cuda/include \
-D CUDA_CUDA_LIBRARY=/usr/local/cuda/lib64/stubs/libcuda.so \
-D OPENCV_PYTHON3_INSTALL_PATH=/home/xx/anaconda3/envs/py37/lib/python3.7/site-packages \
-D WITH_WEBP=OFF \
-D WITH_OPENCL=OFF \
-D ENABLE_CXX11=ON \
-D BUILD_EXAMPLES=OFF \
-D WITH_OPENGL=ON \
-D WITH_GSTREAMER=ON \
-D WITH_V4L=ON \
-D WITH_QT=OFF \
-D BUILD_opencv_python2=OFF \
-D HAVE_opencv_python3=ON ..
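
The Python-related paths in the command above are specific to one conda environment. As a minimal sketch, assuming it is run with the interpreter of the target environment, the standard library can report most of these values. The helper name opencv_python_cmake_flags is hypothetical, and PYTHON3_LIBRARY is left out because its location varies between installations.

```python
import sys
import sysconfig


def opencv_python_cmake_flags():
    """Collect interpreter-specific paths matching the -D PYTHON3_* options."""
    paths = sysconfig.get_paths()
    flags = {
        "PYTHON3_EXECUTABLE": sys.executable,
        "PYTHON3_INCLUDE_DIR": paths["include"],
        "PYTHON3_PACKAGES_PATH": paths["purelib"],
        # Assumption: the bindings go into the same site-packages directory.
        "OPENCV_PYTHON3_INSTALL_PATH": paths["purelib"],
    }
    try:
        import numpy  # optional: only needed for the NumPy include directory
        flags["PYTHON3_NUMPY_INCLUDE_DIRS"] = numpy.get_include()
    except ImportError:
        pass  # NumPy is not installed in this interpreter
    return flags


if __name__ == "__main__":
    for name, value in opencv_python_cmake_flags().items():
        print(f"-D {name}={value} \\")
```

The printed lines can be compared against the hand-written paths before running cmake; they are a starting point to check, not a guaranteed match.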

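Key parameters

BUILD_opencv_python3: builds the Python 3 bindings
CUDA_ARCH_BIN: the GPU's compute capability, which can be looked up on NVIDIA's website; for the RTX 3090 it is 8.6
OPENCV_GENERATE_PKGCONFIG: generates the pkg-config file; be sure to turn this on, otherwise pkg-config cannot find OpenCV even after a successful install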

After cmake finishes, confirm that it ends with output like the following:

-- General configuration for OpenCV 4.4.0 =====================================
Version control: unknown
--
Extra modules:
Location (extra): /home/xx/soft/opencv_gpu/opencv-4.4.0/opencv_contrib-4.4.0/modules
Version control (extra): unknown
--
Platform:
Timestamp: 2021-02-05T02:31:19Z
Host: Linux 4.15.0-133-generic x86_64
CMake: 3.19.3
CMake generator: Unix Makefiles
CMake build tool: /usr/bin/make
Configuration: RELEASE
--
CPU/HW features:
Baseline: SSE SSE2 SSE3
requested: SSE3
Dispatched code generation: SSE4_1 SSE4_2 FP16 AVX AVX2 AVX512_SKX
requested: SSE4_1 SSE4_2 AVX FP16 AVX2 AVX512_SKX
SSE4_1 (17 files): + SSSE3 SSE4_1
SSE4_2 (2 files): + SSSE3 SSE4_1 POPCNT SSE4_2
FP16 (1 files): + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 AVX
AVX (5 files): + SSSE3 SSE4_1 POPCNT SSE4_2 AVX
AVX2 (31 files): + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2
AVX512_SKX (7 files): + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2 AVX_512F AVX512_COMMON AVX512_SKX
--
C/C++:
Built as dynamic libs?: YES
C++ standard: 11
C++ Compiler: /usr/bin/c++ (ver 6.5.0)
C++ flags (Release): -fsigned-char -ffast-math -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Winit-self -Wno-psabi -Wsuggest-override -Wno-delete-non-virtual-dtor -Wno-comment -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -fvisibility-inlines-hidden -O3 -DNDEBUG -DNDEBUG
C++ flags (Debug): -fsigned-char -ffast-math -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Winit-self -Wno-psabi -Wsuggest-override -Wno-delete-non-virtual-dtor -Wno-comment -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -fvisibility-inlines-hidden -g -O0 -DDEBUG -D_DEBUG
C Compiler: /usr/bin/gcc-6
C flags (Release): -fsigned-char -ffast-math -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wuninitialized -Winit-self -Wno-psabi -Wno-comment -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -DNDEBUG -DNDEBUG
C flags (Debug): -fsigned-char -ffast-math -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wuninitialized -Winit-self -Wno-psabi -Wno-comment -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -g -O0 -DDEBUG -D_DEBUG
Linker flags (Release): -Wl,--gc-sections -Wl,--as-needed
Linker flags (Debug): -Wl,--gc-sections -Wl,--as-needed
ccache: NO
Precompiled headers: NO
Extra dependencies: m pthread cudart_static dl rt nppc nppial nppicc nppidei nppif nppig nppim nppist nppisu nppitc npps cublas cudnn cufft -L/usr/local/cuda/lib64 -L/usr/lib/x86_64-linux-gnu
3rdparty dependencies:
--
OpenCV modules:
To be built: alphamat aruco bgsegm bioinspired calib3d ccalib core cudaarithm cudabgsegm cudacodec cudafeatures2d cudafilters cudaimgproc cudalegacy cudaobjdetect cudaoptflow cudastereo cudawarping cudev datasets dnn dnn_objdetect dnn_superres dpm face features2d flann freetype fuzzy gapi hdf hfs highgui img_hash imgcodecs imgproc intensity_transform line_descriptor ml objdetect optflow phase_unwrapping photo plot python3 quality rapid reg rgbd saliency sfm shape stereo stitching structured_light superres surface_matching text tracking ts video videoio videostab xfeatures2d ximgproc xobjdetect xphoto
Disabled: python2 world
Disabled by dependency: -
Unavailable: cnn_3dobj cvv java js julia matlab ovis viz
Applications: tests perf_tests apps
Documentation: NO
Non-free algorithms: YES
--
GUI:
GTK+: YES (ver 3.18.9)
GThread : YES (ver 2.48.2)
GtkGlExt: NO
OpenGL support: NO
VTK support: NO
--
Media I/O:
ZLib: /usr/lib/x86_64-linux-gnu/libz.so (ver 1.2.8)
JPEG: /usr/lib/x86_64-linux-gnu/libjpeg.so (ver 80)
PNG: /usr/lib/x86_64-linux-gnu/libpng.so (ver 1.2.54)
TIFF: /usr/lib/x86_64-linux-gnu/libtiff.so (ver 42 / 4.0.6)
JPEG 2000: OpenJPEG (ver 2.4.0)
OpenEXR: build (ver 2.3.0)
HDR: YES
SUNRASTER: YES
PXM: YES
PFM: YES
--
Video I/O:
DC1394: YES (2.2.4)
FFMPEG: YES
avcodec: YES (56.60.100)
avformat: YES (56.40.101)
avutil: YES (54.31.100)
swscale: YES (3.1.101)
avresample: YES (2.1.0)
GStreamer: YES (1.8.3)
v4l/v4l2: YES (linux/videodev2.h)
--
Parallel framework: TBB (ver 4.4 interface 9002)
--
Trace: YES (with Intel ITT)
--
Other third-party libraries:
Lapack: YES (/usr/lib/libopenblas.so)
Eigen: YES (ver 3.2.92)
Custom HAL: NO
Protobuf: build (3.5.1)
--
NVIDIA CUDA: YES (ver 11.1, CUFFT CUBLAS FAST_MATH)
NVIDIA GPU arch: 86
NVIDIA PTX archs:
--
cuDNN: YES (ver 8.0.5)
--
Python 3:
Interpreter: /home/xx/anaconda3/envs/py37/bin/python3.7m (ver 3.7.9)
Libraries: /home/xx/anaconda3/envs/py37/lib/libpython3.7m.so (ver 3.7.9)
numpy: /home/xx/anaconda3/envs/py37/lib/python3.7/site-packages/numpy/core/include (ver 1.19.2)
install path: /home/xx/anaconda3/envs/py37/lib/python3.7/site-packages/cv2/python-3.7
--
Python (for build): /home/xx/anaconda3/envs/py37/bin/python3.7m
--
Java:
ant: NO
JNI: NO
Java wrappers: NO
Java tests: NO
--
Install to: /usr/local

Note: the system user name has been replaced with xx.

Compiling with gcc

Run the following command in the build directory:

make -j16

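Note: a plain make is less likely to run into problems, but to speed up compilation this command uses 16 parallel jobs. The job count should generally not exceed the number of CPU cores, which you can check with the nproc command. [2]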
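
The rule of thumb in the note can be sketched in a few lines. This is only an illustration: the helper name make_command is hypothetical, and os.cpu_count() is used as a stand-in for nproc (the two can differ when CPU affinity is restricted).

```python
import os


def make_command(max_jobs=None):
    """Build a 'make -jN' command with N capped at the number of CPU cores."""
    cores = os.cpu_count() or 1  # cpu_count() may return None
    jobs = cores if max_jobs is None else max(1, min(max_jobs, cores))
    return ["make", f"-j{jobs}"]


if __name__ == "__main__":
    # Ask for 16 jobs, but never more than the machine has cores.
    print(" ".join(make_command(16)))
```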

Installing to the system

Run the following command in the build directory:

sudo make install

Configuring the environment

Run the following commands:

sudo /bin/bash -c 'echo "/usr/local/lib" >> /etc/ld.so.conf.d/opencv.conf'
sudo ldconfig
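
As a quick, hedged check that the dynamic linker now sees the installed libraries, the standard library's ctypes.util.find_library can be asked for opencv_core. The helper name find_opencv_core is hypothetical, and a None result only means the lookup failed, not necessarily that the install is broken.

```python
import ctypes.util


def find_opencv_core():
    """Return the library name resolved for opencv_core, or None if not found."""
    return ctypes.util.find_library("opencv_core")


if __name__ == "__main__":
    found = find_opencv_core()
    if found:
        print(found)
    else:
        print("opencv_core not found; check /etc/ld.so.conf.d/opencv.conf and rerun ldconfig")
```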

To install OpenCV into a Python environment, I recommend specifying the install directory with -D OPENCV_PYTHON3_INSTALL_PATH in the cmake arguments:

-D OPENCV_PYTHON3_INSTALL_PATH=/home/xx/anaconda3/envs/py37/lib/python3.7/site-packages \

Checking that OpenCV installed successfully

pkg-config --modversion opencv4
pkg-config --libs opencv4
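
The same check can be scripted. The sketch below assumes the package is registered under the name opencv4, as in the --libs command above, and returns None instead of raising when pkg-config or the package is missing. The function name opencv_pkgconfig_version is hypothetical.

```python
import shutil
import subprocess


def opencv_pkgconfig_version(package="opencv4"):
    """Return the version pkg-config reports for the package, or None."""
    if shutil.which("pkg-config") is None:
        return None  # pkg-config itself is not installed
    result = subprocess.run(
        ["pkg-config", "--modversion", package],
        capture_output=True,
        text=True,
    )
    return result.stdout.strip() if result.returncode == 0 else None


if __name__ == "__main__":
    print(opencv_pkgconfig_version() or "opencv4 not known to pkg-config")
```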

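Problems encountered during the build

Building OpenCV has two main stages, cmake and make. Most errors occur in the cmake stage, and they fall into the following categories.

Download failures

During cmake, files that need to be downloaded are stored in the hidden .cache directory under opencv-4.4.0. If a download fails, you can download the file manually and put it in the corresponding directory.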

xfeatures2d[3]

boostdesc_bgm.i
boostdesc_bgm_bi.i
boostdesc_bgm_hd.i
boostdesc_lbgm.i
boostdesc_binboost_064.i
boostdesc_binboost_128.i
boostdesc_binboost_256.i
vgg_generated_120.i
vgg_generated_64.i
vgg_generated_80.i
vgg_generated_48.i

Change into the hidden .cache/xfeatures2d directory under opencv-4.4.0 and run the following commands:

cd boostdesc
curl https://raw.githubusercontent.com/opencv/opencv_3rdparty/34e4206aef44d50e6bbcd0ab06354b52e7466d26/boostdesc_lbgm.i > 0ae0675534aa318d9668f2a179c2a052-boostdesc_lbgm.i
curl https://raw.githubusercontent.com/opencv/opencv_3rdparty/34e4206aef44d50e6bbcd0ab06354b52e7466d26/boostdesc_binboost_256.i > e6dcfa9f647779eb1ce446a8d759b6ea-boostdesc_binboost_256.i
curl https://raw.githubusercontent.com/opencv/opencv_3rdparty/34e4206aef44d50e6bbcd0ab06354b52e7466d26/boostdesc_binboost_128.i > 98ea99d399965c03d555cef3ea502a0b-boostdesc_binboost_128.i
curl https://raw.githubusercontent.com/opencv/opencv_3rdparty/34e4206aef44d50e6bbcd0ab06354b52e7466d26/boostdesc_binboost_064.i > 202e1b3e9fec871b04da31f7f016679f-boostdesc_binboost_064.i
curl https://raw.githubusercontent.com/opencv/opencv_3rdparty/34e4206aef44d50e6bbcd0ab06354b52e7466d26/boostdesc_bgm_hd.i > 324426a24fa56ad9c5b8e3e0b3e5303e-boostdesc_bgm_hd.i
curl https://raw.githubusercontent.com/opencv/opencv_3rdparty/34e4206aef44d50e6bbcd0ab06354b52e7466d26/boostdesc_bgm_bi.i > 232c966b13651bd0e46a1497b0852191-boostdesc_bgm_bi.i
curl https://raw.githubusercontent.com/opencv/opencv_3rdparty/34e4206aef44d50e6bbcd0ab06354b52e7466d26/boostdesc_bgm.i > 0ea90e7a8f3f7876d450e4149c97c74f-boostdesc_bgm.i
cd ../vgg
curl https://raw.githubusercontent.com/opencv/opencv_3rdparty/fccf7cd6a4b12079f73bbfb21745f9babcd4eb1d/vgg_generated_120.i > 151805e03568c9f490a5e3a872777b75-vgg_generated_120.i
curl https://raw.githubusercontent.com/opencv/opencv_3rdparty/fccf7cd6a4b12079f73bbfb21745f9babcd4eb1d/vgg_generated_64.i > 7126a5d9a8884ebca5aea5d63d677225-vgg_generated_64.i
curl https://raw.githubusercontent.com/opencv/opencv_3rdparty/fccf7cd6a4b12079f73bbfb21745f9babcd4eb1d/vgg_generated_48.i > e8d0dcd54d1bcfdc29203d011a797179-vgg_generated_48.i
curl https://raw.githubusercontent.com/opencv/opencv_3rdparty/fccf7cd6a4b12079f73bbfb21745f9babcd4eb1d/vgg_generated_80.i > 7cd47228edec52b6d82f46511af325c5-vgg_generated_80.i

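Note: if curl cannot download the files, download them one by one in a browser and name each file as shown above.

ippicv

Download it manually from GitHub and put it in the .cache/ippicv directory.

face_landmark_model.dat

Download it manually from GitHub and put it in the .cache/data directory. Note that the file name must be prefixed with the file's MD5 hash, which you can compute with the command md5sum file.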
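
The naming convention can be sketched as follows. This assumes the cache name is simply the MD5 hex digest, a hyphen, and the original file name, which is the pattern visible in the curl commands above. The helper names md5_of and cache_name are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cache_name(path):
    """Return '<md5>-<original name>' for a manually downloaded file."""
    path = Path(path)
    return f"{md5_of(path)}-{path.name}"
```

Calling cache_name on a downloaded face_landmark_model.dat gives the name to use when moving the file into .cache/data.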

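Missing software (Not Found)

These errors are straightforward: install whatever is missing. For example, here is how I resolved the missing packages I ran into during installation.

tesserocr fails to install [4]
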
sudo apt-get install libleptonica-dev libtesseract-dev
python -m pip install tesserocr

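lapacke.h missing [5]

The package was clearly installed, yet the OpenBLAS detection still could not find this file. A file search showed that it was in /usr/include/, so the only fix was to manually copy the /usr/include/lapacke* files into the /usr/include/openblas/ directory.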
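
The manual copy can be expressed as a small function. This is a sketch under the paths named above; the function name copy_lapacke_headers is hypothetical, and writing into /usr/include/openblas needs root privileges, so the defaults are only meaningful when run with sudo.

```python
import glob
import os
import shutil


def copy_lapacke_headers(src_dir="/usr/include", dst_dir="/usr/include/openblas"):
    """Copy every lapacke* file from src_dir into dst_dir; return copied names."""
    os.makedirs(dst_dir, exist_ok=True)
    copied = []
    for header in sorted(glob.glob(os.path.join(src_dir, "lapacke*"))):
        if os.path.isfile(header):  # skip directories that match the pattern
            shutil.copy2(header, dst_dir)
            copied.append(os.path.basename(header))
    return copied
```

Passing temporary directories for src_dir and dst_dir is a safe way to try it out before touching system paths.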

Could NOT find CUDNN: Found unsuitable version “…”, but required is at least “7.5” (found /usr/local/cuda-10.2/lib64/libcudnn.so)

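This error appeared when building OpenCV 4.2.0. cuDNN was in fact installed and met the 7.5 minimum, but cmake simply could not detect it. Adding -D CUDNN_VERSION='8.0' to the cmake arguments gets past it [1], but other problems show up later. When building OpenCV 4.4.0, the problem no longer occurs.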
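
To see which cuDNN version is actually installed before choosing a value for CUDNN_VERSION, the version macros in the cuDNN headers can be parsed. The sketch below makes two assumptions that are not from the original article: that cuDNN 8 keeps CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL in cudnn_version.h, and that the header lives under /usr/local/cuda/include. The function name parse_cudnn_version is hypothetical.

```python
import re
from pathlib import Path


def parse_cudnn_version(header_text):
    """Extract 'major.minor.patch' from the CUDNN_* macros in a header's text."""
    parts = []
    for macro in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(rf"#define\s+{macro}\s+(\d+)", header_text)
        if match is None:
            return None  # macro missing: probably the wrong header
        parts.append(match.group(1))
    return ".".join(parts)


if __name__ == "__main__":
    header = Path("/usr/local/cuda/include/cudnn_version.h")  # assumed location
    if header.exists():
        print(parse_cudnn_version(header.read_text(errors="ignore")))
    else:
        print("header not found at the assumed location")
```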

No package 'gtk+-3.0' found

sudo apt-get install libgtk-3-dev

Uninstalling OpenCV

If OpenCV was installed from source, change into the build directory (build) and run the following command to uninstall it:

sudo make uninstall

A simple example using the cuda module

To see what the OpenCV cuda module supports, open a Python shell and enter the following:

import cv2
dir(cv2.cuda)
dir(cv2.cuda_GpuMat())

Example: resizing on the GPU

import cv2

# Read the image
frame = cv2.imread('test.jpg')

# Upload it to the GPU for processing
gpu_frame = cv2.cuda_GpuMat()
gpu_frame.upload(frame)
print(gpu_frame.cudaPtr())

# Resize on the GPU, then download the result back to host memory
gpu_resframe = cv2.cuda.resize(gpu_frame, (1024, 512))
cpu_resframe = gpu_resframe.download()
print(cpu_resframe.shape)
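
Before running GPU code like the example above, it can help to confirm that this OpenCV build actually sees a CUDA device. A minimal sketch, assuming cv2.cuda.getCudaEnabledDeviceCount() is available in the installed build; the wrapper name cuda_device_count is hypothetical, and it returns 0 rather than raising if cv2 or the CUDA module is missing.

```python
def cuda_device_count():
    """Return the number of CUDA devices OpenCV can see, or 0 if unavailable."""
    try:
        import cv2
    except ImportError:
        return 0  # OpenCV is not installed in this interpreter
    try:
        return int(cv2.cuda.getCudaEnabledDeviceCount())
    except (AttributeError, cv2.error):
        return 0  # built without CUDA support, or the driver query failed


if __name__ == "__main__":
    count = cuda_device_count()
    print(f"CUDA devices visible to OpenCV: {count}")
```

A result of 0 with a working GPU usually points back to the cmake summary: the NVIDIA CUDA line should read YES.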

  1. Building and installing OpenCV 4.3.0 on Jetson Nano with cuDNN acceleration enabled

  2. Building OpenCV 4.4.0 on Ubuntu 20.04 + GeForce RTX 2080 SUPER + CUDA 11.1 + cuDNN 8.0.4

  3. Missing boostdesc_bgm.i and other files when building OpenCV 3.4.0

  4. error while trying to install tesserocr

  5. Building against OpenBLAS complains about missing lapacke.h