Environment
- Windows 10
- Python 3.6.8 [MSC v.1916 64 bit (AMD64)]
- PyTorch 1.10.0+cu113
- CUDA version 11.1
Compiling with mingw32
Compiling to object files succeeds, but the link step fails. A typical error is:
D:\ProgramData\mingw64\bin\g++.exe -shared -s build\temp.win-amd64-3.6\Release\mylinear.o build\temp.win-amd64-3.6\Release\mylinear_cpp.cp36-win_amd64.def -LD:\Python36WindowsTensorflow\Python36\lib\site-packages\torch\lib -LD:\Python36WindowsTensorflow\Python36\libs -LD:\Python36WindowsTensorflow\Python36\PCbuild\amd64 -lc10 -ltorch -ltorch_cpu -ltorch_python -lpython36 -lmsvcr120 -o build\lib.win-amd64-3.6\mylinear_cpp.cp36-win_amd64.pyd
build\temp.win-amd64-3.6\Release\mylinear.o:mylinear.cpp:(.text+0x160): undefined reference to `__imp__ZN2at4_ops13transpose_int4callERKNS_6TensorExx'
build\temp.win-amd64-3.6\Release\mylinear.o:mylinear.cpp:(.text+0x171): undefined reference to `__imp__ZN2at4_ops2mm4callERKNS_6TensorES4_'
build\temp.win-amd64-3.6\Release\mylinear.o:mylinear.cpp:(.text+0x18c): undefined reference to `__imp__ZN3c1019UndefinedTensorImpl10_singletonE'
... ...
build\temp.win-amd64-3.6\Release\mylinear.o:mylinear.cpp:(.text$_ZZN8pybind1112cpp_function10initializeIRPFSt6vectorIN2at6TensorESaIS4_EES4_S4_S4_ES6_JS4_S4_S4_EJNS_4nameENS_5scopeENS_7siblingEA18_cEEEvOT_PFT0_DpT1_EDpRKT2_ENUlRNS_6detail13function_callEE1_4_FUNESR_[_ZZN8pybind1112cpp_function10initializeIRPFSt6vectorIN2at6TensorESaIS4_EES4_S4_S4_ES6_JS4_S4_S4_EJNS_4nameENS_5scopeENS_7siblingEA18_cEEEvOT_PFT0_DpT1_EDpRKT2_ENUlRNS_6detail13function_callEE1_4_FUNESR_]+0x186): undefined reference to `__imp_THPVariableClass'
build\temp.win-amd64-3.6\Release\mylinear.o:mylinear.cpp:(.text$_ZZN8pybind1112cpp_function10initializeIRPFSt6vectorIN2at6TensorESaIS4_EES4_S4_S4_ES6_JS4_S4_S4_EJNS_4nameENS_5scopeENS_7siblingEA18_cEEEvOT_PFT0_DpT1_EDpRKT2_ENUlRNS_6detail13function_callEE1_4_FUNESR_[_ZZN8pybind1112cpp_function10initializeIRPFSt6vectorIN2at6TensorESaIS4_EES4_S4_S4_ES6_JS4_S4_S4_EJNS_4nameENS_5scopeENS_7siblingEA18_cEEEvOT_PFT0_DpT1_EDpRKT2_ENUlRNS_6detail13function_callEE1_4_FUNESR_]+0x228): undefined reference to `__imp__ZN3c106detail23torchInternalAssertFailEPKcS2_jS2_S2_'
build\temp.win-amd64-3.6\Release\mylinear.o:mylinear.cpp:(.text$_ZZN8pybind1112cpp_function10initializeIRPFSt6vectorIN2at6TensorESaIS4_EES4_S4_S4_ES6_JS4_S4_S4_EJNS_4nameENS_5scopeENS_7siblingEA18_cEEEvOT_PFT0_DpT1_EDpRKT2_ENUlRNS_6detail13function_callEE1_4_FUNESR_[_ZZN8pybind1112cpp_function10initializeIRPFSt6vectorIN2at6TensorESaIS4_EES4_S4_S4_ES6_JS4_S4_S4_EJNS_4nameENS_5scopeENS_7siblingEA18_cEEEvOT_PFT0_DpT1_EDpRKT2_ENUlRNS_6detail13function_callEE1_4_FUNESR_]+0x274): undefined reference to `__imp__ZN3c1019UndefinedTensorImpl10_singletonE'
collect2.exe: error: ld returned 1 exit status
error: command 'D:\\ProgramData\\mingw64\\bin\\g++.exe' failed with exit status 1
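The likely cause: Python and PyTorch both ship as .lib libraries and .dll dynamic libraries built with MSVC. g++ and MSVC encode (mangle) C++ function names differently, so when g++ does the linking it cannot find the symbols it is looking for. The official documentation mentions this as well: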
A small note on compilers: Due to ABI versioning issues, the compiler you use to build your C++ extension must be ABI-compatible with the compiler PyTorch was built with. In practice, this means that you must use GCC version 4.9 and above on Linux. For Ubuntu 16.04 and other more-recent Linux distributions, this should be the default compiler already. On MacOS, you must use clang (which does not have any ABI versioning issues). In the worst case, you can build PyTorch from source with your compiler and then build the extension with that same compiler.
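To make the mismatch concrete, here is a minimal sketch that guesses the mangling scheme of a linker symbol from its prefix. The function name `mangling_scheme` and the MSVC-style sample `?foo@@YAHH@Z` are hypothetical, chosen for illustration; the other two symbols are copied from the error log above. It is a heuristic based only on prefixes, not a demangler.

```python
def mangling_scheme(symbol: str) -> str:
    """Guess the C++ name-mangling scheme of a linker symbol (prefix heuristic)."""
    # References to DLL-imported symbols carry an "__imp_" prefix; drop it first.
    if symbol.startswith("__imp_"):
        symbol = symbol[len("__imp_"):]
    if symbol.startswith("_Z"):
        return "itanium"    # scheme used by GCC, Clang and MinGW g++
    if symbol.startswith("?"):
        return "msvc"       # MSVC decorated name
    return "unmangled"      # no C++ mangling visible in the name


if __name__ == "__main__":
    samples = [
        "__imp__ZN2at4_ops2mm4callERKNS_6TensorES4_",  # from the g++ error log
        "__imp_THPVariableClass",                      # from the g++ error log
        "?foo@@YAHH@Z",                                # hypothetical MSVC-style name
    ]
    for sym in samples:
        print(sym, "->", mangling_scheme(sym))
```

Every C++ symbol g++ asks for in the log begins with `_Z` after the `__imp_` prefix, which is the GCC-style encoding; libraries built by MSVC export differently decorated names, which is consistent with the undefined references.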
Compiling with MSVC
In the Python installation directory, edit D:\Python36WindowsTensorflow\Python36\Lib\distutils\distutils.cfg as follows (create the file if it does not exist):
[build]
#compiler=mingw32
compiler=msvc
[build_ext]
#compiler=mingw32
compiler=msvc
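If you would rather generate the file than edit it by hand, the following sketch builds the same two sections with the standard-library `configparser`. The helper name `render_distutils_cfg` is an assumption for illustration; the sketch only returns the text and leaves writing it to disk to you.

```python
import configparser
import io


def render_distutils_cfg(compiler: str = "msvc") -> str:
    """Return distutils.cfg text that selects `compiler` for build and build_ext."""
    config = configparser.ConfigParser()
    for section in ("build", "build_ext"):
        config[section] = {"compiler": compiler}
    buffer = io.StringIO()
    # Write "compiler=msvc" without spaces around "=", matching the file above.
    config.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


if __name__ == "__main__":
    print(render_distutils_cfg())
```

Saving the returned text as distutils.cfg in the Lib\distutils directory mentioned above should have the same effect as the manual edit; switching back to MinGW is a matter of passing `"mingw32"` instead.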
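You may also need to upgrade MSVC. The VS2015 build tool cl.exe has poor support for C++14 constant expressions (constexpr) and produces many errors. Upgrading to the VS2017 toolset resolves them completely.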
With the VS2017 toolset, the link step succeeds:
D:\Program Files\Microsoft Visual Studio\2017\BuildTools\VC\Tools\MSVC\14.16.27023\bin\HostX86\x64\link.exe /nologo /INCREMENTAL:NO /LTCG /DLL /MANIFEST:EMBED,ID=2 /MANIFESTUAC:NO /LIBPATH:D:\Python36WindowsTensorflow\Python36\lib\site-packages\torch\lib /LIBPATH:D:\Python36WindowsTensorflow\Python36\libs /LIBPATH:D:\Python36WindowsTensorflow\Python36\PCbuild\amd64 "/LIBPATH:D:\Program Files\Microsoft Visual Studio\2017\BuildTools\VC\Tools\MSVC\14.16.27023\lib\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.10240.0\ucrt\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\8.1\lib\winv6.3\um\x64" c10.lib torch.lib torch_cpu.lib torch_python.lib /EXPORT:PyInit_mylinear_cpp build\temp.win-amd64-3.6\Release\mylinear.obj /OUT:build\lib.win-amd64-3.6\mylinear_cpp.cp36-win_amd64.pyd /IMPLIB:build\temp.win-amd64-3.6\Release\mylinear_cpp.cp36-win_amd64.lib
Creating library build\temp.win-amd64-3.6\Release\mylinear_cpp.cp36-win_amd64.lib and object build\temp.win-amd64-3.6\Release\mylinear_cpp.cp36-win_amd64.exp
Generating code
Finished generating code
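As a quick sanity check on which MSVC generation is involved, the sketch below maps the `MSC v.NNNN` tag that CPython reports (as in the environment list, `MSC v.1916`) to a Visual Studio release. The helper name `msvc_release` is hypothetical, and the check describes the compiler that built the interpreter, not necessarily the cl.exe that distutils will pick up.

```python
import platform
import re
from typing import Optional


def msvc_release(compiler: str) -> Optional[str]:
    """Map a compiler string such as 'MSC v.1916 64 bit (AMD64)' to a VS release."""
    match = re.search(r"MSC v\.(\d+)", compiler)
    if match is None:
        return None  # not built with MSVC (e.g. GCC or Clang)
    version = int(match.group(1))
    if version >= 1930:
        return "Visual Studio 2022 or later"
    if version >= 1920:
        return "Visual Studio 2019"
    if version >= 1910:
        return "Visual Studio 2017"
    if version >= 1900:
        return "Visual Studio 2015"
    return "older than Visual Studio 2015"


if __name__ == "__main__":
    # platform.python_compiler() returns the text shown in brackets in sys.version.
    print(msvc_release(platform.python_compiler()))
```

For the interpreter in this post, `MSC v.1916` falls in the Visual Studio 2017 range, the same generation as the 14.16 toolset in the link command above.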