- Posted: 2020-09-06T02:50:07+09:00
TF 2.1: cuDNN error at runtime even though the GPU is recognized / Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
Environment

- Windows 10
- Python 3.7.4
- TensorFlow 2.1.0
- CUDA 10.2
- cuDNN 7.6.5

Introduction
The GPU is recognized:

```python
from tensorflow.python.client import device_lib

device_lib.list_local_devices()
```

```
# Output
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 12939604985444578121
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 4990763008
locality {
bus_id: 1
links {
}
}
incarnation: 15893135237303968832
physical_device_desc: "device: 0, name: GeForce GTX 1660 SUPER, pci bus id: 0000:09:00.0, compute capability: 7.5"
]
```

Fully connected (dense) layers ran fine, but an error occurred when running code that includes Conv2D:

```
~~~snip~~~
2020-09-06 02:09:49.391503: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-09-06 02:09:50.835593: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
2020-09-06 02:09:50.836159: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
~~~snip~~~
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above. [Op:Conv2D]
```

This appears to be a memory problem: limiting how much GPU memory TensorFlow uses reportedly fixes it.
Attempt 1

Code to limit GPU memory usage:

```python
import tensorflow as tf

tf.config.gpu.set_per_process_memory_fraction(0.75)
tf.config.gpu.set_per_process_memory_growth(True)
```

→ Result:

```
AttributeError: module 'tensorflow_core._api.v2.config' has no attribute 'gpu'
```

The attribute doesn't exist, presumably because this API is from a different TensorFlow version.
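As a side note not from the original post: TensorFlow 2.x also reads the `TF_FORCE_GPU_ALLOW_GROWTH` environment variable at startup, which enables the same on-demand allocation behavior without editing any code. A minimal sketch (the script name in the comment is a placeholder):

```shell
# Tell TensorFlow 2.x to allocate GPU memory on demand instead of
# grabbing almost all of it up front, which can starve the cuDNN handle.
export TF_FORCE_GPU_ALLOW_GROWTH=true

# Then launch the training script as usual, e.g.:
# python your_training_script.py
```

This is convenient for checking whether memory growth solves the problem before touching the source.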
Attempt 2

There was another method, so I tried that as well.

Add the following code at the top of the script:

```python
import tensorflow as tf

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        # Currently, memory growth needs to be the same across GPUs
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
        logical_gpus = tf.config.experimental.list_logical_devices('GPU')
        print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
    except RuntimeError as e:
        # Memory growth must be set before GPUs have been initialized
        print(e)
```

→ It worked!!
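Another option the post doesn't cover (my own sketch, not the author's method): instead of allow-growth, TF 2.1 can put a hard cap on how much GPU memory the process takes by configuring a virtual device. The function name and the 2048 MB default below are illustrative; the snippet guards against TensorFlow being absent or no GPU being visible, so it degrades gracefully on a CPU-only machine:

```python
import importlib.util


def limit_gpu_memory(limit_mb=2048):
    """Cap the first GPU's memory via a virtual device.

    Returns True if a cap was applied, False otherwise
    (TensorFlow missing, no GPU, or GPUs already initialized).
    """
    if importlib.util.find_spec("tensorflow") is None:
        return False  # TensorFlow is not installed
    import tensorflow as tf
    gpus = tf.config.experimental.list_physical_devices('GPU')
    if not gpus:
        return False  # CPU-only machine
    try:
        # Must run before any op has touched the GPU
        tf.config.experimental.set_virtual_device_configuration(
            gpus[0],
            [tf.config.experimental.VirtualDeviceConfiguration(
                memory_limit=limit_mb)])
        return True
    except RuntimeError:
        return False  # GPUs were already initialized

print("memory cap applied:", limit_gpu_memory())
```

A hard cap is useful when several processes share one GPU, whereas allow-growth only delays allocation within a single process.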
References