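- Posted: 2020-07-29T22:20:11+09:00
Benchmarking TensorFlow v2.3.0 against PlaidML on a Mac mini (execution log)
This page records the commands that were run and the logs they produced.
For the discussion of the results, please see the main article.
Summary of measurements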
- mnist_mlp.py (customized)
| framework | CPU load | elapsed time |
| --- | --- | --- |
| TensorFlow v2.3.0 | 89 % | 16.170 sec |
| PlaidML + Keras | 42 % | 23.334 sec |
- mnist_cnn.py (customized)
| framework | CPU load | elapsed time |
| --- | --- | --- |
| TensorFlow v2.3.0 | 92 % | 188.279 sec |
| PlaidML + Keras | 37 % | 316.005 sec |
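To put the two elapsed-time comparisons on one scale, the summary figures above can be turned into a ratio. This is only a small sketch: the dictionary layout and the `slowdown` helper are illustrative names, not something taken from the measured scripts.

```python
# Elapsed wall-clock times in seconds, copied from the summary figures above.
elapsed = {
    "mnist_mlp.py": {"TensorFlow v2.3.0": 16.170, "PlaidML + Keras": 23.334},
    "mnist_cnn.py": {"TensorFlow v2.3.0": 188.279, "PlaidML + Keras": 316.005},
}


def slowdown(times, baseline="TensorFlow v2.3.0", other="PlaidML + Keras"):
    """Return how many times longer `other` took than `baseline`."""
    return times[other] / times[baseline]


ratios = {script: slowdown(times) for script, times in elapsed.items()}
for script, ratio in ratios.items():
    print(f"{script}: PlaidML + Keras took {ratio:.2f}x the TensorFlow time")
```

On these numbers PlaidML + Keras needed roughly 1.4 times (MLP) and 1.7 times (CNN) the wall-clock time of TensorFlow on the CPU.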
measure : MLP
- using keras/examples/mnist_mlp.py (customized)
TensorFlow v2.3.0 (CPU)
- time : 16.170s
$ time python3 mnist_mlp.py
60000 train samples
10000 valid samples
2020-07-29 13:26:05.609231: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f9143ee1640 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-07-29 13:26:05.609262: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense (Dense)                (None, 512)               401920
_________________________________________________________________
dropout (Dropout)            (None, 512)               0
_________________________________________________________________
dense_1 (Dense)              (None, 512)               262656
_________________________________________________________________
dropout_1 (Dropout)          (None, 512)               0
_________________________________________________________________
dense_2 (Dense)              (None, 10)                5130
=================================================================
Total params: 669,706
Trainable params: 669,706
Non-trainable params: 0
_________________________________________________________________
469/469 [==============================] - 11s 23ms/step - loss: 0.2473 - accuracy: 0.9233 - val_loss: 0.1034 - val_accuracy: 0.9680
Valid loss: 0.10344783961772919
Valid acc.: 0.9679999947547913
real 0m16.170s
user 0m31.537s
sys 0m4.165s
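The `real`, `user` and `sys` fields at the end of the log can be converted to seconds for comparison. The sketch below is not part of the original measurement; `parse_time_field` is a hypothetical helper that handles only the `XmY.YYYs` form shown in the log.

```python
import re


def parse_time_field(value):
    """Convert a `time` field such as '3m8.279s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value)
    if match is None:
        raise ValueError(f"unrecognized time field: {value!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# Values copied from the TensorFlow MLP run above.
real = parse_time_field("0m16.170s")
user = parse_time_field("0m31.537s")
sys_time = parse_time_field("0m4.165s")

# CPU seconds consumed per wall-clock second; a value above 1 means that
# more than one core was busy on average.
cpu_per_wall = (user + sys_time) / real
print(f"real={real:.3f}s cpu/wall={cpu_per_wall:.2f}")
```

For this run the ratio is about 2.2, which suggests that TensorFlow kept more than two cores busy on average.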
- iostat : CPU load : 89 %
$ iostat 5
        disk0           cpu     load average
    KB/t  tps   MB/s  us sy id    1m   5m  15m
    4.00    0   0.00  19  8 73  4.74 3.39 2.58
   24.89    9   0.22  56 12 33  4.68 3.40 2.59
    0.00    0   0.00  75 14 11  4.79 3.44 2.61
    5.33    1   0.00  61 11 28  4.89 3.48 2.63
   21.97   47   1.00  13  8 79  4.73 3.47 2.63
PlaidML v0.6.4 (GPU) and Keras v2.2.4
- PLAIDML_DEVICE_IDS : opencl_amd_ati_radeon_hd_6630m.0
- time : 23.334s
$ time python3 mnist_mlp.py
60000 train samples
10000 valid samples
INFO:plaidml:Opening device "opencl_amd_ati_radeon_hd_6630m.0"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense_1 (Dense)              (None, 512)               401920
_________________________________________________________________
dropout_1 (Dropout)          (None, 512)               0
_________________________________________________________________
dense_2 (Dense)              (None, 512)               262656
_________________________________________________________________
dropout_2 (Dropout)          (None, 512)               0
_________________________________________________________________
dense_3 (Dense)              (None, 10)                5130
=================================================================
Total params: 669,706
Trainable params: 669,706
Non-trainable params: 0
_________________________________________________________________
Train on 60000 samples, validate on 10000 samples
Epoch 1/1
60000/60000 [==============================] - 18s 306us/step - loss: 0.2518 - acc: 0.9220 - val_loss: 0.0986 - val_acc: 0.9714
Valid loss: 0.09862979149818421
Valid acc.: 0.9714
real 0m23.334s
user 0m17.709s
sys 0m6.655s
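The log shows PlaidML opening the OpenCL device, but the post does not show how the customized script selects PlaidML as the Keras backend. One common way, sketched here as an assumption rather than the author's method, is to set environment variables before Keras is imported. `select_plaidml_backend` is a hypothetical helper name.

```python
import os


def select_plaidml_backend(device_id=None):
    """Point standalone Keras at PlaidML via environment variables.

    Call this before `import keras`, because Keras chooses its backend
    when it is imported.
    """
    os.environ["KERAS_BACKEND"] = "plaidml.keras.backend"
    if device_id is not None:
        # plaidml-setup mentions this variable as a way to override
        # the saved default device.
        os.environ["PLAIDML_DEVICE_IDS"] = device_id


# Device ID copied from the log above.
select_plaidml_backend("opencl_amd_ati_radeon_hd_6630m.0")
# import keras  # import only after the variables have been set
```

The import is left commented out so that the sketch runs even where Keras and PlaidML are not installed.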
- iostat : CPU load : 42 %
$ iostat 5
        disk0           cpu     load average
    KB/t  tps   MB/s  us sy id    1m   5m  15m
   45.59    8   0.37   5  6 90  1.70 2.14 2.31
   25.30    9   0.21  18 10 72  1.89 2.17 2.32
   29.01   16   0.45  29 10 62  1.97 2.19 2.32
    4.00    0   0.00  27 12 61  1.98 2.18 2.32
    0.00    0   0.00  27 12 61  2.06 2.19 2.33
    4.00    0   0.00  29 12 59  1.97 2.17 2.32
   14.74   48   0.69  29 13 58  1.97 2.17 2.32
    0.00    0   0.00  13  5 82  2.06 2.19 2.32
    4.00    0   0.00  12  5 83  1.97 2.17 2.31
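The post does not say how the single "CPU load" figure was derived from the iostat samples. One reading that reproduces the 42 % reported for this run is the peak of us + sy. The sketch below rests on that assumption and on the single-disk column layout shown in the log; `cpu_busy_samples` is an illustrative name.

```python
def cpu_busy_samples(iostat_text):
    """Yield us + sy (percent) for each data row of `iostat` output.

    Assumes the single-disk layout used in this post:
    KB/t tps MB/s us sy id 1m 5m 15m (nine columns per data row).
    """
    for line in iostat_text.splitlines():
        fields = line.split()
        if len(fields) != 9:
            continue  # skips the "disk0 cpu load average" header line
        try:
            values = [float(field) for field in fields]
        except ValueError:
            continue  # skips the column-name header line
        yield values[3] + values[4]


# Samples copied from the PlaidML MLP run above.
sample = """\
disk0 cpu load average
KB/t tps MB/s us sy id 1m 5m 15m
45.59 8 0.37 5 6 90 1.70 2.14 2.31
25.30 9 0.21 18 10 72 1.89 2.17 2.32
29.01 16 0.45 29 10 62 1.97 2.19 2.32
4.00 0 0.00 27 12 61 1.98 2.18 2.32
0.00 0 0.00 27 12 61 2.06 2.19 2.33
4.00 0 0.00 29 12 59 1.97 2.17 2.32
14.74 48 0.69 29 13 58 1.97 2.17 2.32
0.00 0 0.00 13 5 82 2.06 2.19 2.32
4.00 0 0.00 12 5 83 1.97 2.17 2.31
"""

busy = list(cpu_busy_samples(sample))
peak = max(busy)
print(f"{len(busy)} samples, peak us+sy = {peak:.0f} %")
```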
measure : CNN
- using keras/examples/mnist_cnn.py (customized)
TensorFlow v2.3.0 (CPU)
- time : 188.279s
(tf2) $ time python3 mnist_cnn.py
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 valid samples
2020-07-29 16:23:59.387600: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fce05190fc0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-07-29 16:23:59.387652: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              (None, 26, 26, 32)        320
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 24, 24, 64)        18496
_________________________________________________________________
average_pooling2d (AveragePo (None, 12, 12, 64)        0
_________________________________________________________________
dropout (Dropout)            (None, 12, 12, 64)        0
_________________________________________________________________
flatten (Flatten)            (None, 9216)              0
_________________________________________________________________
dense (Dense)                (None, 128)               1179776
_________________________________________________________________
dropout_1 (Dropout)          (None, 128)               0
_________________________________________________________________
dense_1 (Dense)              (None, 10)                1290
=================================================================
Total params: 1,199,882
Trainable params: 1,199,882
Non-trainable params: 0
_________________________________________________________________
469/469 [==============================] - 153s 327ms/step - loss: 2.2950 - accuracy: 0.1257 - val_loss: 2.2737 - val_accuracy: 0.2647
Valid loss: 2.273723840713501
Valid acc.: 0.2646999955177307
real 3m8.279s
user 8m30.347s
sys 0m29.370s
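The parameter counts in the model summary can be checked by hand, which confirms that both frameworks trained the same network. The sketch below assumes 3x3 convolution kernels without padding; that is inferred from the 28 -> 26 -> 24 output shapes, not stated in the post. The helper names are illustrative.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters


def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return inputs * units + units


# Assumption: 3x3 kernels, inferred from the output shapes in the summary.
layers = {
    "conv2d": conv2d_params(3, 3, 1, 32),
    "conv2d_1": conv2d_params(3, 3, 32, 64),
    "dense": dense_params(12 * 12 * 64, 128),  # flattened 12x12x64 = 9216
    "dense_1": dense_params(128, 10),
}
total = sum(layers.values())

for name, count in layers.items():
    print(f"{name}: {count}")
print(f"total: {total}")
```

Pooling, dropout and flatten layers have no trainable parameters, so they are left out of the sum.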
- iostat : CPU load : 92 %
$ iostat 5
        disk0           cpu     load average
    KB/t  tps   MB/s  us sy id    1m   5m  15m
   56.38  179   9.87   6  6 87  2.18 2.48 2.45
   15.38  217   3.26   7  6 87  2.32 2.50 2.46
   30.12  211   6.22   9  8 83  2.30 2.49 2.46
   64.89   56   3.53  42  8 50  2.83 2.60 2.50
    6.00    0   0.00  83  8  8  3.09 2.66 2.52
   12.80    5   0.06  83  8  8  3.40 2.73 2.54
    8.00    1   0.00  83  8  8  3.53 2.77 2.56
   21.78   13   0.27  84  8  8  3.73 2.82 2.58
    0.00    0   0.00  83  8  8  3.83 2.86 2.59
    4.00    0   0.00  84  8  8  3.92 2.89 2.60
    0.00    0   0.00  84  8  8  4.01 2.93 2.62
   13.80    6   0.08  84  8  8  4.25 2.99 2.64
   20.96   14   0.29  84  8  8  4.31 3.03 2.66
   18.07   23   0.41  84  8  8  4.52 3.09 2.68
    0.00    0   0.00  84  8  8  6.00 3.42 2.80
   38.34    8   0.31  84  8  8  5.84 3.43 2.81
    0.00    0   0.00  84  8  8  6.01 3.51 2.84
    0.00    0   0.00  84  8  7  6.17 3.58 2.87
   16.00    0   0.00  84  8  7  6.40 3.67 2.90
   18.50   17   0.30  84  8  8  6.45 3.73 2.93
        disk0           cpu     load average
    KB/t  tps   MB/s  us sy id    1m   5m  15m
    5.33    1   0.00  84  8  8  6.33 3.75 2.94
    4.00    0   0.00  84  8  8  6.14 3.75 2.95
    6.40    2   0.01  84  8  8  6.21 3.81 2.97
    0.00    0   0.00  84  8  8  6.11 3.83 2.98
    0.00    0   0.00  84  8  8  6.26 3.89 3.01
   21.03   14   0.30  84  8  8  6.32 3.95 3.03
    0.00    0   0.00  84  8  8  6.38 4.00 3.06
   35.60    8   0.28  84  8  8  6.19 4.00 3.06
    0.00    0   0.00  84  8  8  6.17 4.03 3.08
   80.00    0   0.02  84  9  8  6.40 4.11 3.11
    0.00    0   0.00  84  8  8  6.04 4.08 3.11
   18.29   14   0.25  84  8  8  6.12 4.12 3.13
    8.33    2   0.02  81  8 12  5.87 4.11 3.13
    4.00    1   0.00  85  7  8  5.80 4.12 3.14
   17.28   31   0.53  81  8 11  5.90 4.17 3.16
   37.58   40   1.47  71  8 21  5.66 4.15 3.16
    9.14    1   0.01   4  6 90  5.21 4.08 3.14
   20.53   17   0.34   3  6 91  4.87 4.03 3.13
PlaidML v0.6.4 (GPU) and Keras v2.2.4
- time : 316.005s
(tf2) $ time python3 mnist_cnn.py
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 valid samples
INFO:plaidml:Opening device "opencl_amd_ati_radeon_hd_6630m.0"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_1 (Conv2D)            (None, 26, 26, 32)        320
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 24, 24, 64)        18496
_________________________________________________________________
average_pooling2d_1 (Average (None, 12, 12, 64)        0
_________________________________________________________________
dropout_1 (Dropout)          (None, 12, 12, 64)        0
_________________________________________________________________
flatten_1 (Flatten)          (None, 9216)              0
_________________________________________________________________
dense_1 (Dense)              (None, 128)               1179776
_________________________________________________________________
dropout_2 (Dropout)          (None, 128)               0
_________________________________________________________________
dense_2 (Dense)              (None, 10)                1290
=================================================================
Total params: 1,199,882
Trainable params: 1,199,882
Non-trainable params: 0
_________________________________________________________________
Train on 60000 samples, validate on 10000 samples
Epoch 1/1
60000/60000 [==============================] - 298s 5ms/step - loss: 0.3072 - acc: 0.9063 - val_loss: 0.0794 - val_acc: 0.9753
Valid loss: 0.07935381038188934
Valid acc.: 0.9753
real 5m16.005s
user 4m50.810s
sys 0m11.139s
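The two progress bars use different units for a "step": TensorFlow 2.3 counts batches (469/469), while Keras 2.2.4 counts samples (60000/60000). The printed per-step times (327ms vs 5ms) are therefore not directly comparable. The sketch below normalizes both runs to time per sample. The batch size of 128 is an assumption that would explain the 469 steps; it is not stated in the post.

```python
import math

TRAIN_SAMPLES = 60000


def per_sample_ms(epoch_seconds, samples=TRAIN_SAMPLES):
    """Average training time per sample in milliseconds."""
    return epoch_seconds * 1000.0 / samples


# Epoch durations as printed by the two progress bars for mnist_cnn.py.
tf_ms = per_sample_ms(153)       # TensorFlow: 469 batches at 327ms/step
plaidml_ms = per_sample_ms(298)  # PlaidML + Keras: 60000 samples at 5ms/step

# Assumption: a batch size of 128 gives ceil(60000 / 128) steps per epoch.
steps_per_epoch = math.ceil(TRAIN_SAMPLES / 128)

print(f"TensorFlow: {tf_ms:.2f} ms/sample")
print(f"PlaidML + Keras: {plaidml_ms:.2f} ms/sample")
print(f"steps per epoch at batch size 128: {steps_per_epoch}")
```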
- iostat : CPU load : 37 %
$ iostat 5
        disk0           cpu     load average
    KB/t  tps   MB/s  us sy id    1m   5m  15m
   15.44   27   0.41   6  7 87  2.13 2.60 2.40
    4.00    0   0.00   4  6 90  2.12 2.59 2.39
    4.00    0   0.00  20  7 73  2.27 2.61 2.40
    0.00    0   0.00  27  4 69  2.33 2.62 2.41
    4.00    0   0.00  27  4 69  2.62 2.67 2.43
    0.00    0   0.00  28  4 68  2.65 2.68 2.43
   20.46   25   0.50  26  4 70  2.60 2.67 2.43
    0.00    0   0.00  27  4 69  2.79 2.71 2.44
   23.30   11   0.26  27  4 69  3.05 2.76 2.46
   32.07   12   0.37  28  4 68  2.96 2.75 2.46
   36.24   10   0.36  29  6 65  3.13 2.79 2.47
    6.00    1   0.00  27  4 69  3.04 2.77 2.47
   16.55   36   0.58  27  4 68  3.03 2.78 2.47
    4.00    0   0.00  28  5 68  2.95 2.76 2.47
    4.00    0   0.00  28  5 67  2.95 2.77 2.47
   31.86   17   0.53  27  4 69  2.88 2.75 2.47
   14.05    8   0.12  29  6 64  2.89 2.76 2.47
   20.72    8   0.16  27  4 69  2.90 2.76 2.48
   22.78   29   0.65  28  4 68  2.82 2.75 2.47
   26.55    2   0.06  27  3 69  2.76 2.74 2.47
        disk0           cpu     load average
    KB/t  tps   MB/s  us sy id    1m   5m  15m
   27.79   12   0.31  27  4 70  2.70 2.72 2.47
   22.35    3   0.07  28  4 69  2.64 2.71 2.46
   36.31   10   0.36  30  7 62  2.67 2.72 2.47
   34.19   13   0.43  30  4 66  2.86 2.75 2.48
   21.58   31   0.66  31  6 64  2.87 2.76 2.49
   18.50    2   0.03  28  5 67  2.96 2.78 2.49
    4.00    0   0.00  26  3 70  2.88 2.76 2.49
   21.76    7   0.14  27  4 69  2.89 2.77 2.49
   27.33    1   0.03  27  4 69  2.82 2.75 2.49
   35.08    5   0.18  28  4 68  2.75 2.74 2.49
   16.68   28   0.45  28  5 67  2.77 2.75 2.49
    4.00    0   0.00  27  4 69  2.87 2.77 2.50
    9.33    1   0.01  27  3 70  2.80 2.75 2.50
    0.00    0   0.00  26  4 70  2.74 2.74 2.49
   41.42   12   0.48  28  4 68  2.84 2.76 2.50
   35.20    1   0.03  28  5 68  3.09 2.81 2.52
   19.91   12   0.24  27  4 69  3.16 2.83 2.53
    0.00    0   0.00  26  3 71  3.07 2.82 2.53
    9.27   23   0.20  27  4 70  2.98 2.81 2.52
    0.00    0   0.00  26  3 71  2.98 2.81 2.53
        disk0           cpu     load average
    KB/t  tps   MB/s  us sy id    1m   5m  15m
    4.00    0   0.00  27  4 70  2.99 2.81 2.53
    0.00    0   0.00  26  3 71  2.99 2.82 2.53
   28.24    8   0.23  26  4 70  2.99 2.82 2.53
    0.00    0   0.00  26  3 70  2.91 2.80 2.53
    4.00    0   0.00  27  4 69  2.84 2.79 2.53
   34.93    8   0.28  27  4 69  2.77 2.78 2.52
    0.00    0   0.00  26  3 71  2.63 2.75 2.51
    0.00    0   0.00  26  3 71  2.58 2.74 2.51
   22.71   10   0.23  26  4 70  2.53 2.72 2.51
    0.00    0   0.00  26  3 71  2.49 2.71 2.50
    4.00    1   0.00  26  4 70  2.45 2.70 2.50
    4.00    1   0.00  27  4 70  2.41 2.69 2.50
    4.00    1   0.00  26  4 70  2.46 2.69 2.50
   27.88   21   0.57  26  4 70  2.50 2.70 2.50
   28.75    8   0.22  26  4 70  2.54 2.70 2.51
    4.00    0   0.00  26  4 70  2.58 2.71 2.51
    4.00    1   0.00  26  3 71  2.61 2.71 2.51
    4.00    1   0.00  26  3 71  2.56 2.70 2.51
    4.00    0   0.00  26  3 71  2.60 2.70 2.51
    4.00    0   0.00  28  4 68  2.55 2.69 2.51
        disk0           cpu     load average
    KB/t  tps   MB/s  us sy id    1m   5m  15m
   37.43    1   0.05  26  5 68  2.50 2.68 2.50
    4.00    1   0.00  26  4 70  2.78 2.73 2.52
    4.00    1   0.00  26  5 68  2.72 2.72 2.52
   33.07    9   0.29  24  7 69  2.90 2.76 2.53
   14.00    2   0.02  26  6 68  2.99 2.78 2.54
    7.47    3   0.02  16  5 79  2.83 2.75 2.53
   17.47   17   0.29   2  6 92  2.68 2.72 2.52
    0.00    0   0.00   2  6 92  2.47 2.68 2.51
setup log
(tf2) $ plaidml-setup
PlaidML Setup (0.6.4)
Thanks for using PlaidML!
Some Notes:
  * Bugs and other issues: https://github.com/plaidml/plaidml
  * Questions: https://stackoverflow.com/questions/tagged/plaidml
  * Say hello: https://groups.google.com/forum/#!forum/plaidml-dev
  * PlaidML is licensed under the Apache License 2.0
Default Config Devices:
   No devices.
Experimental Config Devices:
   llvm_cpu.0 : CPU (LLVM)
   opencl_amd_ati_radeon_hd_6630m.0 : AMD ATI Radeon HD 6630M (OpenCL)
   opencl_cpu.0 : Intel CPU (OpenCL)
Using experimental devices can cause poor performance, crashes, and other nastiness.
Enable experimental device support? (y,n)[n]:y
Multiple devices detected (You can override by setting PLAIDML_DEVICE_IDS).
Please choose a default device:
   1 : llvm_cpu.0
   2 : opencl_amd_ati_radeon_hd_6630m.0
   3 : opencl_cpu.0
Default device? (1,2,3)[1]:2
Selected device:
    opencl_amd_ati_radeon_hd_6630m.0
Almost done. Multiplying some matrices...
Tile code:
  function (B[X,Z], C[Z,Y]) -> (A) { A[x,y : X,Y] = +(B[x,z] * C[z,y]); }
Whew. That worked.
Save settings to /Users/nobi/.plaidml? (y,n)[y]:
Success!
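The Tile function printed during setup is a matrix multiplication: each output element A[x,y] is the sum over z of B[x,z] * C[z,y]. For readers unfamiliar with the Tile notation, the same contraction can be written in plain Python. This is only an illustration of what the expression computes, not how PlaidML executes it.

```python
def matmul(B, C):
    """Compute A[x][y] = sum over z of B[x][z] * C[z][y] for nested lists."""
    X, Z = len(B), len(B[0])
    Y = len(C[0])
    if len(C) != Z:
        raise ValueError("inner dimensions of B and C must match")
    return [
        [sum(B[x][z] * C[z][y] for z in range(Z)) for y in range(Y)]
        for x in range(X)
    ]


A = matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(A)  # [[19, 22], [43, 50]]
```

The index z appears only on the right-hand side of the Tile expression, which is why it is summed over, and the `+( ... )` wrapper selects summation as the aggregation.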
error log
AttributeError: module 'tensorflow' has no attribute 'get_default_graph'
- Cause: an incompatibility between Keras 2.2.4 and TensorFlow 2.x. Keras 2.2.4 calls tf.get_default_graph, which TensorFlow 2.x only provides as tf.compat.v1.get_default_graph.
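The reason behind this error is that TensorFlow 2.x no longer exposes `get_default_graph` at the top level of the module and keeps it only under `tf.compat.v1`, while standalone Keras 2.2.4 still expects the top-level name. The sketch below encodes that rule as a version check. The helper name is hypothetical and it only inspects the major version number.

```python
def has_top_level_get_default_graph(tf_version):
    """Return True if `tf.get_default_graph` exists for this TensorFlow version.

    TensorFlow 2.x only provides it as `tf.compat.v1.get_default_graph`.
    """
    major = int(tf_version.split(".")[0])
    return major < 2


for version in ("1.15.0", "2.3.0"):
    available = has_top_level_get_default_graph(version)
    status = "available" if available else "missing"
    print(f"TensorFlow {version}: tf.get_default_graph is {status}")
```

With TensorFlow 2.x, the usual way around the error is to use the Keras API bundled with TensorFlow (`tensorflow.keras`) instead of standalone Keras 2.2.4.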
EOF