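- Posted: 2020-09-11T16:10:01+09:00
Implementing a cosine similarity matrix [PyTorch, TensorFlow]
In contrastive learning methods such as SimCLR, cosine similarity is one of the measures used to express similarity in feature space.
I implemented it in both TensorFlow and PyTorch, so I am recording the code here as a note (for reference).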
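PyTorch
import torch

# Inputs a and b have shape (batch size, number of dimensions)
def cosine_matrix(a, b):
    # Pairwise dot products, shape (batch_a, batch_b)
    dot = torch.matmul(a, torch.t(b))
    # Outer product of the row norms; both norms are taken over dim=1 (the feature axis)
    norm = torch.matmul(torch.norm(a, dim=1).unsqueeze(-1), torch.norm(b, dim=1).unsqueeze(0))
    return dot / norm
TensorFlow
import tensorflow as tf

def cosine_matrix(a, b):
    # L2-normalize each row, then take pairwise dot products
    a_normed, _ = tf.linalg.normalize(a, axis=-1)
    b_normed, _ = tf.linalg.normalize(b, axis=-1)
    matrix = tf.matmul(a_normed, b_normed, transpose_b=True)
    return matrix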
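To sanity-check either implementation on small inputs, a plain-Python reference can be handy. The sketch below is only an illustration: the name `cosine_matrix_reference` and the sample values are my own assumptions and are not part of the implementations above. For inputs with N and M rows it should give an N x M matrix whose (i, j) entry is the cosine similarity between row i of `a` and row j of `b`.

```python
import math


def cosine_matrix_reference(a, b):
    """Pure-Python reference: a has N rows, b has M rows, each of length D.

    Returns an N x M nested list of cosine similarities.
    """
    def norm(v):
        return math.sqrt(sum(x * x for x in v))

    return [
        [
            sum(x * y for x, y in zip(row_a, row_b)) / (norm(row_a) * norm(row_b))
            for row_b in b
        ]
        for row_a in a
    ]


# Illustrative inputs (assumed values): 3 vectors against 2 vectors, 2 dimensions each
a = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
b = [[1.0, 0.0], [0.0, 3.0]]
sim = cosine_matrix_reference(a, b)
for row in sim:
    print(row)
```

Using different batch sizes for `a` and `b`, as here, makes shape mistakes in the norm term easy to spot when comparing against the tensor versions.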
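- Posted: 2020-09-11T08:02:10+09:00
Building TensorFlow 2.2.0 from source
0. Intro
As listed in the hardware section of this article, my machine learning box is an old machine with a first-generation Core i7. Because of AVX2 and FMA instruction support, using TensorFlow beyond a certain version on it requires building from source.
TensorFlow builds are very sensitive to the environment setup, and this time too the build failed once. Since I did manage to build TensorFlow 2.2, this entry documents the procedure starting right after OS installation.
1. Environment setup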
・Install the development tools
sudo apt-get install build-essential libssl-dev libbz2-dev libreadline-dev libsqlite3-dev zip unzip nkf
・Disable the Nouveau driver
On machines with an NVIDIA graphics card, a driver called nouveau is used by default.
How to check:
lsmod | grep -i nouveau
It may conflict with the NVIDIA driver, so disable it.
Create /etc/modprobe.d/blacklist-nouveau.conf and put the following settings in it.
blacklist nouveau
options nouveau modeset=0
After adding the kernel module to the blacklist, regenerate the initramfs.
sudo update-initramfs -u
Reboot
sudo reboot
・Install CUDA Toolkit 10.1 update 2 (from the archive)
$ wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
$ sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
$ wget http://developer.download.nvidia.com/compute/cuda/10.1/Prod/local_installers/cuda-repo-ubuntu1804-10-1-local-10.1.243-418.87.00_1.0-1_amd64.deb
$ sudo dpkg -i cuda-repo-ubuntu1804-10-1-local-10.1.243-418.87.00_1.0-1_amd64.deb
$ sudo apt-key add /var/cuda-repo-10-1-local-10.1.243-418.87.00/7fa2af80.pub
$ sudo apt-get update
$ sudo apt-get -y install cuda
・Install cuDNN
$ tar -xzvf cudnn-10.1-linux-x64-v7.6.5.32.tgz
Copy the following files into the CUDA Toolkit directory, and change the file permissions.
$ sudo cp cuda/include/cudnn.h /usr/local/cuda/include
$ sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
$ sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
$ sudo dpkg -i libcudnn7_7.6.5.32-1+cuda10.1_amd64.deb
Install the developer library, for example:
$ sudo dpkg -i libcudnn7-dev_7.6.5.32-1+cuda10.1_amd64.deb
Install the code samples and the cuDNN Library User Guide, for example:
$ sudo dpkg -i libcudnn7-doc_7.6.5.32-1+cuda10.1_amd64.deb
・Install libcupti-dev with apt-get
sudo apt-get install libcupti-dev
・Set the paths
export PATH=/usr/local/cuda-10.1/bin:${PATH}
export LD_LIBRARY_PATH=/usr/local/cuda-10.1/lib64:${LD_LIBRARY_PATH}
Reboot once here.
After that, run the samples bundled with CUDA and cuDNN to verify the installation.
How to run the mnist sample included with cuDNN:
$ ar vx libcudnn7-doc_7.6.5.32-1+cuda10.1_amd64.deb
$ tar Jxvf data.tar.xz
$ cd ~/cudnn_samples_v7/mnistCUDNN
Run make there and execute the sample.
・Install Anaconda
bash Anaconda3-2020.07-Linux-x86_64.sh
・Create a Python 3.7 environment and install the required modules
conda create -n ml_env python=3.7
conda install six
conda install mock
conda install scikit-learn
conda install -c conda-forge keras-applications
conda install -c conda-forge keras-preprocessing
pip install numpy==1.18.0
The key point is the last step, which replaces the latest numpy with 1.18.0. With the latest version, 1.19.1, the TensorFlow build failed near the very end with:
ERROR: /home/aptx4869/github/tensorflow/tensorflow/python/tools/BUILD:281:1 C++ compilation of rule '//tensorflow/python:bfloat16_lib' failed (Exit 1)
INFO: Elapsed time: 20.016s, Critical Path: 7.56s
INFO: 0 processes.
FAILED: Build did NOT complete successfully
https://github.com/tensorflow/tensorflow/issues/40688
Digging around starting from this issue, numpy appears to be the cause. The problem seems to be in 1.19.0, and although 1.19.1 reportedly contains a bug fix, it did not work for me. So I rebuilt with 1.18.0, for which successful builds have been reported. Since this is a conda environment, I expected
conda install numpy=1.18.0
to do the replacement, but the version pin did not work, so I replaced it with pip instead.
・Install Bazel
configure.py in the tensorflow-2.2 source contains
_TF_BAZELRC_FILENAME = '.tf_configure.bazelrc'
_TF_WORKSPACE_ROOT = ''
_TF_BAZELRC = ''
_TF_CURRENT_BAZEL_VERSION = None
_TF_MIN_BAZEL_VERSION = '2.0.0'
_TF_MAX_BAZEL_VERSION = '2.0.0'
so download and install Bazel 2.0.0.
bash bazel-2.0.0-installer-linux-x86_64.sh
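Before starting a multi-hour build, it may be worth confirming that the installed Bazel falls inside the range the source tree expects. The following is a minimal sketch of such a range check, not the code that configure.py itself uses; `parse_version` and `bazel_version_ok` are hypothetical helper names, and the default bounds are simply the `_TF_MIN_BAZEL_VERSION` and `_TF_MAX_BAZEL_VERSION` values quoted above.

```python
def parse_version(version):
    """Turn a dotted version such as '2.0.0' into (2, 0, 0) for numeric comparison."""
    return tuple(int(part) for part in version.split("."))


def bazel_version_ok(current, min_version="2.0.0", max_version="2.0.0"):
    """Return True when min_version <= current <= max_version (hypothetical helper)."""
    return parse_version(min_version) <= parse_version(current) <= parse_version(max_version)


# The version strings below are example inputs, e.g. taken from `bazel version`
for candidate in ["1.2.1", "2.0.0", "3.1.0"]:
    print(candidate, bazel_version_ok(candidate))
```

Comparing tuples of integers rather than raw strings avoids mistakes such as treating "10.0.0" as smaller than "2.0.0".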
・Extract the tensorflow-2.2 source
Download the tensorflow-2.2.0 source from GitHub and extract it:
$ unzip tensorflow-2.2.0.zip
・Compile tensorflow-2.2
(ml_env) XXXX@XXXX:~/tensorflow-2.2.0$ ~/tensorflow-2.2.0/configure
configures the build. To the prompt
Do you wish to build TensorFlow with CUDA support? [y/N]:
I answered y, and to
Please specify a list of comma-separated CUDA compute capabilities you want to build with. You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus. Please note that each additional compute capability significantly increases your build time and binary size, and that TensorFlow only supports compute capabilities >= 3.5 [Default is: 6.1]:
I answered 6.1.
For every other prompt I just pressed Enter.
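(ml_env) XXXX@XXXX:~/tensorflow-2.2.0$ ~/bin/bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
starts the build.
The first time I ran this, it failed while fetching dependencies, before compilation had even started. Running the same command again, however, got through to compilation without problems. The whole build took roughly 5 hours.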
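Since the first attempt failed during fetching and a plain rerun succeeded, one option is to wrap the build command in a small retry loop. This is only a sketch under my own assumptions: `run_with_retry` is a hypothetical helper, the retry count is arbitrary, and the commented-out invocation assumes `bazel` is on the PATH and that the script runs from the TensorFlow source root.

```python
import subprocess
import sys


def run_with_retry(cmd, max_attempts=2):
    """Run cmd, retrying on a non-zero exit code.

    Returns the attempt number that succeeded; raises RuntimeError if all attempts fail.
    """
    for attempt in range(1, max_attempts + 1):
        result = subprocess.run(cmd)
        if result.returncode == 0:
            return attempt
        print(f"attempt {attempt} failed with exit code {result.returncode}", file=sys.stderr)
    raise RuntimeError(f"command failed after {max_attempts} attempts: {cmd}")


# Hypothetical usage with the build command from this article (left commented out
# because the build takes hours):
# run_with_retry(["bazel", "build", "--config=opt", "--config=cuda",
#                 "//tensorflow/tools/pip_package:build_pip_package"])
```

A retry only helps with transient failures such as a dropped download; a genuine compile error, like the numpy one described earlier, will fail the same way on every attempt.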
・Build the package
~/tensorflow-2.2.0/bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
・Install the package
pip install /tmp/tensorflow_pkg/tensorflow-2.2.0-cp37-cp37m-linux_x86_64.whl
・Verify the installation
```
Python 3.7.9 (default, Aug 31 2020, 12:42:55)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> tf.test.gpu_device_name()
2020-09-10 22:56:01.627824: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 2672785000 Hz
2020-09-10 22:56:01.638981: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55b8bdf41c00 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-09-10 22:56:01.639018: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-09-10 22:56:01.665330: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-09-10 22:56:01.901801: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-10 22:56:01.902813: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55b8bdfae070 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-09-10 22:56:01.902856: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce GTX 1060 6GB, Compute Capability 6.1
2020-09-10 22:56:01.912217: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-10 22:56:01.913353: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:03:00.0 name: GeForce GTX 1060 6GB computeCapability: 6.1
coreClock: 1.7845GHz coreCount: 10 deviceMemorySize: 5.93GiB deviceMemoryBandwidth: 178.99GiB/s
2020-09-10 22:56:01.928570: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-09-10 22:56:02.112599: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-09-10 22:56:02.211270: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-09-10 22:56:02.253332: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-09-10 22:56:02.434494: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-09-10 22:56:02.472776: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-09-10 22:56:02.843482: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-09-10 22:56:02.843662: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-10 22:56:02.844592: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-10 22:56:02.845390: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-09-10 22:56:02.856635: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-09-10 22:56:02.868992: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-10 22:56:02.869015: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108] 0
2020-09-10 22:56:02.869030: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0: N
2020-09-10 22:56:02.887511: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-10 22:56:02.888358: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-10 22:56:02.889182: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/device:GPU:0 with 5637 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060 6GB, pci bus id: 0000:03:00.0, compute capability: 6.1)
'/device:GPU:0'
```
Hardware environment
[CPU]
(ml_env) XXXX@XXXX:~$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 26
model name : Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz
stepping : 5
microcode : 0x1d
cpu MHz : 1603.722
cache size : 8192 KB
physical id : 0
[Mem]
(ml_env) XXXX@XXXX:~$ cat /proc/meminfo
MemTotal: 20543584 kB
MemFree: 19978580 kB
MemAvailable: 20031036 kB
Buffers: 34636 kB
Cached: 277696 kB
[GPU]
(ml_env) XXXX@XXXX:~$ lspci | grep -i nvidia
03:00.0 VGA compatible controller: NVIDIA Corporation GP106 [GeForce GTX 1060 6GB] (rev a1)
03:00.1 Audio device: NVIDIA Corporation GP106 High Definition Audio Controller (rev a1)
Version list
[OS]
(ml_env) XXXX@XXXX:~$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.5 LTS"
[CUDA]
(ml_env) XXXX@XXXX:~$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
[cudnn]
(ml_env) XXXX@XXXX:~$ cat /usr/include/cudnn.h | grep CUDNN_MAJOR -A 2
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 6
#define CUDNN_PATCHLEVEL 5
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
#include "driver_types.h"
[GCC]
(ml_env) XXXX@XXXX:~$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/7/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 7.5.0-3ubuntu1~18.04' --with-bugurl=file:///usr/share/doc/gcc-7/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --prefix=/usr --with-gcc-major-version-only --program-suffix=-7 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
[Anaconda]
・Python
(ml_env) XXXX@XXXX:~$ python3 -V
Python 3.7.9
・Installed packages
(ml_env) XXXX@XXXX:~$ conda list
# packages in environment at /home/XXXX/anaconda3/envs/ml_env:
#
# Name                    Version        Build                    Channel
_libgcc_mutex             0.1            main
absl-py                   0.10.0         pypi_0                   pypi
astunparse                1.6.3          pypi_0                   pypi
ca-certificates           2020.6.20      hecda079_0               conda-forge
cachetools                4.1.1          pypi_0                   pypi
certifi                   2020.6.20      py37hc8dfbb8_0           conda-forge
chardet                   3.0.4          pypi_0                   pypi
gast                      0.3.3          pypi_0                   pypi
google-auth               1.21.1         pypi_0                   pypi
google-auth-oauthlib      0.4.1          pypi_0                   pypi
google-pasta              0.2.0          pypi_0                   pypi
grpcio                    1.32.0         pypi_0                   pypi
h5py                      2.10.0         nompi_py37h90cd8ad_104   conda-forge
hdf5                      1.10.6         nompi_h3c11f04_101       conda-forge
idna                      2.10           pypi_0                   pypi
importlib-metadata        1.7.0          pypi_0                   pypi
keras-applications        1.0.8          py_1                     conda-forge
keras-preprocessing       1.1.0          py_0                     conda-forge
ld_impl_linux-64          2.33.1         h53a641e_7
libblas                   3.8.0          17_openblas              conda-forge
libcblas                  3.8.0          17_openblas              conda-forge
libedit                   3.1.20191231   h14c3975_1
libffi                    3.3            he6710b0_2
libgcc-ng                 9.1.0          hdf63c60_0
libgfortran-ng            7.5.0          hdf63c60_16              conda-forge
liblapack                 3.8.0          17_openblas              conda-forge
libopenblas               0.3.10         pthreads_hb3c22a3_4      conda-forge
libstdcxx-ng              9.1.0          hdf63c60_0
markdown                  3.2.2          pypi_0                   pypi
mock                      4.0.2          py_0
ncurses                   6.2            he6710b0_1
numpy                     1.18.0         pypi_0                   pypi
oauthlib                  3.1.0          pypi_0                   pypi
openssl                   1.1.1g         h516909a_1               conda-forge
opt-einsum                3.3.0          pypi_0                   pypi
pip                       20.2.2         py37_0
protobuf                  3.13.0         pypi_0                   pypi
pyasn1                    0.4.8          pypi_0                   pypi
pyasn1-modules            0.2.8          pypi_0                   pypi
python                    3.7.9          h7579374_0
python_abi                3.7            1_cp37m                  conda-forge
readline                  8.0            h7b6447c_0
requests                  2.24.0         pypi_0                   pypi
requests-oauthlib         1.3.0          pypi_0                   pypi
rsa                       4.6            pypi_0                   pypi
scipy                     1.4.1          pypi_0                   pypi
setuptools                49.6.0         py37_0
six                       1.15.0         py_0
sqlite                    3.33.0         h62c20be_0
tensorboard               2.2.2          pypi_0                   pypi
tensorboard-plugin-wit    1.7.0          pypi_0                   pypi
tensorflow                2.2.0          pypi_0                   pypi
tensorflow-estimator      2.2.0          pypi_0                   pypi
termcolor                 1.1.0          pypi_0                   pypi
tk                        8.6.10         hbc83047_0
urllib3                   1.25.10        pypi_0                   pypi
werkzeug                  1.0.1          pypi_0                   pypi
wheel                     0.35.1         py_0
wrapt                     1.12.1         pypi_0                   pypi
xz                        5.2.5          h7b6447c_0
zipp                      3.1.0          pypi_0                   pypi
zlib                      1.2.11         h7b6447c_3
[Bazel]
(ml_env) XXXX@XXXX:~$ ./bin/bazel version
Build label: 2.0.0
Build target: bazel-out/k8-opt/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Thu Dec 19 12:30:18 2019 (1576758618)
Build timestamp: 1576758618
Build timestamp as int: 1576758618
That's all.