5.9 KiB

Raw Blame History

Installation

Install with pip

Binary Python wheels are published on PyPI and can be directly installed with pip:

pip install ctranslate2

The Python wheels have the following requirements:

OS: Linux (x86-64, AArch64), macOS (x86-64, ARM64), Windows (x86-64)
Python version: >= 3.7
pip version: >= 19.3 to support manylinux2014 wheels

The Linux and Windows Python wheels support GPU execution. Install [CUDA](https://developer.nvidia.com/cuda-toolkit) 11.2 or above to use the GPU.

If you plan to run models with convolutional layers (e.g. for speech recognition), you should also install [cuDNN 8](https://developer.nvidia.com/cudnn).

On Windows [the Visual C++ runtime](https://www.microsoft.com/en-US/download/details.aspx?id=48145) is required. It is installed in most of the systems, but if is not the case, you have to download it and install it.

Install with Docker

Docker images can be downloaded from the GitHub Container registry:

docker pull ghcr.io/opennmt/ctranslate2:latest-ubuntu20.04-cuda11.2

The images include:

the NVIDIA libraries cuBLAS and cuDNN to support GPU execution
the C++ library installed in /opt/ctranslate2
the Python module installed in the Python system packages
the translator executable, which is the image entrypoint:

docker run --rm ghcr.io/opennmt/ctranslate2:latest-ubuntu20.04-cuda11.2 --help

The Docker image supports GPU execution. Install the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/overview.html) to use GPUs from Docker.

Install from sources

Download the source code

Clone the CTranslate2 Git repository and its submodules.

git clone --recursive https://github.com/OpenNMT/CTranslate2.git

Compile the C++ library

Compiling the library requires a compiler supporting C++17 and CMake 3.15 or greater.

mkdir build && cd build
cmake ..
make -j4
make install

By default, the library is compiled with the Intel MKL backend which should be installed separately. See the {ref}installation:build options to select or add another backend.

Compile the Python wrapper

Once the C++ library is installed, you can compile the Python wrapper which uses pybind11. This step requires the Python development libraries to be installed on the system.

cd python
pip install -r install_requirements.txt
python setup.py bdist_wheel
pip install dist/*.whl

If you installed the C++ library in a custom directory, you should configure additional environment variables:

* When running `setup.py`, set `CTRANSLATE2_ROOT` to the CTranslate2 install directory.
* When running your Python application, add the CTranslate2 library path to `LD_LIBRARY_PATH`.

Build options

The following options can be set with -DOPTION=VALUE during the CMake configuration:

CMake option	Values (default in bold)	Description
BUILD_CLI	OFF, ON	Compiles the command line clients
BUILD_TESTS	OFF, ON	Compiles the tests
CMAKE_CXX_FLAGS	compiler flags	Defines additional compiler flags
CMAKE_INSTALL_PREFIX	path	Defines the installation path of the library
CUDA_ARCH_LIST	Auto	List of CUDA architectures to compile for (see `cuda_select_nvcc_arch_flags` in the CMake documentation)
CUDA_DYNAMIC_LOADING	OFF, ON	Enables the dynamic loading of CUDA libraries at runtime instead of linking against them (requires CUDA >= 11)
CUDA_NVCC_FLAGS	compiler flags	Defines additional compilation flags for `nvcc`
ENABLE_CPU_DISPATCH	OFF, ON	Compiles CPU kernels for multiple ISA and dispatches at runtime (should be disabled when explicitly targeting an architecture with the `-march` compilation flag)
ENABLE_PROFILING	OFF, ON	Enables the integrated profiler (usually disabled in production builds)
OPENMP_RUNTIME	INTEL, COMP, NONE	Selects the OpenMP runtime: INTEL: Intel OpenMP COMP: OpenMP runtime provided by the compiler NONE: no OpenMP runtime (a custom threading implementation will be used)
WITH_CUDA	OFF, ON	Compiles with the CUDA backend
WITH_CUDNN	OFF, ON	Compiles with the cuDNN backend
WITH_DNNL	OFF, ON	Compiles with the oneDNN backend (a.k.a. DNNL)
WITH_MKL	OFF, ON	Compiles with the Intel MKL backend
WITH_ACCELERATE	OFF, ON	Compiles with the Apple Accelerate backend
WITH_OPENBLAS	OFF, ON	Compiles with the OpenBLAS backend
WITH_RUY	OFF, ON	Compiles with the Ruy backend

Some build options require additional dependencies. See their respective documentation for installation instructions.

-DWITH_CUDA=ON requires CUDA >= 11.0
-DWITH_CUDNN=ON requires cuDNN >= 8
-DWITH_MKL=ON requires Intel MKL >= 2019.5
-DWITH_DNNL=ON requires oneDNN >= 3.0
-DWITH_ACCELERATE=ON requires Accelerate
-DWITH_OPENBLAS=ON requires OpenBLAS

Multiple backends can be enabled for a single build, for example:

-DWITH_MKL=ON -DWITH_CUDA=ON: enable CPU and GPU support
-DWITH_MKL=ON -DWITH_DNNL=ON: during runtime, the library will select Intel MKL when running on Intel and oneDNN when running on AMD
-DWITH_OPENBLAS=ON -DWITH_RUY=ON: use Ruy for quantized models and OpenBLAS for non quantized models

5.9 KiB Raw Blame History