77 lines
3.7 KiB
Markdown
77 lines
3.7 KiB
Markdown
# Model conversion
|
|
|
|
The core CTranslate2 implementation is framework agnostic. The logic that is specific to each framework is moved to a conversion step that loads supported models into a unified representation. The weights are then optionally quantized and saved into an optimized binary format.
|
|
|
|
## Supported frameworks
|
|
|
|
The Python module includes a [conversion API](python/ctranslate2.converters.rst) and conversion scripts for multiple frameworks:
|
|
|
|
* [Fairseq](guides/fairseq.md)
|
|
* [Marian](guides/marian.md)
|
|
* [OpenNMT-py](guides/opennmt_py.md)
|
|
* [OpenNMT-tf](guides/opennmt_tf.md)
|
|
* [OPUS-MT](guides/opus_mt.md)
|
|
* [Transformers](guides/transformers.md)
|
|
|
|
## Model structure
|
|
|
|
The conversion produces a model directory containing a binary model file, a JSON configuration, and one or more vocabulary files:
|
|
|
|
```text
|
|
config.json
|
|
model.bin
|
|
source_vocabulary.txt
|
|
target_vocabulary.txt
|
|
```
|
|
|
|
```{tip}
|
|
The Python API exposes the function [`ctranslate2.contains_model`](python/ctranslate2.contains_model.rst) to check if a directory is a CTranslate2 model.
|
|
```
|
|
|
|
## Quantization and reduced precision
|
|
|
|
The converters support reducing the weights precision to save on space and possibly accelerate the model execution. See the [Quantization](quantization.md) documentation.
|
|
|
|
## Backward compatibility
|
|
|
|
New versions of CTranslate2 are backward compatible with models that were previously converted. This compatibility is rarely broken, even for major versions.
|
|
|
|
```{attention}
|
|
Forward compatibility is not guaranteed, however. The CTranslate2 version loading the model should not be older than the version that converted the model.
|
|
```
|
|
|
|
## Portability
|
|
|
|
Converted models are portable in the sense they can be loaded on another machine using a different operating system or CPU architecture.
|
|
|
|
The only restriction is the 2 machines must use the same [endianness](https://en.wikipedia.org/wiki/Endianness).
|
|
|
|
## Add a new converter
|
|
|
|
You can write your own converter as long as the model architecture is supported by CTranslate2. The converter should populate a model specification with trained weights.
|
|
|
|
```{tip}
|
|
See the [existing converters](https://github.com/OpenNMT/CTranslate2/tree/master/python/ctranslate2/converters) which could be used as templates.
|
|
```
|
|
|
|
### Model specification
|
|
|
|
A model specification defines the structures and names of the model weights. Converters should fill out this specification with weights coming from a trained model.
|
|
|
|
In the Python code, a model specification is represented as nested [`LayerSpec`](python/ctranslate2.specs.LayerSpec.rst) objects, where intermediate objects define weights scopes and leaf objects define the weights name and value. This is similar to how you would define a model in PyTorch (using `nn.Module`) or TensorFlow (using `tf.Module`).
|
|
|
|
The final structure defines the full name of each weight that the C++ code should read when building the model. For example, a weight that can be accessed with `root.encoder.embeddings.weight` (where `root` is the top-level `LayerSpec` object) will have for name `encoder/embeddings/weight` in the serialized model.
|
|
|
|
Changes in this structure are tracked by a revision number (see next section).
|
|
|
|
### Model serialization
|
|
|
|
The model serialization is defined in the Python file [`model_spec.py`](https://github.com/OpenNMT/CTranslate2/blob/master/python/ctranslate2/specs/model_spec.py). It is a simple binary serialization that is easy and fast to load from C++.
|
|
|
|
Converted models have 2 levels of versioning to manage backward compatibility:
|
|
|
|
1. Binary version: the structure of the binary file
|
|
2. Model specification revision: the variable names expected by each model.
|
|
|
|
For example, adding a new field in the binary file will increment (1), but changing a variable name will increment (2).
|