
# Benchmark tools

This directory contains some scripts to benchmark translation systems.

## Requirements

* Python 3
* Docker

Install the Python dependencies:

```bash
python3 -m pip install -r requirements.txt
```

## Usage

```bash
python3 benchmark.py <IMAGE> <SOURCE> <REFERENCE>
```

The Docker image must contain 3 executable files at its root:

* `/tokenize $input $output`
* `/detokenize $input $output`
* `/translate $device $input $output`, where:
  * `$device` is either `CPU` or `GPU`
  * `$input` is the path to the tokenized input file
  * `$output` is the path where the tokenized output should be written
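To illustrate the interface above, here is a minimal sketch of what a `/translate` entry point could look like. The `translate` function and its identity "translation" are placeholders, not part of the benchmark tooling; a real image would invoke its translation engine where the comment indicates.

```python
#!/usr/bin/env python3
# Hypothetical /translate executable matching the expected interface:
#   /translate <device> <input> <output>
import sys


def translate(device, input_path, output_path):
    """Read a tokenized input file and write a tokenized output file."""
    if device not in ("CPU", "GPU"):
        raise ValueError("device must be 'CPU' or 'GPU'")
    with open(input_path) as src, open(output_path, "w") as dst:
        for line in src:
            # Placeholder: a real system would translate each tokenized line.
            dst.write(line)


if __name__ == "__main__":
    translate(sys.argv[1], sys.argv[2], sys.argv[3])
```

The `/tokenize` and `/detokenize` executables follow the same two-argument file-in, file-out convention.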

The benchmark script reports multiple metrics. The results can be aggregated over multiple runs with the `--num_samples N` option. Run `python3 benchmark.py -h` for additional options.

Note: the script focuses on raw decoding performance, so the following steps are not included in the translation time:

* tokenization
* detokenization
* model initialization (obtained by translating an empty file)
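The initialization step above can be excluded with a simple subtraction: translating an empty file exercises model loading alone, and that duration is removed from the full run. This is a sketch of the idea under that assumption, not `benchmark.py`'s exact code; `decoding_time` and its parameters are hypothetical names.

```python
import time


def timed(fn, *args):
    """Return the wall-clock time, in seconds, taken by fn(*args)."""
    start = time.perf_counter()
    fn(*args)
    return time.perf_counter() - start


def decoding_time(translate, device, input_path, output_path, empty_path, scratch_path):
    """Estimate raw decoding time by subtracting model initialization time."""
    # Translating an empty file measures model loading with no decoding work.
    init_time = timed(translate, device, empty_path, scratch_path)
    total_time = timed(translate, device, input_path, output_path)
    return total_time - init_time
```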

## Reproducing the benchmark numbers from the README

We use the script `benchmark_all.py` to produce the benchmark numbers in the main README. The script builds all Docker images defined in the subdirectories and reports the results as a Markdown table. A full run can take up to 3 hours.

```bash
# Run the CPU benchmark:
python3 benchmark_all.py cpu

# Run the GPU benchmark:
python3 benchmark_all.py gpu
```