To find the compute capability of your GPU card, please visit this page.
Tabby supports replicating models on multiple GPUs to increase throughput. You can specify the devices for model replication by using the --device-indices option.
Follow the instructions provided in the Model Spec.
Please note that the spec is unstable and does not adhere to semantic versioning (semver).