Vulkan

Cross-vendor GPU-compute API; the portable ML fallback.

Vulkan is a low-level, cross-platform graphics and GPU-compute API from the Khronos Group — the open, vendor-neutral counterpart to CUDA and Metal. It’s an open, royalty-free standard rather than a single piece of “open-source software”: the specification and reference tooling (headers, validation layers, conformance tests) are openly published, and driver implementations range from fully open source (AMD/Intel via Mesa’s RADV and ANV) to proprietary (NVIDIA’s). Its draw for ML is exactly that neutrality: a single Vulkan compute backend runs on NVIDIA, AMD, Intel, and many mobile/embedded GPUs, so it’s the portable fallback when you can’t assume an NVIDIA card (and thus CUDA) is present. The trade-off is that it’s lower-level and verbose, and its ML-library ecosystem is far thinner than CUDA’s, so it usually delivers less peak throughput on NVIDIA hardware than CUDA does. In local inference, GGML ships a Vulkan backend, which is what lets llama.cpp accelerate .gguf models on AMD and Intel GPUs that CUDA and Metal can’t touch.