Metal

Apple’s GPU-compute API; powers MLX and Metal-backed ML.

Metal is Apple’s low-level API for graphics and GPU compute, its answer to CUDA and Vulkan, but exclusive to Apple hardware (Mac, iPhone, iPad). For ML it matters because Apple-silicon chips share one pool of memory between CPU and GPU (unified memory), so a model can run on the GPU without first copying its weights across the comparatively slow bus (typically PCIe) that separates CPU memory from a discrete GPU on a typical PC. Compute work runs as kernels written in the Metal Shading Language; Apple ships tuned ones in MPS (Metal Performance Shaders) and the higher-level MPSGraph. PyTorch’s mps device builds on these, while MLX uses its own Metal kernels. In local inference, GGML ships a Metal backend, which is how llama.cpp and Ollama speed up .gguf models on a Mac instead of falling back to the CPU.
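
As a rough illustration of how a script might opt into the Metal path through PyTorch, the sketch below picks the mps device when it is available and falls back to the CPU otherwise. The helper name pick_device is hypothetical. The only PyTorch call it relies on is torch.backends.mps.is_available(), and it assumes nothing about whether PyTorch is installed.

```python
def pick_device() -> str:
    """Return "mps" if PyTorch reports a usable Metal backend, else "cpu".

    pick_device is an illustrative helper, not part of PyTorch.
    """
    try:
        import torch  # optional dependency; absent on many machines
    except ImportError:
        return "cpu"

    # Older PyTorch builds have no torch.backends.mps module at all.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


if __name__ == "__main__":
    # On an Apple-silicon Mac with a recent PyTorch this is expected to be
    # "mps"; elsewhere it falls back to "cpu".
    print(pick_device())
```

The returned string can be passed wherever PyTorch accepts a device, for example model.to(pick_device()).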