Metal

Apple’s GPU-compute API; powers MLX and Metal-backed ML.

Metal is Apple’s low-level graphics and GPU-compute API — its answer to CUDA and Vulkan, but exclusive to Apple hardware (Mac, iPhone, iPad). For ML it matters because Apple-silicon chips share memory between CPU and GPU (unified memory), so Metal lets models use the GPU without copying weights across a PCIe bus. The compute side is exposed through MPS (Metal Performance Shaders) and the higher-level MPS Graph; PyTorch’s mps device and MLX both sit on top of it. In local inference, GGML ships a Metal backend, which is how llama.cpp and Ollama accelerate .gguf models on a Mac instead of falling back to the CPU.