

Linux Graphics Primer

Published: March 29, 2023

Userspace programs use libdrm to talk to the DRM subsystem. libdrm provides a set of wrapper functions that use the ioctl syscall to talk to DRM. $ ls /dev/dri/ prints cardN and renderDN entries, where N is a non-negative integer; say card0 and renderD128. card0 is the file that represents the graphics card. This is the primary node. renderD128 represents a render node, which DRM also provides as a character device. The DRM API requires unprivileged clients to authenticate to a DRM master client. To allow some processes to access GPU resources without that authentication, render nodes were introduced. Render nodes provide fewer controls (no privileged ioctls, no kernel mode setting; more on that below) and are targeted at off-screen rendering and GPGPU. The GPU driver must advertise that it supports render nodes for the renderD file to appear.
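
To make the libdrm/ioctl path concrete, here is a minimal sketch (mine, not from any particular project) that opens a render node and asks which driver sits behind it. The device path is an assumption for a machine where renderD128 exists; drmGetVersion is libdrm's wrapper around the version ioctl. Build with cc probe.c -ldrm.

    /* probe.c: open a render node and print the driver behind it. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>
    #include <xf86drm.h>

    int main(void) {
        /* Render node: no DRM-master authentication required. */
        int fd = open("/dev/dri/renderD128", O_RDWR);
        if (fd < 0) {
            perror("open");
            return 1;
        }
        drmVersionPtr v = drmGetVersion(fd);  /* wraps DRM_IOCTL_VERSION */
        if (v) {
            printf("driver: %s (%d.%d.%d)\n", v->name, v->version_major,
                   v->version_minor, v->version_patchlevel);
            drmFreeVersion(v);
        }
        close(fd);
        return 0;
    }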

DRM = Direct Rendering Manager. DRM arbitrates between multiple programs trying to access GPU hardware (before it existed, one process could hijack the hardware completely; the DRM subsystem was introduced to resolve this). The DRM core API consists of a generic API (which includes GEM and KMS) and a driver-specific API. Vendors implement a DRM driver that registers with the DRM core, implements the core parts of the DRM API, and can also provide its own functions.
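
For a feel of what "registers with the DRM core" means, here is a rough kernel-side sketch. The myvendor names are hypothetical, and the exact struct drm_driver fields vary across kernel versions; treat this as the shape of a driver, not a buildable one.

    /* Rough shape of a vendor driver registering with the DRM core. */
    #include <linux/err.h>
    #include <drm/drm_drv.h>
    #include <drm/drm_gem.h>

    DEFINE_DRM_GEM_FOPS(myvendor_fops);  /* file ops backing /dev/dri nodes */

    static const struct drm_driver myvendor_drm_driver = {
        /* DRIVER_RENDER is what makes a renderD node appear. */
        .driver_features = DRIVER_GEM | DRIVER_MODESET | DRIVER_RENDER,
        .fops = &myvendor_fops,
        .name = "myvendor",
        .desc = "Hypothetical example driver",
        /* .ioctls / .num_ioctls would list driver-specific ioctls here. */
    };

    static int myvendor_probe(struct device *parent)
    {
        struct drm_device *ddev = drm_dev_alloc(&myvendor_drm_driver, parent);
        if (IS_ERR(ddev))
            return PTR_ERR(ddev);
        return drm_dev_register(ddev, 0);  /* nodes appear under /dev/dri */
    }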

The DRM core has two memory managers: TTM (Translation Table Manager) and GEM (Graphics Execution Manager). TTM can support both UMA and NUMA architectures but is more complex than GEM. It provides control over VRAM that the CPU cannot directly read or write. Vendor driver implementations will usually create a struct that wraps/implements TTM (Radeon does this, for example). struct drm_global_reference contains a field enum ttm_global_types global_type;. The important parts of TTM are: buffer objects; the graphics address remapping table (GART), which gives a GPU driver DMA access to host system memory for textures and polygon meshes (note: the GART mechanism was later reused for I/O virtualization with disk controllers, neat!); and scheduling and fences (how a GPU tells userspace that some resource is no longer in use). The scheduler types are struct drm_sched_[rq|fence|entity|job]; and struct drm_gpu_scheduler; glossed below.
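
For reference, here are those scheduler structs with one-line glosses. The declarations live in the kernel's include/drm/gpu_scheduler.h; the comments are my annotations, not upstream documentation.

    struct drm_gpu_scheduler;  /* one scheduler instance, typically per hardware ring */
    struct drm_sched_entity;   /* a client's submission queue, assigned to run queues */
    struct drm_sched_rq;       /* a run queue of entities at one priority level */
    struct drm_sched_job;      /* one unit of work submitted through an entity */
    struct drm_sched_fence;    /* signals when a job is scheduled and when it finishes */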

Any Linux device driver can implement the DMA buffer sharing API (dma-buf) to share DMA buffers across devices. This was used to implement PRIME, a system for sharing framebuffers between the DRM drivers of integrated and discrete graphics cards, which allows for GPU hot-swapping. PRIME added two new ioctls: one converts a GEM handle to a DMA-BUF file descriptor, and the other converts a DMA-BUF file descriptor back to a GEM handle.
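
A sketch of those two conversions using libdrm's wrappers (drmPrimeHandleToFD and drmPrimeFDToHandle wrap the PRIME ioctls). The gem_handle argument is assumed to come from a driver-specific buffer-creation ioctl.

    #include <stdint.h>
    #include <xf86drm.h>

    int export_then_import(int exporter_fd, int importer_fd, uint32_t gem_handle) {
        int dmabuf_fd;
        uint32_t imported_handle;

        /* GEM handle -> dma-buf file descriptor (shareable across devices). */
        if (drmPrimeHandleToFD(exporter_fd, gem_handle, DRM_CLOEXEC, &dmabuf_fd))
            return -1;

        /* dma-buf file descriptor -> GEM handle on the importing device. */
        if (drmPrimeFDToHandle(importer_fd, dmabuf_fd, &imported_handle))
            return -1;
        return 0;
    }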

GEM is the Graphics Execution Manager and was built in response to TTM's complexity. It abstracts away NUMA complexity. The generic handle ioctls look like: struct drm_gem_[open|close|flink] p; int ret = ioctl(fd, DRM_IOCTL_GEM_[OPEN|CLOSE|FLINK], &p); (buffer creation itself is driver-specific; there is no generic GEM create ioctl). NOTE: some TTM drivers will still implement the GEM API, e.g. Radeon.
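
Expanding the bracket shorthand into real calls, here is a hedged sketch that names an existing buffer via FLINK, reopens it by its global name, then closes the new handle. The handle passed in is assumed to already exist; drmIoctl is libdrm's ioctl wrapper that retries on EINTR.

    #include <stdint.h>
    #include <xf86drm.h>  /* pulls in the drm.h uapi structs and ioctl numbers */

    int share_by_name(int fd, uint32_t handle) {
        struct drm_gem_flink flink = { .handle = handle };
        if (drmIoctl(fd, DRM_IOCTL_GEM_FLINK, &flink))  /* handle -> global name */
            return -1;

        struct drm_gem_open open_arg = { .name = flink.name };
        if (drmIoctl(fd, DRM_IOCTL_GEM_OPEN, &open_arg))  /* name -> new handle */
            return -1;

        struct drm_gem_close close_arg = { .handle = open_arg.handle };
        return drmIoctl(fd, DRM_IOCTL_GEM_CLOSE, &close_arg);  /* drop the handle */
    }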

KMS is kernel mode setting. It manages the display controller, which handles the last step of the rendering pipeline. The display controller sits on the die of the GPU and drives the monitor. If you want to change the resolution or refresh rate, these are the bits you change. (A sketch after the list below walks these objects from userspace.)
  • Each CRTC represents a scanout engine of the display controller pointing to a framebuffer. The CRTC takes the pixel data in the framebuffer and generates a video mode timing signal using a PLL circuit. There can be multiple CRTCs (one for each display, each with its own framebuffer). There can also be multiple framebuffers (including secondary ones for real-time rendering use cases).
  • Planes are memory objects containing buffers that feed the CRTC framebuffer.
  • Connectors represent the physical connector (VGA, DVI, DisplayPort, HDMI). The kernel stores connection status, EDID (extended display identification data), and DPMS state (display power management signaling, e.g. putting the monitor to sleep). A connector can receive a signal from one encoder at a time, and each connector type supports only a subset of encoders.
  • Kernel mode setting chooses the frequency and propagates it to the encoder clocks. Encoders represent the hardware block that converts the pixel stream from the CRTC into a signal the connector can carry, e.g. transition-minimized differential signaling (TMDS) used by HDMI and DVI.
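
Here is the promised sketch: it walks resources -> connectors -> modes using libdrm's mode-setting API. It needs the primary node (card0; the path is an assumption), not a render node. Build with cc kms.c -ldrm.

    /* kms.c: enumerate KMS connectors and their preferred modes. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>
    #include <xf86drm.h>
    #include <xf86drmMode.h>

    int main(void) {
        int fd = open("/dev/dri/card0", O_RDWR);
        if (fd < 0) { perror("open"); return 1; }

        drmModeResPtr res = drmModeGetResources(fd);  /* CRTCs, connectors, encoders */
        if (!res) { close(fd); return 1; }
        printf("%d CRTCs, %d connectors\n", res->count_crtcs, res->count_connectors);

        for (int i = 0; i < res->count_connectors; i++) {
            drmModeConnectorPtr conn = drmModeGetConnector(fd, res->connectors[i]);
            if (!conn)
                continue;
            if (conn->connection == DRM_MODE_CONNECTED && conn->count_modes > 0)
                /* modes[0] is typically the preferred (native) mode */
                printf("connector %u: %dx%d@%uHz\n", conn->connector_id,
                       conn->modes[0].hdisplay, conn->modes[0].vdisplay,
                       conn->modes[0].vrefresh);
            drmModeFreeConnector(conn);
        }
        drmModeFreeResources(res);
        close(fd);
        return 0;
    }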


At this point you're inside the GPU. Check out the AMD and NVIDIA docs for more here :). For example, the NVIDIA PTX ISA reference, or the AMD talk "From Source to ISA".