Android Under The Surface

From Render Tree to Pixels: The Display Pipeline

Android Display Pipeline and SurfaceFlinger for Flutter

March 28, 2026

Your build() method returns a widget tree. Flutter's rendering pipeline converts that into a render tree, then a layer tree during paint, then a series of GPU commands. Those commands produce pixels in a buffer. Then what?

The buffer doesn't go to the screen. Not directly. There are more steps between "Impeller finished rendering a frame" and "photons leave the display panel" than most developers expect. These steps involve a system compositor, hardware acceleration, display timing, and multiple processes coordinating through the kernel.

This post traces a single frame from the moment Impeller produces it to the moment it appears on the display. Understanding this pipeline explains why jank happens where it happens, what "frames dropped" really means, and why 60fps is a harder target than it looks.

The surface: your app's canvas

Your Flutter app doesn't render to "the screen." It renders to a Surface — a producer-consumer buffer queue owned by the system.

When FlutterActivity starts, it creates a SurfaceView (or a TextureView, depending on the configuration). The Android window manager allocates a Surface for this view. The Surface is backed by a BufferQueue — a queue of graphics buffers, allocated through the kernel's graphics allocator, that acts as a pipeline between a producer (your app) and a consumer (the system compositor).

The BufferQueue typically contains 2 or 3 buffers (double or triple buffering):

text
┌──────────────────────────────────────────┐
│              BufferQueue                  │
│                                          │
│  Buffer 0: [being displayed]             │
│  Buffer 1: [being rendered to by app]    │
│  Buffer 2: [free, waiting for app]       │
│                                          │
│  Producer: Flutter engine (Impeller)     │
│  Consumer: SurfaceFlinger                │
└──────────────────────────────────────────┘
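
The producer-consumer cycle can be sketched as a toy model (hypothetical names and heavy simplification — the real BufferQueue is C++ in libgui and also tracks per-buffer fences):

```python
from collections import deque

class BufferQueue:
    """Toy model of Android's BufferQueue: a fixed pool of buffers cycling
    FREE -> DEQUEUED (app renders) -> QUEUED -> ACQUIRED (compositor) -> FREE."""

    def __init__(self, buffer_count=3):
        self.free = deque(range(buffer_count))  # buffer ids available to the producer
        self.queued = deque()                   # rendered frames awaiting the consumer

    def dequeue_buffer(self):
        """Producer (the Flutter engine) takes a free buffer to render into."""
        return self.free.popleft() if self.free else None  # None => producer must wait

    def queue_buffer(self, buf):
        """Producer hands a finished frame over (what eglSwapBuffers triggers)."""
        self.queued.append(buf)

    def acquire_buffer(self):
        """Consumer (SurfaceFlinger) takes the oldest completed frame."""
        return self.queued.popleft() if self.queued else None  # None => reuse last frame

    def release_buffer(self, buf):
        """Consumer is done compositing; the buffer returns to the free pool."""
        self.free.append(buf)

q = BufferQueue()
b = q.dequeue_buffer()      # engine grabs buffer 0 and renders frame N into it
q.queue_buffer(b)           # frame N enqueued for the compositor
shown = q.acquire_buffer()  # SurfaceFlinger picks it up
q.release_buffer(shown)     # after composition, buffer 0 is free again
```

The key property the model shows: if the producer outruns the consumer, `dequeue_buffer()` returns nothing and the app must wait — which is exactly the backpressure that ties frame production to the display.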

When Impeller finishes rendering a frame, it calls eglSwapBuffers() (for OpenGL ES) or presents the image (for Vulkan). This enqueues the completed buffer into the BufferQueue. The next free buffer is dequeued for the engine to render the next frame into.

The BufferQueue is the handoff point between your process and the system compositor. It's backed by shared memory — the graphics buffers are mmap'd into both your process's address space and the compositor's address space. The buffer data doesn't get copied; both processes have page table entries pointing to the same physical pages (usually memory the GPU can also access directly). The kernel handles the synchronisation between them.

SurfaceFlinger: the compositor

SurfaceFlinger is the system process that composites all visible surfaces into the final image displayed on screen. It's one of the first processes started at boot (before Zygote), and it runs for the device's entire uptime.

At any given moment, the display shows a composite of multiple surfaces:

  • Your Flutter app's surface (the biggest one, usually)
  • The status bar surface
  • The navigation bar surface (if using gesture navigation, this might be translucent)
  • The wallpaper surface (visible behind transparent areas)
  • Notification surfaces (if a heads-up notification is showing)
  • System dialog surfaces (permission dialogs, ANR dialogs)

Each of these is a separate Surface with its own BufferQueue, produced by a separate process. SurfaceFlinger's job is to combine them into a single image, respecting z-order, transparency, transforms, and clipping.
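
What "respecting z-order and transparency" means for a single pixel can be shown with the standard src-over blend — an illustrative model only, not SurfaceFlinger's actual code path:

```python
def composite(layers):
    """Blend layers back-to-front with the 'over' operator.
    Each layer is ((r, g, b), alpha) for one pixel; the list is ordered
    bottom (wallpaper) to top (status bar)."""
    r = g = b = 0.0
    for (lr, lg, lb), a in layers:
        # src-over: out = src * alpha + dst * (1 - alpha)
        r = lr * a + r * (1 - a)
        g = lg * a + g * (1 - a)
        b = lb * a + b * (1 - a)
    return (r, g, b)

# An opaque app pixel underneath a 50%-translucent black nav bar pixel:
pixel = composite([((0.2, 0.4, 0.8), 1.0),    # Flutter app surface (opaque)
                   ((0.0, 0.0, 0.0), 0.5)])   # translucent nav bar on top
# → (0.1, 0.2, 0.4): the app's colour, darkened by the bar
```

Whether this arithmetic runs on dedicated composition hardware or on the GPU is exactly the decision described next.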

SurfaceFlinger receives buffers from each Surface's BufferQueue and decides how to composite them. It has two strategies:

Hardware Composer (HWC) vs GPU composition

Modern Android devices have a Hardware Composer (HWC) — dedicated hardware that can composite multiple surfaces directly, without using the GPU. The HWC reads from each surface's buffer and blends them in hardware as the display scans out each line.

SurfaceFlinger asks the HWC: "Can you composite these 5 surfaces directly?" The HWC examines the surfaces — their positions, sizes, pixel formats, transforms, blend modes — and accepts all of them, some of them, or none.

HWC composition (the fast path):

text
Surface A (status bar)       ─┐
Surface B (your Flutter app) ─┼─→ HWC → Display panel
Surface C (nav bar)          ─┘

No GPU involved. The HWC reads directly from the surface buffers as the display refreshes. Minimal latency, minimal power consumption. This is the common case for a full-screen Flutter app with standard system chrome.

GPU composition (the fallback):

text
Surface A ─┐
Surface B ─┼─→ GPU renders ─→ HWC ─→ Display panel
Surface C ─┘   into framebuffer

If the HWC can't handle the composition (too many layers, unsupported transforms, unusual pixel formats), SurfaceFlinger falls back to GPU composition. It uses OpenGL ES to render all the surfaces into a single framebuffer, then sends that framebuffer to the HWC as a single layer. This is slower and uses GPU resources that your app might need.

For a typical Flutter app — one surface covering most of the screen, with system bars on top — HWC composition handles everything. GPU composition kicks in when the layout is unusual: multiple app windows in split-screen, picture-in-picture overlays, complex system UI animations.
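
A sketch of that per-frame negotiation — the layer limit and supported formats here are invented for illustration; a real HWC reports its capabilities through the composer HAL:

```python
def plan_composition(layers, max_hwc_layers=4):
    """Toy model of SurfaceFlinger's per-frame decision: offer every
    layer to the HWC; anything it can't take falls back to GPU
    composition, and the GPU's output reaches the HWC as one layer."""
    hwc, gpu = [], []
    for layer in layers:
        supported = (layer["transform"] in ("identity", "rot90", "rot180", "rot270")
                     and layer["format"] in ("RGBA8888", "RGB565"))
        if supported and len(hwc) < max_hwc_layers:
            hwc.append(layer)      # fast path: composited in hardware
        else:
            gpu.append(layer)      # fallback: rendered into a framebuffer first
    return hwc, gpu

layers = [
    {"name": "flutter_app", "transform": "identity", "format": "RGBA8888"},
    {"name": "status_bar",  "transform": "identity", "format": "RGBA8888"},
    {"name": "nav_bar",     "transform": "identity", "format": "RGBA8888"},
]
hwc, gpu = plan_composition(layers)
# All three fit: full HWC composition, no GPU fallback — the common case.
```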

VSync: the heartbeat

The display panel refreshes at a fixed rate — 60Hz (16.67ms per frame), 90Hz (11.11ms), 120Hz (8.33ms), or variable (LTPO displays that switch between rates). Each refresh is a VSync signal generated by the display hardware and delivered to the system.

VSync is the timing signal that synchronises the entire rendering pipeline:

text
VSync 0                VSync 1                VSync 2
  │                      │                      │
  ├── App renders ───────┤                      │
  │   frame N            ├── SF composites ─────┤
  │                      │   frame N            ├── Display shows
  │                      │                      │   frame N
  │                      │                      │
  │   (16.67ms budget)   │   (16.67ms budget)   │

At 60Hz, the pipeline works like this:

  1. VSync 0: The Flutter engine receives the VSync signal. The Dart UI thread runs: build(), layout, paint, producing a layer tree. The raster thread receives the layer tree and starts rendering GPU commands. This must complete within 16.67ms.
  2. VSync 1: The engine enqueues the completed buffer into the BufferQueue. SurfaceFlinger wakes up, acquires buffers from all visible surfaces, and composites them (via HWC or GPU). This must also complete within 16.67ms.
  3. VSync 2: The display panel scans out the composited framebuffer. The user sees the frame.

At minimum, a frame reaches the display two refresh periods after the app starts rendering it — one period for the app, one for SurfaceFlinger. With triple buffering, it can be three. This is normal and invisible to the user — what matters isn't absolute latency but consistent latency.
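
The arithmetic, as a tiny helper (assuming each pipeline stage consumes exactly one full refresh period, which is the worst case):

```python
def photon_latency_ms(refresh_hz, pipeline_depth):
    """Latency in ms from the VSync that starts the app's rendering to
    the VSync where the frame is scanned out."""
    return pipeline_depth * 1000.0 / refresh_hz

photon_latency_ms(60, 2)   # app + SurfaceFlinger: ~33.3 ms
photon_latency_ms(60, 3)   # triple-buffered: ~50 ms
photon_latency_ms(120, 2)  # same pipeline depth at 120Hz: ~16.7 ms
```

Higher refresh rates don't just make motion smoother — they shrink every stage's slot, so the same pipeline depth costs less wall-clock latency.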

What a dropped frame looks like

A "dropped frame" (jank) happens when your app doesn't finish rendering within its VSync budget:

text
VSync 0                VSync 1                VSync 2                VSync 3
  │                      │                      │                      │
  ├── App renders ───────┼── still rendering ───┤                      │
  │   frame N            │   (overran budget)   ├── SF composites ─────┤
  │                      │                      │   frame N            │
  │                      │   frame N+1 SKIPPED  │                      │
  │                      │                      │                      │

The engine started rendering frame N at VSync 0 but didn't finish by VSync 1. SurfaceFlinger had no new buffer to acquire from your surface at VSync 1, so it reused the previous frame's buffer. The display shows the same image for two refresh cycles. This is visible as a stutter — a brief pause in animation, a scroll that hitches, a transition that hesitates.

The user doesn't see "frame dropped" — they see an inconsistency in motion. The eye is extraordinarily sensitive to timing irregularities in motion. A single dropped frame in a smooth scroll is perceptible. Consistent 30fps (every frame takes 33ms) looks smoother than inconsistent 60fps with occasional 33ms frames mixed in, because the visual system tracks a steady rhythm far more comfortably than it tolerates breaks in one.
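
Counting repeated refreshes makes that comparison concrete — a toy calculation; real jank metrics also account for vsync alignment and deadline offsets:

```python
import math

def repeated_refreshes(frame_ms, budget_ms=16.67):
    """A frame that takes N refresh periods to render leaves the previous
    image on screen for N-1 extra refreshes."""
    return sum(max(0, math.ceil(ms / budget_ms) - 1) for ms in frame_ms)

consistent_30fps = [33.3] * 10               # every refresh repeats once: 10 repeats
mostly_60fps     = [16.0] * 8 + [33.3] * 2   # two hitches: only 2 repeats
# The second sequence "drops" far fewer frames, yet the first looks
# smoother, because its rhythm never breaks.
```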

The Flutter rendering pipeline in detail

Let's trace a single frame from Dart to the BufferQueue:

text
VSync signal arrives (from SurfaceFlinger via Choreographer)
  │
  ▼
[Dart UI thread]
  ├── Animation ticks: update AnimationController values
  ├── Build phase: walk dirty widgets, call build() methods
  ├── Layout phase: compute sizes and positions (RenderObject.performLayout)
  ├── Compositing bits update: determine which RenderObjects need their own layer
  ├── Paint phase: walk the render tree, build the Layer tree
  ├── Compositing: assemble the scene from layers, hand to engine
  │
  ▼ Layer tree handed to raster thread
  │
[Raster thread]
  ├── Impeller receives the layer tree
  ├── Resolves textures (image decoding if needed, on I/O thread)
  ├── Generates GPU command buffers
  │     ├── Vertex data for each layer
  │     ├── Shader programs selected/compiled
  │     ├── Texture binds
  │     └── Draw calls
  ├── Submits command buffers to GPU driver
  │     → kernel ioctl on /dev/dri/renderD128 or /dev/mali0 (Post 3)
  │
  ▼ GPU executes commands asynchronously
  │
[GPU hardware]
  ├── Processes vertex and fragment shaders
  ├── Rasterises triangles
  ├── Writes pixel data to the Surface buffer
  │
  ▼ Buffer complete
  │
[Raster thread]
  ├── eglSwapBuffers() / VkQueuePresentKHR()
  │     → enqueues buffer into BufferQueue
  │
  ▼ Buffer available for compositor
  │
[SurfaceFlinger process]
  ├── Acquires buffer from app's BufferQueue
  ├── Composites with other surfaces (status bar, nav bar)
  ├── Sends to HWC or GPU composition
  │
  ▼
[Display hardware]
  └── Scans out framebuffer → photons leave panel → user sees frame

The total time for this pipeline is typically 20-40ms from VSync to photons. The app's budget is 16.67ms (at 60Hz) for the Dart + raster work. If the app finishes in time, the pipeline flows smoothly. If not, the frame slips to the next VSync cycle.

Impeller vs Skia: what changed

Before Impeller, Flutter used Skia for rendering. The pipeline was similar in structure but different in a critical detail: shader compilation.

Skia compiled GPU shaders lazily — the first time a particular visual effect was rendered (a blur, a gradient, a clip), Skia would compile the necessary shader program. Shader compilation is expensive (1-20ms depending on the shader and the GPU driver). This caused first-frame jank: the first time you scrolled a list with shadows, or opened a page with a complex background, one or two frames would drop as shaders compiled. Subsequent frames were smooth because the shaders were cached. This was called "shader jank" or "first-run jank," and it was Flutter's most distinctive performance problem.

Impeller eliminated this by precompiling all shaders at build time. When the Flutter engine initialises, Impeller's shaders are already compiled into the engine binary. No lazy compilation, no first-run jank. The tradeoff: the engine binary is slightly larger (more precompiled shader variants), and Impeller can't generate arbitrarily complex shaders at runtime (Skia could). In practice, this tradeoff is overwhelmingly positive — predictable frame times matter more than theoretically unbounded shader flexibility.

Impeller also changed the GPU API. On Android, Impeller uses Vulkan (where available, which is most devices running Android 9+) or falls back to OpenGL ES. Vulkan gives Impeller more control over GPU resource management, command buffer submission, and synchronisation, which helps reduce driver overhead and improve frame time consistency.

For a more detailed guide, visit our Impeller guide.

Where jank comes from

With the full pipeline visible, the sources of jank map to specific stages:

Build phase too slow. Deep widget trees with expensive build() methods. Each call to build() is Dart code running on the UI thread. If your build() method does O(n) work (iterating a large list to build widgets), the build phase can overrun the budget. This is pure CPU work on the Dart UI thread.

Layout phase too slow. Complex layouts with intrinsic size calculations, CustomMultiChildLayout, or deep nesting of layout constraints. Layout is also CPU work on the UI thread. The layout algorithm's complexity depends on the render tree structure.

Paint phase generates too many draw commands. Many overlapping translucent layers, complex clip paths, or excessive use of saveLayer (which Opacity, ShaderMask, and ColorFiltered widgets can trigger). Each saveLayer allocates an offscreen buffer, renders into it, then composites it back — a GPU round trip.

Raster thread overloaded. Impeller can't generate GPU commands fast enough. This happens with very complex scenes (hundreds of draw calls) or large textures. Visible in the DevTools frame chart as a long raster bar.

GPU bottleneck. The GPU takes too long to execute the commands. Happens with expensive fragment shaders (blurs, complex effects) or high fill rate (many pixels to shade, especially on high-resolution displays). Mid-range and budget GPUs hit this earlier than flagships.

SurfaceFlinger composition slow. Rare for Flutter apps, but possible if the system is under load (many windows visible, GPU composition required).

Garbage collection pause. The Dart GC occasionally pauses the UI thread or raster thread to perform collection. Modern Dart GC is concurrent and generational, so pauses are typically under 1ms, but a poorly-timed pause during a tight frame can tip it over the budget.

Observing the pipeline

Flutter DevTools shows frame timing from the engine's perspective: UI thread time (build, layout, paint) and Raster thread time (GPU command generation). If the UI bar is long, the CPU work is the bottleneck. If the Raster bar is long, the GPU work is.

`adb shell dumpsys SurfaceFlinger` shows the compositor's view: which surfaces are active, their buffer states, HWC vs GPU composition mode, and frame statistics.

`adb shell dumpsys gfxinfo your.package.name` shows frame timing from the Android framework perspective. For Flutter apps, the numbers you care about are "Total frames rendered" and "Janky frames." The percentile breakdowns show your app's actual frame time distribution.

bash
adb shell dumpsys gfxinfo your.package.name framestats

This gives per-frame timestamps: when the VSync was received, when the render started, when it finished, when the buffer was enqueued. You can see exactly where each slow frame spent its time.
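
A sketch of turning that output into per-frame durations. It assumes only that the dump contains a CSV header naming `IntendedVsync` and `FrameCompleted` columns; since column order varies across Android versions, the positions are read from the header rather than hard-coded:

```python
def frame_durations_ms(framestats_text):
    """Extract total frame time (FrameCompleted - IntendedVsync) in ms
    from `dumpsys gfxinfo <pkg> framestats` output."""
    cols, durations = None, []
    for line in framestats_text.splitlines():
        parts = [p.strip() for p in line.split(",") if p.strip()]
        if "IntendedVsync" in parts and "FrameCompleted" in parts:
            cols = {name: i for i, name in enumerate(parts)}   # header row
        elif cols and len(parts) >= len(cols) and parts[0].lstrip("-").isdigit():
            start = int(parts[cols["IntendedVsync"]])   # nanosecond timestamps
            end = int(parts[cols["FrameCompleted"]])
            durations.append((end - start) / 1e6)       # ns -> ms
    return durations
```

Feed it the raw adb output; any duration over your refresh period is a frame that slipped to a later VSync.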

Perfetto captures the full picture: Dart UI thread work, raster thread work, GPU execution, SurfaceFlinger composition, and display VSync — all on the same timeline. This is the definitive tool for understanding where a janky frame went wrong, because it shows cross-process interactions that no single-process tool can capture.

Display refresh rate adaptation

Modern phones don't have a fixed refresh rate. LTPO displays can switch between 1Hz and 120Hz dynamically. The system chooses a refresh rate based on content: during animation, 120Hz; during static content, 60Hz or lower; during always-on display, 1Hz.

Flutter tells the system what frame rate it needs by setting a "frame rate category" on its Surface. When animating, Flutter requests the highest available rate. When idle (sitting in epoll_wait, Post 3), it requests nothing, and the system can drop to a lower rate to save power.

This is why an idle Flutter app uses almost no power — the display drops to a low refresh rate, the CPU sleeps in epoll_wait, and the GPU is idle. The rendering pipeline only activates when something changes. A single setState() call triggers a frame, which triggers a VSync request, which wakes the pipeline. When the frame is displayed and nothing else has changed, the pipeline goes back to sleep.

Triple buffering and latency

Android uses triple buffering by default: three buffers in the BufferQueue. While the display shows buffer A, SurfaceFlinger composites buffer B, and the app renders into buffer C.

Triple buffering smooths out timing irregularities — if the app occasionally takes 18ms instead of 16.67ms, the extra buffer absorbs the variance without dropping a frame. The cost is one extra frame of latency (the pipeline is three stages deep instead of two).

For most apps, this latency is imperceptible. For games and latency-sensitive input (drawing apps, musical instruments), the extra frame of latency matters. Flutter doesn't provide a way to switch to double buffering — the BufferQueue configuration is managed by the Android framework and SurfaceFlinger.
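
A toy model of that absorption — simplified, since real scheduling is driven by Choreographer callbacks and buffer availability rather than a fixed slack count, but the shape of the effect is the same:

```python
def missed_deadlines(render_ms, spare_buffers, period_ms=16.67):
    """Frame i starts no earlier than VSync i and must finish by
    VSync (i + 1 + spare_buffers): each spare buffer lets the app run
    one refresh period further ahead of the display."""
    t, late = 0.0, 0
    for i, ms in enumerate(render_ms):
        t = max(t, i * period_ms) + ms                 # render frame i
        if t > (i + 1 + spare_buffers) * period_ms:    # missed its scan-out slot
            late += 1
    return late

frames = [16, 16, 20, 16, 16]  # one frame overruns the 16.67ms budget
missed_deadlines(frames, 0)    # double buffering: the overrun cascades → 3
missed_deadlines(frames, 1)    # triple buffering: the spare buffer absorbs it → 0
```

With no slack, the single 20ms frame makes three consecutive deadlines slip; with one spare buffer, the same workload misses nothing — at the price of one extra period of latency.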

The full picture

From build() to photons, the path crosses:

  1. Your Dart code (UI thread) — building and laying out the frame
  2. The Flutter engine's C++ code (raster thread) — generating GPU commands
  3. The GPU kernel driver (kernel space) — executing GPU commands
  4. The GPU hardware — rasterising pixels into a buffer
  5. The BufferQueue (kernel-managed shared memory) — handoff to compositor
  6. SurfaceFlinger (separate process) — compositing all surfaces
  7. The Hardware Composer (dedicated silicon) — final composition
  8. The display panel (hardware) — photons

Eight software/hardware stages, two user-space processes (your app and SurfaceFlinger) coordinating through the kernel, two privilege levels (user space and kernel space), and dedicated hardware. All executing within 16.67ms, 60 times per second, on a phone running a dozen other apps.

The next post looks at the sandbox: the permission model, SELinux, and the isolation mechanisms that protect apps from each other and the system from apps.

This is Post 8 of the Android Under the Surface series. Previous: The Activity Lifecycle and Why Flutter Fights It. Next: The Sandbox: Permissions, SELinux, and Isolation.
